Sense key unit attention & iostat hardware and transport errors on SAN disks
Hello, I'm trying to get to the bottom of SAN disk errors we've been seeing.
The server is a Sun Fire X4270 M2 that has been running Solaris 10 8/11 (u10) x86 since April 2012. The SAN HBAs are SG-PCIE2FC-QF8-Z (Sun-branded QLogic). The SAN storage system is a Hitachi VSP. We have 32 LUNs in use and another 8 LUNs not yet brought into Symantec Storage Foundation.
We started seeing hardware and transport errors on the LUNs on July 2, which led to corruption of 3 Veritas filesystems. I got that resolved on the 3rd, and we had to restore the 3 filesystems from tape. The SAN team found no SAN switch errors and Hitachi's analysis showed no disk errors.
We originally had Solaris MPxIO enabled by default for multipathing, along with Veritas DMP. Symantec said the two multipathing systems could coexist, but the errors returned, so I disabled MPxIO and rebooted on July 17. I didn't see any more errors until yesterday at 11:10 am. Is this a problem with the SAN HBAs? What do these errors mean? Any help would be appreciated.
It could be anywhere along the transport path: from the HBA to the transceiver, over the cables, to the other end of the system. It can also be a hardware error on the HBA itself. There could also be a faulty device causing bus resets, and therefore constant rescans of the bus.
The first thing I would try is to swap the transceivers between two HBAs (if you have two) and see whether the error follows the transceiver or stays on the same controller.
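To tell whether the errors move with the transceiver or stay on one controller, it helps to total the counters per controller before and after the swap. Here is a rough Python sketch that does this from saved `iostat -En` output. It assumes the usual summary line layout (`<device> Soft Errors: N Hard Errors: N Transport Errors: N`) and `cNtNdN` style device names. The function names and the sample data are made up for illustration and are not from your system.

```python
import re
from collections import defaultdict

# Matches the per-device summary line of `iostat -En` (layout assumed, see above).
ERROR_LINE = re.compile(
    r"^(?P<dev>\S+)\s+Soft Errors:\s*(?P<soft>\d+)\s+"
    r"Hard Errors:\s*(?P<hard>\d+)\s+Transport Errors:\s*(?P<transport>\d+)"
)
# Controller prefix of a cNtNdN device name, e.g. "c3" in "c3t0d1".
CONTROLLER = re.compile(r"^(c\d+)t")


def parse_iostat_errors(text):
    """Return {device: {"soft": n, "hard": n, "transport": n}}."""
    devices = {}
    for line in text.splitlines():
        m = ERROR_LINE.match(line.strip())
        if m:
            devices[m.group("dev")] = {
                key: int(m.group(key)) for key in ("soft", "hard", "transport")
            }
    return devices


def errors_by_controller(devices):
    """Sum hard and transport errors per controller (cN)."""
    totals = defaultdict(lambda: {"hard": 0, "transport": 0})
    for dev, counts in devices.items():
        m = CONTROLLER.match(dev)
        ctrl = m.group(1) if m else "unknown"
        totals[ctrl]["hard"] += counts["hard"]
        totals[ctrl]["transport"] += counts["transport"]
    return dict(totals)


# Hypothetical sample data, NOT taken from the server in this thread.
SAMPLE = """\
c3t0d0 Soft Errors: 0 Hard Errors: 2 Transport Errors: 4
Vendor: HITACHI Product: OPEN-V Revision: 0000 Serial No:
c3t0d1 Soft Errors: 0 Hard Errors: 1 Transport Errors: 1
c4t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
"""

if __name__ == "__main__":
    for ctrl, counts in sorted(errors_by_controller(parse_iostat_errors(SAMPLE)).items()):
        print(ctrl, counts)
```

As far as I know these counters accumulate since boot, so save one snapshot before swapping the transceivers and another some time afterwards, then compare the per-controller totals rather than reading a single run.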
I'm waiting for our Oracle support to be renewed, so until then it's hard to get anything done with the existing hardware support team unless I can point to a specific piece of defective hardware. This is the only Intel X4270 we have that's connected to SAN storage; most of our SAN-connected servers are SPARC, so I'm less familiar with this Intel hardware.