Software RAID on Linux Post: 302282338

9 More Discussions You Might Find Interesting

1. SuSE

Raid software besides Veritass

Hello Lunix people, I am looking for Raid software or solution besides Veritas. Veritas has some great software but are way too costly. Does anyone know of good raid software that but NOT Veritas. I need the funcations but not the cost.

2. UNIX for Advanced & Expert Users

Software RAID ...

Hi all, I m just trying using software RAID in RHEL 4, without problem , then i wanna simulate if disk 1 is fail (thereis an bootloader), i plug off my 1st disk. My problems is the second disk cannot boot? just stuck in grub, the computer is hang. Sorry for poor concept in RAID? I use a RAID 1....

3. HP-UX

Software RAID (0+1)

Hi! A couple of months ago a disk failed in our JBOD cabinett and I have finally got a new disk to replace it with. It was a RAID 0 so we have to create and configure the whole thing again. First we thought of RAID 1+0 but it seems you can't do this with LVM. If you read my last thread, you can...

4. Red Hat

Software RAID doubt

hi friends, I am having issues with adding a spare device to a failed array. I have created RAID 1 with 3 partitions using mdadm command. Later I added a spare with mdadm --add /dev/md0 /dev/sdb6 This works fine and when I check this with mdadm --detail command it just sits there as a spare...

5. Filesystems, Disks and Memory

Software RAID

Hello, My company has inherited a Centos based machine that has 7 hard drives and a software based raid system. Supposedly one of the drives has failed. I need to replace the hardrive. How can I about telling which hard drive needs replacing? I have looked in the logs and there clearly is a...

6. UNIX for Dummies Questions & Answers

RAID software vs hardware RAID

Hi Can someone tell me what are the differences between software and hardware raid ? thx for help.

7. Red Hat

Software RAID configuration

We have configured software based RAID5 with LVM on our RHEL5 servers. Please let us know if its good to configure software RAID on live environment servers. What can be the disadvantages of software RAID against hardware RAID

8. Solaris

Software RAID on top of Hardware RAID

Server Model: T5120 with 146G x4 disks. OS: Solaris 10 - installed on c1t0d0. Plan to use software raid (veritas volume mgr) on c1t2d0 disk. After format and label the disk, still not able to detect using vxdiskadm. Question: Should I remove the hardware raid on c1t2d0 first? My...

9. Solaris

Hardware to software RAID migration

We have hardware RAID configured on our T6320 server and two LDOMs are running on this server. One of our disk got failed and replaced. After replacemnt the newly installed disk not detected by RAID controlled so Oracle suggested to upgrade the REM firmware. As this is the standalone production...

LEARN ABOUT XFREE86

cciss_vol_status

CCISS_VOL_STATUS(8)													       CCISS_VOL_STATUS(8)

NAME

       cciss_vol_status - show status of logical drives attached to HP Smartarray controllers

SYNOPSIS

       cciss_vol_status [OPTION] [DEVICE]...

DESCRIPTION

       Shows the status of logical drives configured on HP Smartarray controllers.

OPTIONS

       -p, --persnickety
	      Without  this  option,  device  nodes  which  can't  be opened, or which are not found to be of the correct device type are silently
	      ignored.	This lets you use wildcards, e.g.: cciss_vol_status /dev/sg* /dev/cciss/c*d0, and the program will not complain as long as
	      all  devices which are found to be of the correct type are found to be ok.  However, you may wish to explicitly list the devices you
	      expect to be there, and be notified if they are not there (e.g. perhaps a PCI slot has died, and the system has  rebooted,  so  that
	      what  was  once  /dev/cciss/c1d0	is  no longer there at all).  This option will cause the program to complain about any device node
	      listed which does not appear to be the right device type, or is not openable.

       -C, --copyright
	      If stderr is a terminal, Print out a copyright message, and exit.

       -q, --quiet
	      This option doesn't do anything.	Previously, without this option and if stderr is a terminal, a copyright message precedes the nor-
	      mal program output.  Now, the copyright message is only printed via the -C option.

       -s     Query each physical drive for S.M.A.R.T data and report any drives in "predictive failure" state.

       -u, --try-unknown-devices
	      If a device has an unrecognized board ID, normally the program will not attempt to communicate with it.  In case you have some Smart
	      Array controller which is newer than this program, the program may not recognize it.  This option permits the program to attempt	to
	      interrogate the board even if it is unrecognized on the assumption that it is in fact a Smart Array of some kind.

       -v, --version
	      Print the version number and exit.

       -x, --exhaustive
	      Deprecated.   Previously, it "exhaustively" searched for logical drives, as, under some circumstances some logical drives might oth-
	      erwise be missed.  This option no longer does anything, as the algorithm for finding logical drives was changed to obviate the  need
	      for it.

DEVICE

       The  DEVICE  argument  indicates  which RAID controller is to be queried.  Note, that it indicates which RAID controller, not which logical
       drive.

       For the cciss driver, the "d0" nodes matching "/dev/cciss/c*d0" are the nodes which correspond to  the  RAID  controllers.   (See  note	1,
       below.)	It is not necessary to invoke cciss_vol_status on each logical drive individually, though if you do this, each time it will report
       the status of ALL logical drives on the controller.

       For the hpsa driver, or for fibre attached MSA1000 family devices, or for the hpahcisr sotware RAID driver which emulates Smart Arrays, the
       RAID  controller  is  accessed via the scsi generic driver, and the device nodes will match "/dev/sg*"	Some variants of the "lsscsi" tool
       will easily identify which device node corresponds  to  the  RAID  controller.	Some  variants	may  only  report  the	SCSI  nexus  (con-
       troller/bus/target/lun tuple.)  Some distros may not have the lsscsi tool.

       Executing the following query to the /sys filesystem and correlating this with the contents of /proc/scsi/scsi or output of lsscsi can help
       in finding the right /dev/sg node to use with cciss_vol_status:

       wumpus:/home/scameron # ls -l /sys/class/scsi_generic/*
       lrwxrwxrwx 1 root root 0 2009-11-18 12:31 /sys/class/scsi_generic/sg0 -> ../../devices/pci0000:00/0000:00:02.0/0000:02:00.0/0000:03:03.0/host0/target0:0:0/0:0:0:0/scsi_generic/sg0
       lrwxrwxrwx 1 root root 0 2009-11-18 12:31 /sys/class/scsi_generic/sg1 -> ../../devices/pci0000:00/0000:00:1f.1/host2/target2:0:0/2:0:0:0/scsi_generic/sg1
       lrwxrwxrwx 1 root root 0 2009-11-19 07:47 /sys/class/scsi_generic/sg2 -> ../../devices/pci0000:00/0000:00:05.0/0000:0e:00.0/host4/target4:3:0/4:3:0:0/scsi_generic/sg2
       wumpus:/home/scameron # cat /proc/scsi/scsi
       Attached devices:
       Host: scsi0 Channel: 00 Id: 00 Lun: 00
	 Vendor: COMPAQ   Model: BD03685A24	  Rev: HPB6
	 Type:	 Direct-Access			  ANSI	SCSI revision: 03
       Host: scsi2 Channel: 00 Id: 00 Lun: 00
	 Vendor: SAMSUNG  Model: CD-ROM SC-148A   Rev: B408
	 Type:	 CD-ROM 			  ANSI	SCSI revision: 05
       Host: scsi4 Channel: 03 Id: 00 Lun: 00
	 Vendor: HP	  Model: P800		  Rev: 6.82
	 Type:	 RAID				  ANSI	SCSI revision: 00
       wumpus:/home/scameron # lsscsi
       [0:0:0:0]    disk    COMPAQ   BD03685A24       HPB6  /dev/sda
       [2:0:0:0]    cd/dvd  SAMSUNG  CD-ROM SC-148A   B408  /dev/sr0
       [4:3:0:0]    storage HP	     P800	      6.82  -

       From the above you can see that /dev/sg2 corresponds to SCSI nexus 4:3:0:0, which corresponds to the HP	P800  RAID  controller	listed	in
       /proc/scsi/scsi.

EXAMPLE

	    [root@somehost]# cciss_vol_status -q /dev/cciss/c*d0
	    /dev/cciss/c0d0: (Smart Array P800) RAID 0 Volume 0 status: OK.
	    /dev/cciss/c0d0: (Smart Array P800) RAID 0 Volume 1 status: OK.
	    /dev/cciss/c0d0: (Smart Array P800) RAID 1 Volume 2 status: OK.
	    /dev/cciss/c0d0: (Smart Array P800) RAID 5 Volume 4 status: OK.
	    /dev/cciss/c0d0: (Smart Array P800) RAID 5 Volume 5 status: OK.
	    /dev/cciss/c0d0: (Smart Array P800) Enclosure MSA60 (S/N: USP6340B3F) on Bus 2, Physical Port 1E status: Power Supply Unit failed
	    /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 0 status: OK.
	    /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 1 status: OK.
	    /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 2 status: OK.
	    /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 3 status: OK.
	    /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 4 status: OK.
	    /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 5 status: OK.
	    /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 6 status: OK.
	    /dev/cciss/c1d0: (Smart Array P800) RAID 5 Volume 7 status: OK.

	    [root@someotherhost]# cciss_vol_status -q /dev/sg0 /dev/cciss/c*d0
	    /dev/sg0: (MSA1000) RAID 1 Volume 0 status: OK.   At least one spare drive.
	    /dev/sg0: (MSA1000) RAID 5 Volume 1 status: OK.
	    /dev/cciss/c0d0: (Smart Array P800) RAID 0 Volume 0 status: OK.

	    [root@localhost]# ./cciss_vol_status -s /dev/sg1
	    /dev/sda: (Smart Array P410i) RAID 0 Volume 0 status: OK.
		  connector 1I box 1 bay 1		   HP	   DG072A9BB7				    B365P6803PCP0633	 HPD0 S.M.A.R.T. predictive failure.
	    [root@localhost]# echo $?
	    1

	    [root@localhost]# ./cciss_vol_status -s /dev/cciss/c0d0
	    /dev/cciss/c0d0: (Smart Array P800) RAID 0 Volume 0 status: OK.
		  connector 2E box 1 bay 8		   HP	   DF300BB6C3				3LM08AP700009713RXUT	 HPD3 S.M.A.R.T. predictive failure.
	    /dev/cciss/c0d0: (Smart Array P800) Enclosure MSA60 (S/N: USP6340B3F) on Bus 2, Physical Port 2E status: OK.

DIAGNOSTICS

       Normally, a logical drive in good working order should report a status of "OK."	Possible status values are:

       "OK." (0) - The logical drive is in good working order.

       "FAILED." (1) - The logical drive has failed, and no i/o to it is poosible.
	      Additionally,  failed  drives  will  be  identified by connector, box and bay, as well as vendor, model, serial number, and firmware
	      revision.

       "Using interim recovery mode." (3) - One or more drives has failed,
	      but not so many that the logical drive can no longer operate.  The failed drives should be replaced as soon as possible.

       "Ready for recovery operation." (4) -  Failed drive(s) have been
	      replaced, and the controller is about to begin rebuilding redundant parity data.

       "Currently recovering." (5) - Failed drive(s) have been replaced,
	      and the controller is currently rebuilding redundant parity information.

       "Wrong physical drive was replaced." (6) - A drive has failed, and
	      another (working) drive was replaced.

       "A physical drive is not properly connected." (7) - There is some
	      cabling or backplane problem in the drive enclosure.

       (From fwspecwww.doc, see cpqarray project on sourceforge.net):
	      Note: If the unit_status value is 6 (Wrong physical drive was replaced) or 7 (A physical	drive  is  not	properly  connected),  the
	      unit_status  of all other configured logical drives will be marked as 1 (Logical drive failed). This is to force the user to correct
	      the problem and to insure that once the problem is corrected, the data will not have been corrupted by any user action.

       "Hardware is overheating." (8) - Hardware is too hot.

       "Hardware was overheated." (9) - At some point in the past,
	      the hardware got too hot.

       "Currently expannding." (10) - The controller is currently in the
	      process of expanding a logical drive.

       "Not yet available." (11) - The logical drive is not yet finished
	      being configured.

       "Queued for expansion." (12) - The logical drive will be expended
	      when the controller is able to begin working on it.

       Additionally, the following messages may appear regarding spare drive status:

	    "At least one spare drive designated"
	    "At least one spare drive activated and currently rebuilding"
	    "At least one activated on-line spare drive is completely rebuilt on this logical drive"
	    "At least one spare drive has failed"
	    "At least one spare drive activated"
	    "At least one spare drive remains available"
       Active spares will be identified by connector, box and bay, as well
       as by vendor, model, serial number, and firmware revision.

       For each logical drive, the total number of failed physical drives, if more than zero, will be reported as:

		   "Total of n failed physical drives detected on this logical drive."

       with "n" replaced by the actual number, of course.

       "Replacement" drives -- newly inserted drives that replace a previously failed drive but are not yet finished rebuilding -- are also  iden-
       tified by connector, box and bay, as well as by vendor, model, serial number, and firmware revision.

       If  the	-s option is specified, each physical drive will be queried for S.M.A.R.T data, any any drives in predictive failure state will be
       reported, identified by connector, box and bay, as well as vendor, model, serial number, and firmware revision.

       Additionally failure conditions of disk enclosure fans, power supplies, and temperature are reported as follows:

	    "Fan failed"
	    "Temperature problem"
	    "Door alert"
	    "Power Supply Unit failed"

FILES

       /dev/cciss/c*d0 (Smart Array PCI controllers using the cciss driver)
       /dev/sg* (Fibre attached MSA1000 controllers and Smart Array controllers using the hpsa driver or hpahcisr software RAID driver.)

EXIT CODES

       0 - All configured logical drives queried have status of "OK."

       1 - One or more configured logical drives queried have status other than "OK."

AUTHOR

       Written by Stephen M. Cameron

REPORTING BUGS

       MSA500 G1 logical drive numbers may not be reported correctly.

       I've seen enclosure serial numbers contain garbage.

       Report bugs to <steve.cameron@hp.com>

COPYRIGHT

       Copyright (C) 2007 Hewlett-Packard Development Company, L.P.
       This is free software; see the source for copying conditions.  There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICU-
       LAR PURPOSE.

SEE ALSO

       http://cciss.sourceforge.net

NOTE 1
       The  /dev/cciss/c*d0  device  nodes of the cciss driver do double duty.	They serve as an access point to both the RAID controllers, and to
       the first logical drive of each RAID controller.  Notice that a /dev/cciss/c*d0 node will be present for each controller even if no logical
       drives  are configured on that controller.  It might be cleaner if the driver had a special device node just for the controller, instead of
       making these device nodes do double duty.  It has been like that since the 2.2 linux kernel timeframe.  At  that  time,	device	major  and
       minor  nodes were statically allocated at compile time, and were in short supply.  Changing this behavior at this point would break lots of
       userland programs.

cciss_vol_status (ccissutils)					     Nov 2009						       CCISS_VOL_STATUS(8)