RAID5 multi disk failure Post: 302592332

Post 302592332 by chebarbudo on Monday 23rd of January 2012 02:18:59 PM

Hi there,

I don't know if my title is accurate, but I'm dealing with something risky that I don't really understand, and I'm very afraid of messing anything up.

I have a Debian 5.0.4 server with 4 x 1TB hard drives.

Here is the current content of /proc/mdstat:

Code:

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md1 : active raid1 sda1[0] sdd1[3] sdb1[1] sdc1[2]
      1024896 blocks [4/4] [UUUU]

md5 : active raid1 sda5[0] sdd5[3] sdb5[1] sdc5[2]
      1023872 blocks [4/4] [UUUU]

md6 : active raid1 sda6[0] sdd6[3] sdb6[1]
      1023872 blocks [4/3] [UU_U]

md7 : active raid1 sda7[0] sdd7[3] sdb7[1] sdc7[2]
      1023872 blocks [4/4] [UUUU]

md8 : active raid1 sdd8[3] sdb8[1] sdc8[2]
      1023872 blocks [4/3] [_UUU]

unused devices: <none>

That's kind of weird, because I used to have a huge md10 array holding a monstrous amount of important files, and it no longer shows up at all.
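
Besides md10 being gone, two of the small mirrors in the listing are each missing one member (md6 shows [4/3] [UU_U] and md8 shows [4/3] [_UUU]). As a rough sketch, this is one way to list such arrays from mdstat-style text; the function name `list_degraded` is made up for illustration and it only reads its input:

```shell
# Hypothetical helper (not part of mdadm): reads /proc/mdstat-style text on
# stdin and prints every array whose "[total/working]" counter shows fewer
# working members than total members.
list_degraded() {
    awk '
        # Remember the array name from lines such as "md6 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # The following line carries the counter, e.g. "[4/3]"
        match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0) print name, n[2] "/" n[1]
        }
    '
}

# Usage (read-only):
#   list_degraded < /proc/mdstat
```

On the listing above it should report md6 and md8, each with 3 of 4 members.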

I have no idea where to start!

I tried examining the four member partitions of the md10 array:

Code:

root@titan:~# mdadm --examine /dev/sda10
/dev/sda10:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 0b972a2e:3aaabcf9:a4d2adc2:26fd5302
  Creation Time : Sat Apr 17 16:30:50 2010
     Raid Level : raid5
  Used Dev Size : 1459502912 (1391.89 GiB 1494.53 GB)
     Array Size : 4378508736 (4175.67 GiB 4483.59 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 10

    Update Time : Sun Jun  5 16:00:41 2011
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : ac3fac12 - correct
         Events : 2552115

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       10        0      active sync   /dev/sda10

   0     0       8       10        0      active sync   /dev/sda10
   1     1       8       26        1      active sync   /dev/sdb10
   2     2       8       42        2      active sync   /dev/sdc10
   3     3       8       58        3      active sync   /dev/sdd10
root@titan:~# mdadm --examine /dev/sdb10
/dev/sdb10:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 0b972a2e:3aaabcf9:a4d2adc2:26fd5302
  Creation Time : Sat Apr 17 16:30:50 2010
     Raid Level : raid5
  Used Dev Size : 1459502912 (1391.89 GiB 1494.53 GB)
     Array Size : 4378508736 (4175.67 GiB 4483.59 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 10

    Update Time : Mon Jan 23 12:05:02 2012
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 0
       Checksum : ade16f37 - correct
         Events : 6224199

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       26        1      active sync   /dev/sdb10

   0     0       0        0        0      removed
   1     1       8       26        1      active sync   /dev/sdb10
   2     2       0        0        2      faulty removed
   3     3       8       58        3      active sync   /dev/sdd10
root@titan:~# mdadm --examine /dev/sdc10
/dev/sdc10:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 0b972a2e:3aaabcf9:a4d2adc2:26fd5302
  Creation Time : Sat Apr 17 16:30:50 2010
     Raid Level : raid5
  Used Dev Size : 1459502912 (1391.89 GiB 1494.53 GB)
     Array Size : 4378508736 (4175.67 GiB 4483.59 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 10

    Update Time : Fri Jan 20 23:16:43 2012
          State : active
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : ad7f1c03 - correct
         Events : 6223465

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       42        2      active sync   /dev/sdc10

   0     0       0        0        0      removed
   1     1       8       26        1      active sync   /dev/sdb10
   2     2       8       42        2      active sync   /dev/sdc10
   3     3       8       58        3      active sync   /dev/sdd10
root@titan:~# mdadm --examine /dev/sdd10
/dev/sdd10:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 0b972a2e:3aaabcf9:a4d2adc2:26fd5302
  Creation Time : Sat Apr 17 16:30:50 2010
     Raid Level : raid5
  Used Dev Size : 1459502912 (1391.89 GiB 1494.53 GB)
     Array Size : 4378508736 (4175.67 GiB 4483.59 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 10

    Update Time : Mon Jan 23 12:05:02 2012
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 0
       Checksum : ade16f5b - correct
         Events : 6224199

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       58        3      active sync   /dev/sdd10

   0     0       0        0        0      removed
   1     1       8       26        1      active sync   /dev/sdb10
   2     2       0        0        2      faulty removed
   3     3       8       58        3      active sync   /dev/sdd10

But that doesn't really help...
I have no idea how to interpret the results!
The "faulty" and "removed" states scare me.
Can anyone give me a hint?
Is there any other command I can run to regain access to the data, at least read-only?
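
In case it helps whoever answers: the fields that differ between the four dumps seem to be State, Update Time and Events. Here is a rough sketch that prints the device, event counter and state side by side, assuming the `mdadm --examine` output is piped in. The function name `summarize_examine` is made up for illustration, and the script only reads text:

```shell
# Hypothetical helper (not part of mdadm): reads concatenated
# `mdadm --examine` output on stdin and prints one line per member:
# device, event counter, superblock state.
summarize_examine() {
    awk '
        # Each dump starts with the device name at column 0, e.g. "/dev/sda10:"
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
        # "State : clean" comes before "Events : ..." in each dump
        $1 == "State"  && $2 == ":" { state[dev] = $3 }
        $1 == "Events" && $2 == ":" { print dev, $3, state[dev] }
    '
}

# Usage (mdadm --examine only reads the superblocks):
#   for d in /dev/sd[abcd]10; do mdadm --examine "$d"; done | summarize_examine
```

On the dumps above this gives sda10 at 2552115 events, sdc10 at 6223465, and sdb10 and sdd10 both at 6224199.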

Thanks for your help.
Santiago
