"Reconstructing RAID"

Post #302695721 by tonyaldr on Monday 3rd of September 2012 02:37:34 PM

I am trying to reconstruct a failed 4-disk RAID5 Western Digital ShareSpace device using 3 of the 4 disks, connected via USB to an Ubuntu 12.04 machine. The array appears to reassemble successfully:

mdadm --assemble --force /dev/md2 /dev/sde4 /dev/sdf4 /dev/sdg4
mdadm: /dev/md2 has been started with 3 drives (out of 4).
But when I try to mount it, the mount fails. I am logged in as root, and when I try to troubleshoot with mdadm I get odd output such as:
mdadm --examine /dev/md2
mdadm: No md superblock detected on /dev/md2.
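As far as I understand, --examine reads the md superblock stored on a component device, while --detail queries an assembled array, so the message above may simply mean that /dev/md2 is not itself a member of another array. A minimal sketch of examining the three members instead, assuming the device names from the assemble command are still current (the RUN dry-run switch is a convention of this sketch, not an mdadm feature):

```shell
#!/bin/sh
# Member partitions taken from the assemble command above.
MEMBERS="/dev/sde4 /dev/sdf4 /dev/sdg4"

# RUN defaults to "echo", so the commands are only printed (dry run).
# Invoke with RUN= (empty) as root to actually execute them.
RUN="${RUN-echo}"

examine_members() {
    for dev in $MEMBERS; do
        # --examine prints the md superblock found on a component device
        $RUN mdadm --examine "$dev"
    done
}

examine_members
```

Comparing the Events counter and Update Time across the members should show whether they agree with each other.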
Also, the system can't seem to find any volume group:
vgscan -v
    Wiping cache of LVM-capable devices
    Wiping internal VG cache
  Reading all physical volumes.  This may take a while...
    Finding all volume groups
  No volume groups found
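An empty vgscan result does not by itself say what is on the array. One way to narrow it down is to ask blkid what signature it sees on /dev/md2 and pick the next step from that. The sketch below assumes blkid from util-linux is installed; the suggest helper and its messages are made up for illustration.

```shell
#!/bin/sh
DEV="${DEV:-/dev/md2}"

# Map a blkid TYPE string to a suggested next step (hypothetical helper).
suggest() {
    case "$1" in
        LVM2_member)
            echo "LVM physical volume: run 'vgchange -ay', then mount the logical volume" ;;
        ext2|ext3|ext4|xfs)
            echo "plain $1 filesystem: mount $DEV directly" ;;
        "")
            echo "no recognisable signature found on $DEV" ;;
        *)
            echo "unexpected type '$1': inspect before mounting" ;;
    esac
}

# Only probe when the array device actually exists on this machine.
if [ -b "$DEV" ]; then
    # With these options blkid prints just the TYPE value, or nothing
    suggest "$(blkid -o value -s TYPE "$DEV")"
fi
```

If the type comes back as LVM2_member, then pvscan and lvscan should list the physical and logical volumes once the volume group is activated.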

I read in some other posts that the WD system uses LVM2. Could that be the issue? Here is the output from mdadm --detail:
mdadm --detail /dev/md2
        Version : 0.90
  Creation Time : Mon Oct 19 10:26:15 2009
     Raid Level : raid5
     Array Size : 5854981248 (5583.75 GiB 5995.50 GB)
  Used Dev Size : 1951660416 (1861.25 GiB 1998.50 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Sun Sep  2 15:22:50 2012
          State : clean, degraded 
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 4c4952ae:1477d756:234bdad8:bdaa1368
         Events : 0.9246753

    Number   Major   Minor   RaidDevice State
       0       8       84        0      active sync   /dev/sdf4
       1       8       68        1      active sync   /dev/sde4
       2       0        0        2      removed
       3       8      100        3      active sync   /dev/sdg4
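As a sanity check on the figures above: a RAID5 array gives up one member's worth of space to parity, so the array size should equal (Raid Devices - 1) times Used Dev Size. A small sketch using the values reported by mdadm --detail (both sizes in KiB):

```shell
#!/bin/sh
# Values copied from the mdadm --detail output above (sizes in KiB).
raid_devices=4
used_dev_size_kib=1951660416
reported_array_size_kib=5854981248

# RAID5 usable capacity is (n - 1) members; one member's worth holds parity.
expected=$(( (raid_devices - 1) * used_dev_size_kib ))

if [ "$expected" -eq "$reported_array_size_kib" ]; then
    echo "array size consistent: $expected KiB"
else
    echo "array size mismatch: expected $expected KiB, reported $reported_array_size_kib KiB"
fi
```

The two numbers agree, which suggests the degraded array geometry is being reported as intended.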

Here is the result of the mount attempt:
mount -t auto dev/md2 /mnt/raid
mount: special device dev/md2 does not exist
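The device path in that command has no leading slash, so mount resolves dev/md2 relative to the current directory, which would explain the "does not exist" message. A minimal sketch of a read-only attempt with the absolute path follows. Whether it succeeds still depends on /dev/md2 holding a filesystem directly rather than an LVM physical volume. The RUN dry-run switch and the helper names are conventions of this sketch.

```shell
#!/bin/sh
DEV="/dev/md2"      # absolute path, note the leading slash
MNT="/mnt/raid"

# RUN defaults to "echo" (dry run); invoke with RUN= as root to really mount.
RUN="${RUN-echo}"

# Succeeds only for paths that start with a slash.
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

mount_readonly() {
    if ! is_absolute "$1"; then
        echo "refusing relative device path: $1" >&2
        return 1
    fi
    # -o ro keeps the degraded array from being written to during recovery
    $RUN mount -o ro "$1" "$2"
}

mount_readonly "$DEV" "$MNT"
```

Mounting read-only seems the safer choice while the array is running on 3 of 4 drives.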

I'd appreciate any assistance. Thanks!

Last edited by Neo; 11-21-2017 at 10:59 AM..
