Post 302562352 by Loic Domaigne on Thursday 6th of October 2011 03:04:10 PM
Identify failed disk in Linux RAID

Good Evening,

Two years ago, I set up an Ubuntu file server for a friend who is an amateur photographer. Basically, the server offers a software RAID-5 that can be accessed remotely from a Mac. Unfortunately, I didn't label the hard drives (i.e. note which physical drive corresponds to which /dev/sdX device).

Now a drive has failed, and the RAID-5 is at risk. I need to find out which physical drive we have to replace before we can rebuild the array. I have summed up below the procedure I'd follow. It would be great if some Linux software RAID connoisseur could review it. The more eyeballs, the better; besides, Linux RAID is fairly new territory for me.

1. Stop the RAID array:
# umount /dev/md1
# mdadm -S /dev/md1

2. Unplug the hard drives one by one and look in dmesg for failure events on /dev/sdX. That way, the mapping between each physical disk and its /dev/sdX device is revealed step by step.

3. Replace the failed disk and partition it to match the layout of the other drives.

4. Rebuild the array with the new disk:
- get the UUID with mdadm --query
- assemble the array with the new disk: mdadm --assemble /dev/md -u XXX
- update /etc/mdadm.conf: mdadm --detail --scan >> /etc/mdadm.conf
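
For step 2, a less disruptive option might be to match the device name against the drive's serial number, which is normally printed on the drive label. On systems running udev, /dev/disk/by-id usually holds symlinks whose names contain the model and serial number and which point at the sdX nodes. The following is only a sketch under that assumption; ids_for_device and the BY_ID_DIR override are hypothetical names introduced for illustration, not part of mdadm or udev.

```shell
#!/bin/sh
# Sketch: list the persistent IDs (model + serial) that point at one kernel
# device name. BY_ID_DIR is a hypothetical override, useful for testing;
# the udev-maintained directory is normally /dev/disk/by-id.
BY_ID_DIR="${BY_ID_DIR:-/dev/disk/by-id}"

ids_for_device() {
    dev="$1"    # kernel name without the /dev/ prefix, e.g. sdd
    for link in "$BY_ID_DIR"/*; do
        # Skip anything that is not a symlink (also covers an empty directory).
        [ -L "$link" ] || continue
        # The symlink target looks like ../../sdd; keep only the last component.
        target=$(readlink "$link")
        if [ "${target##*/}" = "$dev" ]; then
            printf '%s\n' "${link##*/}"
        fi
    done
}
```

Calling ids_for_device sdd would list the IDs attached to /dev/sdd. Whether the serial number embedded in the name matches the sticker on the drive still has to be checked by eye.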

You will find detailed information about the server setup below.

TIA,
Loïc

The setup:

Ubuntu server, 6 SATA hard drives /dev/sda ... /dev/sdf

Each drive (X = a..f) is partitioned as follows:
/sdX1 type Linux Partition
/sdX2 type swap
/sdX3 type extended
/sdX5 type RAID


The server has 2 software RAID arrays:
/dev/md0 RAID1 /sda1 and /sdb1
/dev/md1 RAID5 /sda5, /sdb5, /sdc5, /sdd5, /sde5, /sdf5

The OS is located on /dev/md0; only application data is located on /dev/md1.
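
Since the OS sits on /dev/md0 and only data lives on /dev/md1, an alternative to stopping and re-assembling the array might be mdadm's manage-mode options (--fail, --remove, --add). The sketch below only prints a possible command sequence and runs nothing. It is not the procedure from this post, and it assumes an MBR partition table (so that sfdisk -d can copy it) and that the new disk shows up under the device name passed in. replacement_plan is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print (do not execute) one possible command sequence for swapping
# a failed RAID member. All device names are passed in by the caller.
replacement_plan() {
    array="$1"         # e.g. /dev/md1
    old_part="$2"      # failed member, e.g. /dev/sdd5
    healthy_disk="$3"  # disk whose partition table is copied, e.g. /dev/sda
    new_disk="$4"      # replacement disk, e.g. /dev/sdd
    new_part="$5"      # member partition on the new disk, e.g. /dev/sdd5
    cat <<EOF
mdadm $array --fail $old_part
mdadm $array --remove $old_part
# power down, swap the physical drive, boot again
sfdisk -d $healthy_disk | sfdisk $new_disk
mdadm $array --add $new_part
EOF
}

# Example with the device names from this setup (assumed, check before use).
replacement_plan /dev/md1 /dev/sdd5 /dev/sda /dev/sdd /dev/sdd5
```

Printing the commands instead of running them leaves room to review every device name first, which matters because a wrong target for sfdisk would overwrite a healthy drive's partition table.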

The Failure:

A Fail event had been detected on md device /dev/md1.
It could be related to component device /dev/sdd5.
The /proc/mdstat file currently contains the following:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sde5[4] sdc5[2] sdd5[6](F) sdf5[5] sdb5[1] sda5[0]
9636429120 blocks level 5, 64k chunk, algorithm 2 [6/5] [UUU_UU]

md0 : active raid1 sdb1[1] sda1[0]
20506816 blocks [2/2] [UU]


unused devices: <none>
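
The failed member can also be picked out of /proc/mdstat mechanically by looking for the (F) flag. This is a minimal sketch that assumes the mdstat layout shown above; failed_members is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read mdstat-style text on stdin and print "array member" for every
# component device carrying the (F) flag.
failed_members() {
    awk '
        # Array lines look like: md1 : active raid5 sde5[4] sdd5[6](F) ...
        $2 == ":" {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\(F\)/) {
                    member = $i
                    sub(/\[.*/, "", member)   # drop the "[6](F)" suffix
                    print $1, member
                }
            }
        }
    '
}
```

On a live system it would be fed with failed_members < /proc/mdstat, which for the output above should report md1 sdd5.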
 
