SVM metastat -- needs maintenance


 
# 1  
Old 01-26-2006

Running Solaris 9 with SVM. I'm not that familiar with it, but metastat reports "Needs maintenance" on two of the mirrors (d50 and d10). There are no errors in /var/adm/messages. What do I need to do to fix this? Thanks.

Quote:
# metastat
d50: Mirror
    Submirror 0: d51
      State: Needs maintenance
    Submirror 1: d52
      State: Needs maintenance
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 65431680 blocks (31 GB)

d51: Submirror of d50
    State: Needs maintenance
    Invoke: metareplace d50 c1t0d0s5 <new device>
    Size: 65431680 blocks (31 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s5          0     No     Maintenance   Yes


d52: Submirror of d50
    State: Needs maintenance
    Invoke: after replacing "Maintenance" components:
                metareplace d50 c1t1d0s5 <new device>
    Size: 65431680 blocks (31 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t1d0s5          0     No      Last Erred   Yes


d40: Mirror
    Submirror 0: d41
      State: Okay
    Submirror 1: d42
      State: Okay
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 8201856 blocks (3.9 GB)

d41: Submirror of d40
    State: Okay
    Size: 8201856 blocks (3.9 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s4          0     No            Okay   Yes


d42: Submirror of d40
    State: Okay
    Size: 8201856 blocks (3.9 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t1d0s4          0     No            Okay   Yes


d30: Mirror
    Submirror 0: d31
      State: Okay
    Submirror 1: d32
      State: Okay
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 40968576 blocks (19 GB)

d31: Submirror of d30
    State: Okay
    Size: 40968576 blocks (19 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s3          0     No            Okay   Yes


d32: Submirror of d30
    State: Okay
    Size: 40968576 blocks (19 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t1d0s3          0     No            Okay   Yes


d10: Mirror
    Submirror 0: d11
      State: Needs maintenance
    Submirror 1: d12
      State: Needs maintenance
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 16393536 blocks (7.8 GB)

d11: Submirror of d10
    State: Needs maintenance
    Invoke: metareplace d10 c1t0d0s0 <new device>
    Size: 16393536 blocks (7.8 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s0          0     No     Maintenance   Yes


d12: Submirror of d10
    State: Needs maintenance
    Invoke: after replacing "Maintenance" components:
                metareplace d10 c1t1d0s0 <new device>
    Size: 16393536 blocks (7.8 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t1d0s0          0     No      Last Erred   Yes


Device Relocation Information:
Device   Reloc  Device ID
c1t1d0   Yes    id1,ssd@w2000000c50566da1
c1t0d0   Yes    id1,ssd@w2000000c50568c1d
# 2  
Old 01-26-2006
Just try the metasync(1M) command.

This may be caused by the "metasync -r" command not getting executed when the system boots, or if the system boots up only to single-user mode.

This metasync command is normally executed in one of the startup scripts run at boot time.

For Online: DiskSuite[TM] 1.0, the metasync command is located in the /etc/rc.local script. This entry is placed in that file by the metarc command.

For Solstice DiskSuite versions between 3.x and 4.2, inclusive, the metasync command is located in the /etc/rc2.d/S95SUNWmd.sync file.

For Solstice DiskSuite version 4.2.1 and above, the metasync command is located in the file /etc/rc2.d/S95lvm.sync.

In all cases, because this script is not run until the system transitions into multi-user mode (the rc2.d scripts run on entry to run level 2), both submirrors can be expected to show "Needs maintenance" until the command has run. I/O to these metadevices works just fine while in this state, so there is no need to worry.
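
To check whether the box ever got to a multi-user run level (2 or 3) before running the resync by hand, something like the following might help. It is only a sketch: `current_run_level` and `is_multi_user` are made-up helper names, and it assumes a POSIX-style shell (ksh or /usr/xpg4/bin/sh on Solaris, not the old /bin/sh) and the usual `run-level N` wording in `who -r` output.

```shell
# Hypothetical helper: print the run level found in `who -r` output on stdin.
current_run_level() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "run-level") { print $(i + 1); exit }
    }'
}

# Hypothetical helper: succeed if the given run level is multi-user (2 or 3).
is_multi_user() {
    case "$1" in
        2|3) return 0 ;;
        *)   return 1 ;;
    esac
}

# Typical use on the live system (not executed here):
#   lvl=$(who -r | current_run_level)
#   is_multi_user "$lvl" || echo "run level $lvl: boot-time metasync has not run"
#   metasync -r
```

If the run level turns out to be S or 1, the boot-time script simply has not fired yet, and running `metasync -r` manually is the equivalent step.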

If that doesn't help, you may be in the situation described in bug 82642:

When trying to run the metasync command, the c1t0d0s0 device reported errors in /var/adm/messages:

Sep 15 09:11:17 bobbob scsi: WARNING: /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w2100002037f396c9,0 (ssd1):
Sep 15 09:11:17 bobbob Error for Command: read(10) Error Level: Retryable
Sep 15 09:11:17 bobbob scsi: Requested Block: 4057844 Error Block: 4057969
Sep 15 09:11:17 bobbob scsi: Vendor: SEAGATE Serial Number: 0107D1MVCF
Sep 15 09:11:17 bobbob scsi: Sense Key: Media Error
Sep 15 09:11:17 bobbob scsi: ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0xe4
Sep 15 09:11:19 bobbob scsi: WARNING: /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w2100002037f396c9,0 (ssd1):
Sep 15 09:11:19 bobbob Error for Command: read(10) Error Level: Retryable
Sep 15 09:11:19 bobbob scsi: Requested Block: 4057844 Error Block: 4057969
Sep 15 09:11:19 bobbob scsi: Vendor: SEAGATE Serial Number: 0107D1MVCF
Sep 15 09:11:19 bobbob scsi: Sense Key: Media Error
Sep 15 09:11:19 bobbob scsi: ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0xe4


In this case, the same block is being reported as having problems.
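
With a longer log it is easier to tally the failing block numbers than to eyeball them. A minimal sketch, assuming the messages use the same "Error Block: N" wording as above; `tally_error_blocks` is a made-up name, and on Solaris nawk may be needed instead of the old awk.

```shell
# Hypothetical helper: tally "Error Block:" values from syslog text on stdin.
# Prints one "count block" pair per distinct block number, sorted by block.
tally_error_blocks() {
    awk '
        /Error Block:/ {
            for (i = 1; i + 2 <= NF; i++)
                if ($i == "Error" && $(i + 1) == "Block:") count[$(i + 2)]++
        }
        END { for (b in count) print count[b], b }
    ' | sort -k2,2n
}

# Typical use (not executed here):
#   tally_error_blocks < /var/adm/messages
```

A single block number with a high count points at one bad spot on the disk; many different block numbers suggest the whole drive is on its way out.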

Resolution:

The bad block can be fixed by running format --> analyze --> read on the c1t0d0 disk.

# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
0. c1t0d0 <SUN36G cyl 24620 alt 2 hd 27 sec 107>
/pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w2100002037f396c9,0
1. c1t1d0 <SUN36G cyl 24620 alt 2 hd 27 sec 107>
/pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w2100002037f8c663,0
Specify disk (enter its number): 0
selecting c1t0d0
format> analyze
analyze> read
Ready to analyze (won't harm SunOS). This takes a long time,
but is interruptable with CTRL-C. Continue? y


pass 0
Medium error during read: block 4057969 (0x3deb71) (1404/16/101)
ASC: 0x11 ASCQ: 0x0
Sep 15 09:26:59 bobbob scsi: WARNING: /pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w2100002037f396c9,0 (ssd1):
Sep 15 09:26:59 bobbob Error for Command: read(10) Error Level: Retryable
Sep 15 09:26:59 bobbob scsi: Requested Block: 4057969 Error Block: 4057969
Sep 15 09:26:59 bobbob scsi: Vendor: SEAGATE Serial Number: 0107D1MVCF
Sep 15 09:26:59 bobbob scsi: Sense Key: Media Error
Sep 15 09:26:59 bobbob scsi: ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0xe4
Repairing hard error on 4057969 (1404/16/101)...ok.


24619/26/53


pass 1
24619/26/53


Total of 1 defective blocks repaired.
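
The three ways format names the bad spot, 4057969, 0x3deb71 and 1404/16/101, are the same block. The arithmetic can be checked with the geometry from the disk label (hd 27 sec 107). `block_to_chs` is just an illustrative name, and the sketch assumes a shell with `$(( ))` arithmetic (ksh or /usr/xpg4/bin/sh on Solaris).

```shell
# Hypothetical helper: convert a logical block number to cylinder/head/sector
# for a disk with the given number of heads and sectors per track.
block_to_chs() {
    block=$1 heads=$2 sectors=$3
    per_cyl=$((heads * sectors))      # blocks in one cylinder
    cyl=$((block / per_cyl))
    rem=$((block % per_cyl))          # offset of the block inside that cylinder
    echo "$cyl/$((rem / sectors))/$((rem % sectors))"
}

block_to_chs 4057969 27 107     # prints 1404/16/101
printf '0x%x\n' 4057969         # prints 0x3deb71
```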


Now running metasync completes.

# metasync d10
# metastat d10
d10: Mirror
    Submirror 0: d0
      State: Needs maintenance
    Submirror 1: d1
      State: Okay
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 69078879 blocks


d0: Submirror of d10
    State: Needs maintenance
    Invoke: after replacing "Maintenance" components:
                metareplace d10 c1t0d0s0 <new device>
    Size: 69078879 blocks
    Stripe 0:
        Device     Start Block  Dbase        State Hot Spare
        c1t0d0s0          0     No      Last Erred


d1: Submirror of d10
    State: Okay
    Size: 69078879 blocks
    Stripe 0:
        Device     Start Block  Dbase        State Hot Spare
        c1t1d0s0          0     No            Okay


And then, metareplace can be executed.

# metareplace -e d10 c1t0d0s0
# metastat d10
d10: Mirror
    Submirror 0: d0
      State: Okay
    Submirror 1: d1
      State: Okay
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 69078879 blocks


d0: Submirror of d10
    State: Okay
    Size: 69078879 blocks
    Stripe 0:
        Device     Start Block  Dbase        State Hot Spare
        c1t0d0s0          0     No            Okay


d1: Submirror of d10
    State: Okay
    Size: 69078879 blocks
    Stripe 0:
        Device     Start Block  Dbase        State Hot Spare
        c1t1d0s0          0     No            Okay
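
To pull just the slices that still need attention out of a long metastat listing, a filter along these lines could be used. It is a rough sketch, not part of SVM: `failed_components` is a made-up name, it assumes the cXtYdZsN slice naming and column layout shown in this thread, and it lists "Maintenance" slices ahead of "Last Erred" ones because the Invoke hint in metastat says to deal with the "Maintenance" components first.

```shell
# Hypothetical helper: list slices whose state in `metastat` output is not Okay.
# Reads metastat text on stdin; prints "slice: state", Maintenance entries first.
failed_components() {
    awk '
        $1 ~ /^c[0-9]+t[0-9]+d[0-9]+s[0-9]+$/ && ($3 == "Yes" || $3 == "No") {
            state = $4
            if ($4 == "Last" && $5 == "Erred") state = "Last Erred"
            if (state == "Maintenance")
                first = first $1 ": " state "\n"
            else if (state != "Okay")
                rest = rest $1 ": " state "\n"
        }
        END { printf "%s%s", first, rest }
    '
}

# Typical use (not executed here):
#   metastat | failed_components
#   metareplace -e d10 c1t0d0s0     # then re-enable each listed slice in turn
```

No output means every component is Okay.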

regards pressy
# 3  
Old 01-26-2006
Maybe I misunderstood your post, but here is what I did. It looks like nothing is happening and I don't see anything in the logs.

Code:
# metasync d50
# metastat d50
d50: Mirror
    Submirror 0: d51
      State: Needs maintenance
    Submirror 1: d52
      State: Needs maintenance
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 65431680 blocks (31 GB)

d51: Submirror of d50
    State: Needs maintenance
    Invoke: metareplace d50 c1t0d0s5 <new device>
    Size: 65431680 blocks (31 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s5          0     No     Maintenance   Yes


d52: Submirror of d50
    State: Needs maintenance
    Invoke: after replacing "Maintenance" components:
                metareplace d50 c1t1d0s5 <new device>
    Size: 65431680 blocks (31 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t1d0s5          0     No      Last Erred   Yes


Device Relocation Information:
Device   Reloc  Device ID
c1t0d0   Yes    id1,ssd@w2000000c50568c1d
c1t1d0   Yes    id1,ssd@w2000000c50566da1

# 4  
Old 01-26-2006
It looks to me like you lost a disk: c1t1d0s5. I'll bet that "iostat -En" will confirm that. That format command that pressy shows does look interesting, but I don't like trying to repair a disk. I would replace it.
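
On a box with many disks, the `iostat -En` counters can be filtered down to the devices that actually show hard or transport errors. This is a sketch under the assumption that each device's first line looks like "cXtYdZ Soft Errors: N Hard Errors: N Transport Errors: N"; `disks_with_errors` is a made-up name. Soft errors are ignored on purpose, since a CD or DVD drive often shows one or two without anything being wrong.

```shell
# Hypothetical helper: print devices with non-zero hard or transport errors.
# Reads `iostat -En` text on stdin.
disks_with_errors() {
    awk '/Soft Errors:/ && NF >= 10 {
        # $7 is the Hard Errors count, $10 the Transport Errors count
        if ($7 + 0 > 0 || $10 + 0 > 0)
            print $1, "hard=" $7, "transport=" $10
    }'
}

# Typical use (not executed here):
#   iostat -En | disks_with_errors
```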
# 5  
Old 01-26-2006
I don't see anything strange in that output:
Quote:
iostat -En
c0t6d0 Soft Errors: 1 Hard Errors: 0 Transport Errors: 0
Vendor: TOSHIBA Product: DVD-ROM SD-M1711 Revision: 1005 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 1 Predictive Failure Analysis: 0
c1t1d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE Product: ST373307FSUN72G Revision: 0307 Serial No: 0334B1RPX4
Size: 73.40GB <73400057856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE Product: ST373307FSUN72G Revision: 0307 Serial No: 0334B1S42R
Size: 73.40GB <73400057856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
Also, my root mirror is complaining. It's posted in the original post. Anyhow, how can I be sure 1) that it's a disk failure and 2) which disk I need to replace?
# 6  
Old 01-26-2006
With nothing showing up in iostat -En, now I think it probably isn't a bad disk. So I don't know what to tell you.
# 7  
Old 01-26-2006
I think you need to give more info - I noticed the ssd on one of your outputs.

What type of server? Are these internal drives to the server or in arrays?
What type of arrays (if they are)?

Where are your metadb state databases (found with metadb command with no options)?

What are the failing partitions? What's on the failing partitions (OS only, OS and Applications - and of course, what applications)?

I'm assuming that SVM was the standard with Solaris 9 - if not, please post the version of it.

Also, what if anything, was changed before you noticed all of this - reboots, upgrades,...etc.?

And you state no errors in messages file - is syslogd running? Do you normally get error messages on this system? Double check that you are looking at the correct file for errors by looking at syslog.conf.
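
For the last point, one way to see where error-level messages are really being written is to pull the file destinations out of syslog.conf. A rough sketch only: `error_log_files` is a made-up name, and it assumes the plain "selector<TAB>action" layout, so it skips comment lines and the m4 `ifdef` entries that Solaris ships in that file.

```shell
# Hypothetical helper: print file destinations for selectors that mention "err".
# Reads syslog.conf text on stdin.
error_log_files() {
    awk '
        /^#/ || NF < 2 { next }                      # skip comments and blanks
        $1 ~ /err/ && $NF ~ /^\// { print $NF }      # keep absolute-path actions
    '
}

# Typical use (not executed here):
#   error_log_files < /etc/syslog.conf
#   pgrep syslogd || echo "syslogd is not running"
#   metadb
```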

Last edited by RTM; 01-26-2006 at 06:01 PM..