03-09-2009
I've had a similar experience, see below. The last contributor suggested that the problem may not be caused by the hard disk itself, but rather by an external device. In my case all servers have a CD drive. Could it be that under certain circumstances a CD/DVD could trigger this error on the hard disk?
The messages listed below are from /var/adm/messages at the time the T2000 server hung. The same error occurred on several T2000 servers over a period of 6 weeks. The submirrors d16 and d17 of mirror d15 were in the "needs maintenance" state. Slice 0 of one disk was somehow corrupt, and a simple metareplace wasn't sufficient to bring the system back up.
I would also really like to find the root cause of this error.
<date> <hostname> scsi_status=0, ioc_status=8043, scsi_state=0
<date> <hostname> scsi: [ID 107833 kern.notice] Device is gone
<date> <hostname> md_stripe: [ID 641072 kern.warning] WARNING: md: d17: write error on /dev/dsk/c0t3d0s6
<date> <hostname> 1 md_mirror: [ID 104909 kern.warning] WARNING: md: d17: /dev/dsk/c0t3d0s6 needs maintenance
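For anyone trying to spot the same pattern across several servers, here is a minimal sketch (my own illustration, not something taken from the logs above) that pulls the metadevice and slice out of the "needs maintenance" warnings. It assumes the message wording matches the lines quoted above; the function name md_needs_maintenance is made up.

```shell
#!/bin/sh
# Print "<metadevice> <component>" for every SVM "needs maintenance"
# warning read from standard input. Assumes the wording
# "WARNING: md: <dev>: <slice> needs maintenance" seen in the post.
md_needs_maintenance() {
    sed -n 's|.*WARNING: md: \([^:]*\): \(/dev/[^ ]*\) needs maintenance.*|\1 \2|p'
}

# Possible use on a live system (log path as given in the post):
#   md_needs_maintenance < /var/adm/messages | sort -u
```

Lines that only report a write error or "Device is gone" are ignored, so the output lists just the slices that metastat would be expected to flag.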
The servers have since been updated to the firmware listed below, and the problem has not recurred, but I still don't know what lies behind the error.
System Firmware 6.6.4 Netra[TM] T2000 2008/07/01 02:01
---------------------------------------------------------
ALOM-CMT v1.6.4 Jun 6 2008 16:52:02
VBSC 1.6.4.a Jun 6 2008 05:19:51
OBP 4.28.9 2008/06/30 21:26
Hypervisor 1.6.4 2008/06/06 04:57
Netra[TM] T2000 POST 4.28.6 2008/05/23 12:34
metareplace(1M) System Administration Commands metareplace(1M)
NAME
metareplace - enable or replace components of submirrors or RAID5 metadevices
SYNOPSIS
/usr/sbin/metareplace -h
/usr/sbin/metareplace [-s setname] -e mirror component
/usr/sbin/metareplace [-s setname] mirror component-old component-new
/usr/sbin/metareplace [-s setname] -e RAID component
/usr/sbin/metareplace [-s setname] [-f] RAID component-old component-new
DESCRIPTION
The metareplace command is used to enable or replace components (slices) within a submirror or a RAID5 metadevice.
When you replace a component, the metareplace command automatically starts resyncing the new component with the rest of the metadevice.
When the resync completes, the replaced component becomes readable and writable. If the failed component has been hot spare replaced, the
hot spare is placed in the available state and made available for other hot spare replacements.
Note that the new component must be large enough to replace the old component.
A component may be in one of several states. The Last Erred and the Maintenance states require action. Always replace components in the
Maintenance state first, followed by a resync and validation of data. After components requiring maintenance are fixed, validated, and
resynced, components in the Last Erred state should be replaced. To avoid data loss, it is always best to back up all data before replacing
Last Erred devices.
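The ordering rule above (Maintenance first, then Last Erred) can be sketched as a small filter. This is an illustration only: the input is a hypothetical, simplified "<component> <state>" list, one pair per line, and not raw metastat output; the function name replacement_order is likewise made up.

```shell
#!/bin/sh
# Input (hypothetical simplified format, NOT raw metastat output):
#   one "<component> <state>" pair per line, e.g. "c1t0d0s2 Maintenance".
# Output: only the components that need action, with every Maintenance
# component listed before any Last Erred component.
replacement_order() {
    awk '
        { comp = $1; $1 = ""; sub(/^ +/, ""); state = $0 }
        state == "Maintenance" { maint[++nm] = comp }
        state == "Last Erred"  { erred[++ne] = comp }
        END {
            for (i = 1; i <= nm; i++) print maint[i] " Maintenance"
            for (i = 1; i <= ne; i++) print erred[i] " Last Erred"
        }
    '
}
```

Components in any other state are dropped from the output, since the text above names only these two states as requiring action.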
OPTIONS
Root privileges are required for all of the following options except -h.
-e              Transitions the state of component to the available state and resyncs the failed component. If the failed
                component has been hot spare replaced, the hot spare is placed in the available state and made available
                for other hot spare replacements. This command is useful when a component fails due to human error (for
                example, accidentally turning off a disk), or because the component was physically replaced. In this
                case, the replacement component must be partitioned to match the disk being replaced before running the
                metareplace command.
-f              Forces the replacement of an errored component of a metadevice in which multiple components are in error.
                The component determined by the metastat display to be in the "Maintenance" state must be replaced first.
                This option may cause data to be fabricated since multiple components are in error.
-h              Display help message.
-s setname      Specifies the name of the diskset on which metareplace will work. Using the -s option will cause the
                command to perform its administrative function within the specified diskset. Without this option, the
                command will perform its function on local metadevices.
mirror          The metadevice name of the mirror.
component       The logical name for the physical slice (partition) on a disk drive, such as /dev/dsk/c0t0d0s2.
component-old   The physical slice that is being replaced.
component-new   The physical slice that is replacing component-old.
RAID            The metadevice name of the RAID5 device.
EXAMPLES
Example 1: Recovering from Error Condition in RAID5 Metadevice
This example shows how to recover when a single component in a RAID5 metadevice is errored.
# metareplace d10 c3t0d0s2 c5t0d0s2
In this example, a RAID5 metadevice d10 has an errored component, c3t0d0s2, replaced by a new component, c5t0d0s2.
Example 2: Use of -e After Physical Disk Replacement
This example shows the use of the -e option after a physical disk in a submirror (a submirror of mirror d11, in this case) has been
replaced.
# metareplace -e d11 c1t4d0s2
Note: The replacement disk must be partitioned to match the disk it is replacing before running the metareplace command.
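The partitioning step in the note above is commonly done on Solaris by copying the label from the surviving disk with prtvtoc(1M) and fmthard(1M). The sketch below only prints the commands rather than running them. It assumes SMI-labelled disks where slice 2 covers the whole disk; the surviving disk name c1t3d0 and the function name plan_disk_replacement are hypothetical and do not come from this manual page.

```shell
#!/bin/sh
# Print (do not execute) the commands for the sequence described above.
# Arguments: mirror, surviving disk, replaced disk, slice on replaced disk.
# Disk names are illustrative; slice 2 is assumed to be the whole-disk slice.
plan_disk_replacement() {
    mirror=$1
    good=$2
    new=$3
    slice=$4
    # Copy the partition table from the surviving disk to the new disk.
    echo "prtvtoc /dev/rdsk/${good}s2 | fmthard -s - /dev/rdsk/${new}s2"
    # Re-enable the replaced component so that the resync starts.
    echo "metareplace -e ${mirror} ${new}${slice}"
}

# Example: plan_disk_replacement d11 c1t3d0 c1t4d0 s2
```

Printing the commands first gives a chance to review the device names before anything is written to a disk label.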
EXIT STATUS
The following exit values are returned:
0 Successful completion.
>0 An error occurred.
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+-----------------------------+-----------------------------+
|Availability |SUNWmdu |
+-----------------------------+-----------------------------+
SEE ALSO
mdmonitord(1M), metaclear(1M), metadb(1M), metadetach(1M), metahs(1M), metainit(1M), metaoffline(1M), metaonline(1M), metaparam(1M),
metarecover(1M), metarename(1M), metaroot(1M), metaset(1M), metassist(1M), metastat(1M), metasync(1M), metattach(1M), md.tab(4), md.cf(4),
mddb.cf(4), attributes(5), md(7D)
Solaris Volume Manager Administration Guide
SunOS 5.10 8 Aug 2003 metareplace(1M)