DiskSuite: Breaking mirrors.


 
# 15  
Old 12-28-2005
i doubt that /var was ever mounted ro as that would mean that the system would have complained a lot earlier, since the system does write to /var/adm/messages fairly regularly even if it doesn't need to write to /var/adm/wtmpx ... i'm convinced that the filesystem had some corruption that gave it a problem booting up ... btw, what was the last thing that happened to the server prior to it having the boot problem? was it patched and then rebooted? if patched, was the server in single-user mode or without regular user logins/processes at all during the patching?

Last edited by Just Ice; 12-28-2005 at 01:18 PM..
# 16  
Old 12-28-2005
Quote:
Originally Posted by Just Ice
i doubt that /var was ever mounted ro as that would mean that the system would have complained a lot earlier, since the system does write to /var/adm/messages fairly regularly even if it doesn't need to write to /var/adm/wtmpx ...
The read only response was after the system came up in maintenance mode.

Quote:
i'm convinced that the filesystem had some corruption that gave it a problem booting up ...
Absolutely. I didn't start the troubleshooting on the system. It was handed over to me after the first guy had to go home. It was at the maintenance prompt when I got it. They were unable to Ctrl-Brk to the OpenBoot PROM since Chris (the H&E guy) couldn't find the Ctrl or Brk keys. I picked it up there.

The error I had was along the lines of (it's been two weeks) "can't create utmp".

Attempting to umount /var returned the mount point busy message. Chris hit enter on fsck before I could tell him to type in the specific file system I wanted to check. When it got to /var, it responded that /var was mounted read only.

Quote:
btw, what was the last thing that happened to the server prior to it having the boot problem? was it patched and then rebooted? if patched, was the server in single-user mode or without regular user logins/processes at all during the patching?
According to the team lead, an application (a cisco management package) was being upgraded which required a reboot after it was done.

Carl
# 17  
Old 12-28-2005
Quote:
Originally Posted by Just Ice
i doubt that /var was ever mounted ro
While I haven't seen this on Solaris, I have seen /var remounted as read-only on Linux boxes. It happened to two of our systems in the past couple of months, and I have one in that state right now. Apparently it's a symptom of a disk getting ready to go south.

Code:
Linux$ sudo ls -la
Password:
sudo: Can't open /var/run/sudo/carls/0: Read-only file system
collect: Cannot write ./dfjBSI0WES028034 (bfcommit, uid=51, gid=51): Read-only file system
queueup: cannot create queue file ./qfjBSI0WES028034, euid=51: Read-only file system

But that's Linux. :)
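For anyone who wants to confirm the state before the errors pile up, here is a minimal sketch of a check, assuming a Linux box where /proc/mounts lists the current mount options in the fourth column. The function name `is_mounted_ro` and the optional second argument (an alternate mounts file, handy for testing) are my own inventions, not anything from a standard tool.

```shell
#!/bin/sh
# Return 0 if MOUNTPOINT is currently mounted read-only, 1 otherwise.
# Usage: is_mounted_ro MOUNTPOINT [MOUNTS_FILE]
# MOUNTS_FILE defaults to /proc/mounts (Linux); this is an assumption.
is_mounted_ro() {
    mnt=$1
    mounts_file=${2:-/proc/mounts}
    awk -v m="$mnt" '
        $2 == m { opts = $4 }            # remember the last entry for this mount point
        END {
            n = split(opts, o, ",")      # options are comma-separated
            for (i = 1; i <= n; i++)
                if (o[i] == "ro") exit 0 # exact match, so "errors=remount-ro" does not count
            exit 1
        }' "$mounts_file"
}
```

Something like `is_mounted_ro /var && echo "/var is read-only"` would then do for a cron job or a quick look. The match is on the exact option `ro` rather than a substring, so an option such as `errors=remount-ro` is not mistaken for a read-only mount.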

Carl
# 18  
Old 12-28-2005
Quote:
Originally Posted by Just Ice
i doubt that /var was ever mounted ro as that would mean that the system would have complained a lot earlier, since the system does write to /var/adm/messages fairly regularly even if it doesn't need to write to /var/adm/wtmpx ... i'm convinced that the filesystem had some corruption that gave it a problem booting up ... btw, what was the last thing that happened to the server prior to it having the boot problem? was it patched and then rebooted? if patched, was the server in single-user mode or without regular user logins/processes at all during the patching?
Actually, that's not quite right. On a system where /var (or even root, if /var is on the root disk) is ro, the first complaint will be the message indicated by BOFH. If you have a spare test box, try it and see. This sometimes does happen even after an fsck has completed, and the command

mount -o rw,remount /var

is required to remount it read-write. I think BOFH has already said that he tried this. It might appear that a reboot should fix this, but that is not always the case. I have seen this message many times, but as far as I can recall it has never been the precursor to a disk failure, nor has it left the system unrecoverable.
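As a rough sketch of the remount step, the helper below only prints the command it would suggest, so it can be reviewed before anyone runs it as root. The function name `remount_rw_cmd` is hypothetical, and the Solaris branch assumes /var is a UFS filesystem, which the thread does not actually confirm.

```shell
#!/bin/sh
# Print (do not run) a command that remounts MOUNTPOINT read-write.
# Usage: remount_rw_cmd MOUNTPOINT [OS]
# OS defaults to the output of `uname -s`; pass it explicitly to preview
# the command for another platform.
remount_rw_cmd() {
    mnt=$1
    os=${2:-$(uname -s)}
    case "$os" in
        SunOS) echo "mount -F ufs -o rw,remount $mnt" ;;  # assumption: UFS filesystem
        *)     echo "mount -o rw,remount $mnt" ;;         # e.g. Linux
    esac
}
```

Running `remount_rw_cmd /var` on the affected box shows the line to type. Printing rather than executing is deliberate here, since a remount on a filesystem that still has damage is something you probably want to decide on by hand.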
# 19  
Old 12-29-2005
What's the problem with running fuser to see what process is holding on to the /var filesystem?
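A minimal sketch of that check, assuming `fuser -c` is available (it is on Solaris, and Linux fuser accepts `-c` as a synonym for `-m`). The function name `holders` is made up for illustration.

```shell
#!/bin/sh
# List the processes that have files open on the filesystem mounted at
# MOUNTPOINT, one line per process: PID, user, command line.
# Usage: holders MOUNTPOINT
holders() {
    mnt=$1
    # fuser writes the PIDs to stdout and the decorations (path, access
    # letters) to stderr, so dropping stderr leaves a plain PID list.
    for pid in $(fuser -c "$mnt" 2>/dev/null); do
        ps -o pid= -o user= -o args= -p "$pid"
    done
}
```

`holders /var` should then show what is keeping the mount point busy. On a box sitting at the maintenance prompt the list may be short, but it at least tells you what has to be stopped before umount will succeed.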
# 20  
Old 12-30-2005
Quote:
Originally Posted by reborg
Actually, that's not quite right. On a system where /var (or even root, if /var is on the root disk) is ro, the first complaint will be the message indicated by BOFH. If you have a spare test box, try it and see.
i didn't say anything to the contrary ... all i said was it is highly unlikely that the /var filesystem was ever intentionally mounted ro --- as in somebody edited /etc/vfstab then rebooted the box ...

Quote:
This sometimes does happen even after an fsck has completed, and the command

mount -o rw,remount /var

is required to remount it read-write. I think BOFH has already said that he tried this. It might appear that a reboot should fix this, but that is not always the case. I have seen this message many times, but as far as I can recall it has never been the precursor to a disk failure, nor has it left the system unrecoverable.
i agree that the ro filesystem error in itself does not necessarily mean that the drive is going to undergo disk failure or that the system is unrecoverable --- however --- experience tells me that a persistent ro filesystem error even after several successful fsck runs may be a symptom of a corrupted filesystem ... and since disk errors can also contribute to filesystem errors, i would not disqualify a failing disk from the picture that quickly when i start troubleshooting ...

Quote:
Originally Posted by BOFH
According to the team lead, an application (a cisco management package) was being upgraded which required a reboot after it was done.
i just googled this and found it interesting how a cisco software application that just sits on the server gets to corrupt the filesystem ... :)
# 21  
Old 12-30-2005
Quote:
i agree that the ro filesystem error in itself does not necessarily mean that the drive is going to undergo disk failure or that the system is unrecoverable --- however --- experience tells me that a persistent ro filesystem error even after several successful fsck runs may be a symptom of a corrupted filesystem
Agreed. However, since the filesystem was mounted ro, I didn't understand how a successful fsck was ever run. That was really the point I was getting at: unmount the filesystem so that the fsck can be run.