Quote:
Originally Posted by Perderabo
That's not the end of the road. Use fuser to find out why. lsof would be better, but it probably won't be there in single user mode.
Isn't wtmpx the most likely issue here? If the system needs wtmp and it's holding on so I can't umount, how is knowing that going to help? And it might not be wtmpx, it could be that I should have been able to umount /var and didn't realize that. When I couldn't umount it or remount it rw (which I also tried), I figured it wasn't possible and moved on to breaking mirrors.
Can you really force a umount? Something I'll have to investigate.
But I will keep fuser in mind for next time.
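For next time, here is a rough sketch of the sequence being suggested: ask fuser what is holding the file system, try a plain umount, and fall back to a forced one only if that fails. It assumes a Solaris release whose umount accepts -f, and the run and free_mount helpers are made-up names for illustration. By default it only prints the commands, so nothing is touched until RUN=1 is set.

```shell
#!/bin/sh
# Sketch only, not a tested recovery procedure.
# run: print the command unless RUN=1 is set, in which case execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# free_mount: hypothetical helper, takes a mount point such as /var.
free_mount() {
    mp=$1
    # fuser -c reports processes using the mount point or any file on it.
    run fuser -c "$mp"
    # Try a normal unmount first; force it only if that fails.
    # Assumption: this umount supports -f (forced unmount).
    run umount "$mp" || run umount -f "$mp"
}
```

Calling `free_mount /var` with RUN unset just lists what would be run. If fuser shows that only something like the wtmpx writer holds the file system, that at least tells you whether forcing the unmount is a reasonable risk.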
Quote:
That policy is fine for the first few hundred questions. Then what? How many more questions will there be? 1,000 more? 10,000 more? Do you abort an fsck that might have had one more question to ask? Is it safe to abort fsck and restart it with -y? A few sessions like this and that "fsck -n" starts to look pretty good. "fsck -n" will only take a few seconds if everything is ok. And it can always be safely aborted since it doesn't change anything. But if you don't like "fsck -n", my other trick is jamming the "y" key down with a paper wad.
You can use fsck -y
Seriously though, if there are 1000 questions, what are you going to do by answering 'n' to all of them except know that there are a bleeding lot of them? Especially when there's some guy on the other side of the phone who has absolutely no idea what's scrolling off the console or even worse, if I'm tipping in and can't interrupt. Do I go away for coffee then and come back in 2 hours?
I figure that after 20 or so 'y's, there might be something even more serious going on and I may have to consider newfs'ing the slice and copying over the apparently good slice, or hoping there's a good backup somewhere.
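One way to reconcile the two approaches might be to let a read-only pass do the counting, then decide between fsck -y and newfs from the number. This is only a sketch: it assumes the read-only run echoes each question with its automatic answer as a line ending in "? no", and count_fsck_questions and the device name are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Reads "fsck -n" style output on stdin and prints how many
# repair questions were asked.
# Assumption: each question line ends with "? no" in read-only mode.
count_fsck_questions() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c '? no *$'
}

# Hypothetical usage on a raw slice (device name is an example only):
#   fsck -n /dev/rdsk/c0t0d0s3 | count_fsck_questions
```

A count in the single digits would argue for fsck -y; a count in the hundreds would be the cue to think about newfs and a restore before touching anything.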
And another couple of data points that I didn't think important for the initial question:
1) The keyboard wasn't a Sun keyboard. It was, I believe, an MVS type keyboard. Two rows of function keys along the top as near as I could understand. Chris couldn't find a break key and the control key was labeled something else. We had to use Ctrl+[ to generate the necessary escape sequence in vi since there wasn't an escape key either.
2) Also, the console connection was somewhat flaky. Kept throwing odd characters on the screen from time to time and locked the console up so that it had to be power cycled to recover.
Really though. In spite of all the other questions and suggestions, the real problem was, why didn't breaking the mirror succeed? Why couldn't I disassemble the mirror and bring the system back up on a single disk in the manner I described?
Right now, mirroring is one of this company's backup schemes so this is important.
1. Tivoli for the paying customers.
It's interesting that user crontabs continue to function after a user's password expires on AIX and Red Hat, but stop processing on Solaris. This was a cause of backup failures on our Sun boxes. (This is the responsibility of another department so we don't get whacked.)
2. Mirrors to ensure system availability in the event of a failed disk.
Unless of course, we're unable to boot to the "good" disk for some reason.
3. flarchive for a jumpstart recovery if the system can't be mirrored.
Currently 2 and 3 are mutually exclusive. If the system isn't mirrored, we use flarchive. Not enough disk for flarchives of all the systems.
So is there a step I missed? Something else I should have tried? Granted, there might have been a simple problem in vfstab. I'll check that and check it on the other systems. I'll even post the vfstab here if it'll help. It sounds like I should have been able to umount /var. If true, I'll try harder to do that when I purposely break one of the lab boxes as a test.
Carl