Quote:
as long as a system has a majority of its total number of replicas available, it will boot. It doesn't matter where these are, how many are on any one particular disk, etc., but obviously you want to spread these evenly around to make the configuration as resilient to disk failure as possible.
Exactly. I was more concerned with reconciling how I thought it was supposed to work with how it was actually working. Seeing a single replica in total didn't just raise a flag; it had the guy waving the flag jumping up and down and screaming.
I have one system with a single replica database. Per the man page:
Quote:
If there is only one replica and the system crashes, it is possible that all metadevice configuration data can be lost.
As long as that replica is stable, the system will continue to function; if the replica bails, the system won't boot. If there are two replicas and one goes south, we still have exactly half, which is enough to keep the system running, but it won't reboot unless we delete the bad replica first (assuming we can determine it's bad before rebooting).
The other problem, per the man page:
Quote:
The majority consensus algorithm accounts for the following: the system will stay running with exactly half or more replicas; the system will panic when less than half the replicas are available; the system will not reboot without one more than half the total replicas.
It appears that if the system crashes with only two replicas and there's an inconsistency, the system will not boot (it needs one more than half to boot).
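The three rules quoted from the man page boil down to a little arithmetic. Here's a sketch of my own (`quorum` is a hypothetical helper, not a Solaris command; `total` is replicas configured, `avail` is replicas still readable):

```shell
#!/bin/sh
# Sketch of the majority-consensus rules from the metadb man page.
# Not a real Solaris tool; just the arithmetic behind the three rules.
quorum() {
    total=$1; avail=$2
    if [ $((2 * avail)) -lt "$total" ]; then
        echo panic          # less than half available: system panics
    elif [ $((2 * avail)) -gt "$total" ]; then
        echo run-and-boot   # one more than half: runs and will reboot
    else
        echo run-only       # exactly half: stays running, won't reboot
    fi
}

quorum 3 3   # all three healthy
quorum 2 1   # two replicas, one lost
quorum 3 1   # three replicas, two lost
```

Note the asymmetry: exactly half keeps a running system alive, but booting always requires a strict majority.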
So if the system has three replicas and two of them become unstable, the system won't boot. Based on the section I quoted, three is the minimum if there's any inconsistency, and more is better, in odd numbers. So five sounds like the absolute best without getting silly. And if you have two disks, five replicas on each disk would be best as well. They only take about 512 KB per replica, and the minimum partition size (on the 73 GB drive) is 10 MB.
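The two-disk, five-per-disk layout can be checked with the same kind of arithmetic (my own sketch, not a Sun tool): losing a whole disk leaves exactly half the replicas, and the space math works out comfortably.

```shell
#!/bin/sh
# Two disks, five replicas each: what happens when one whole disk dies?
per_disk=5; disks=2
total=$((per_disk * disks))         # 10 replicas configured
avail=$((total - per_disk))         # lose one disk: 5 readable

if [ $((2 * avail)) -gt "$total" ]; then state="runs and reboots"
elif [ $((2 * avail)) -ge "$total" ]; then state="runs, won't reboot"
else state="panics"
fi
echo "$state"                       # exactly half survive

# Space check: five 512 KB replicas in a 10 MB slice.
echo "$((per_disk * 512)) KB of $((10 * 1024)) KB used"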
Based on the man pages, I'd say three is the minimum and five is preferred.
man page from Sun
To answer my original questions: a system will continue to function with 1 or 2 replicas, but it's set up to fail as soon as there's an inconsistency.
Quote:
All replicas on a system are the same, they are not associated with the specific volumes/mirrors or anything like that.
Yep, I knew that. It was the number that most bothered me, plus trying to tell "required" apart from "suggested" and "highly suggested" in the man pages.
Quote:
If you are going to replace disks, I would first delete the state database replicas from that disk (metadb -d /dev/dsk/cXtXdXsX) so that you don't end up with less than a majority available on boot.
Technically, if a disk is dead, you can't remove the replicas from that disk itself (it won't respond). Each replica also knows where the other replicas are, so when you run metadb, it pulls the info from one of the available replicas. When you run metadb -f -d c1t1d0s7 (for example), you're removing the pointers from one of the replicas, and that info is then distributed to the others.
Lastly, to clear errored entries from the replica databases, you have to delete the bad entries. Since you can't just add a single one back, the only way to replace it is to delete all the replicas from that slice and then recreate them.
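Reusing the example slice c1t1d0s7 from above, the delete-and-recreate sequence might look like this. I've wrapped the commands in a print-only run helper so the sketch is a dry run; on a real system you'd call metadb directly:

```shell
#!/bin/sh
# Dry-run sketch of replacing errored replicas on one slice.
# The slice name is just an example; substitute your own.
run() { echo "+ $*"; }                  # print instead of executing

run metadb -i                           # inspect replica status first
run metadb -d /dev/dsk/c1t1d0s7         # delete ALL replicas on the slice
run metadb -a -c 5 /dev/dsk/c1t1d0s7    # recreate five replicas there
```

The -c 5 recreates all five replicas on the slice in one shot, which is why the delete step has to take out the whole slice first.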
Thanks Phil. I appreciate the answer.
Carl