

Solstice Disksuite Question


 
# 1  
Old 08-24-2006
Solstice Disksuite Question

According to the metadb man page:

Quote:
Replicated databases have an inherent problem in determining which database has valid and correct data. To solve this problem, Volume Manager uses a majority consensus algorithm. This algorithm requires that a majority of the database replicas be available before any of them are declared valid. This algorithm strongly encourages the presence of at least three initial replicas, which you create. A consensus can then be reached as long as at least two of the three replicas are available. If there is only one replica and the system crashes, it is possible that all metadevice configuration data can be lost.
Now, based on the last sentence of that quote: if a system only has a single replica and never crashes, it should still function.

Same with booting. I thought three replicas were necessary for the system to be able to come up, but the comment about crashing tells me that as long as the replica databases are stable (in other words, the system never crashes), it should come up.

The reason I ask is that I've started a new position and discovered eight disks in need of maintenance (metareplace), one disk with all partitions out of sync, and one disk with a status of Unknown.

In addition, I found several replicas that have failed (W, write errors, and M, replica had a problem with master blocks) and quite a few systems with fewer than three replicas per disk (I prefer to configure a minimum of three replicas per disk since disks are so large). At least one system had only a single replica database, and two or three had just two.
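For reference, the whole audit comes down to a few commands. A sketch follows, echoed as a dry run because the device names (d10, c1t1d0s0) are made up, not the real layout:

```shell
# Dry-run audit sketch: commands are echoed for review; drop the
# echo to run them on the affected host. Device names are hypothetical.
audit=$(
  # metadb -i lists every replica with its status flags and a legend:
  # W = write errors, M = problem with master blocks, a = active.
  echo "metadb -i"
  # metastat shows submirrors needing maintenance or resyncing.
  echo "metastat | egrep -i 'maint|sync|unknown'"
  # metareplace -e re-enables a repaired/replaced slice in a mirror.
  echo "metareplace -e d10 c1t1d0s0"
)
printf '%s\n' "$audit"
```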

Per the comment in the man page, you need a majority of replicas for the state information to be validated (and technically, one out of one is a majority).

The problem is that systems are functioning correctly with one or two replicas, and the one with a single replica was rebooted four days ago. Since it came back up, I have to believe that three replicas aren't required. In fact, as long as the replica database is stable, you can boot with a single replica (based on the evidence).

So:

How does a system with only two replicas still manage to operate correctly? I can logically see how one works (one of one is a majority), but with two, a disagreement between them could never yield a majority, so I'd expect that configuration to fail.

Are my thoughts on a single replica and the system not crashing accurate? Or at least logical?


My next steps appear to be getting the bad disks replaced (easy enough), but also getting the replicas in order. For the systems with one or two replicas, I think I'm going to have to break the mirrors and rebuild with what I believe is the correct number of replicas (some of the disks have five replicas each).
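Since the replicas aren't tied to the mirrors, it may even be possible to fix the counts without breaking anything. A sketch of the rebuild plan, with hypothetical slice names, echoed as a dry run:

```shell
# Dry-run rebuild plan: bring each hypothetical slice up to three
# replicas. Commands are echoed for review; drop the echo to run.
plan=$(
  for slice in c0t0d0s7 c1t0d0s7; do
    # -d removes the existing replicas on the slice; -a -c 3 adds
    # three fresh ones; -f forces it when quorum math would object.
    echo "metadb -d -f /dev/dsk/$slice"
    echo "metadb -a -f -c 3 /dev/dsk/$slice"
  done
)
printf '%s\n' "$plan"
```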

Thoughts? Pointers to more technical information than what's on docs.sun.com?

Thanks.

Carl
# 2  
Old 08-26-2006
Carl,

as long as a system has a majority of its total number of replicas available, it will boot. It doesn't matter where these are, how many are on any one particular disk, etc., but obviously you want to spread these evenly around to make the configuration as resilient to disk failure as possible.

All replicas on a system are the same, they are not associated with the specific volumes/mirrors or anything like that.

If you are going to replace disks, I would first delete the state database replicas from that disk (metadb -d /dev/dsk/cXtXdXsX) so that you don't end up with less than a majority available on boot.
# 3  
Old 08-26-2006
Quote:
as long as a system has a majority of its total number of replicas available, it will boot. It doesn't matter where these are, how many are on any one particular disk, etc., but obviously you want to spread these evenly around to make the configuration as resilient to disk failure as possible.
Exactly. I was more concerned with reconciling how I thought it was supposed to work with how it was actually working. Seeing a single replica in total not only raised a flag, it had the guy waving the flag jumping up and down and screaming.

I have one system with a single replica database. Per the man page:

Quote:
If there is only one replica and the system crashes, it is possible that all metadevice configuration data can be lost.
As long as the replica is stable, the system will continue to function; if that one replica bails, the system won't boot. If there are two replicas and one goes south, we're left with exactly half rather than a majority, so the system will continue to run, but it won't boot unless we delete the bad replica (assuming we can determine it's bad before rebooting).
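If a box does end up below boot quorum, my understanding of the documented way out is: boot to single-user, force-delete the dead replicas so the survivors hold a majority, then reboot. A dry-run sketch with a hypothetical slice:

```shell
# Recovery sketch for a host that won't boot for lack of replica
# quorum; run by hand from the single-user console. Echoed as a
# dry run here because the slice name is made up.
recovery=$(
  echo "metadb -i"             # identify the errored replica(s)
  echo "metadb -d -f c1t1d0s7" # force-delete the dead ones
  echo "reboot"                # survivors now form a majority
)
printf '%s\n' "$recovery"
```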

The other problem, per the man page:

Quote:
The majority consensus algorithm accounts for the following: the system will stay running with exactly half or more replicas; the system will panic when less than half the replicas are available; the system will not reboot without one more than half the total replicas.
It appears that if the system has only two replicas, crashes, and there's an inconsistency, it will not boot (booting requires one more than half).

So if the system has three replicas and two of them become unstable, the system won't boot. Based on the section I quoted, three is the minimum if there's any inconsistency, and more would be better, in odd numbers. Five sounds like the best count without getting silly, and with two disks, five replicas on each disk would be best as well. They only take about 512 KB per replica, and the minimum partition size (on a 73 GB drive) is 10 MB.
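The arithmetic behind those rules is easy to check: with TOTAL replicas and AVAIL healthy ones, "exactly half or more" keeps the box running and "one more than half" is needed to boot. A quick sketch:

```shell
# Quorum rules from the metadb man page as shell tests:
#   stays up when 2*AVAIL >= TOTAL  (exactly half or more)
#   boots    when 2*AVAIL >  TOTAL  (one more than half)
stays_up() { [ $((2 * $2)) -ge "$1" ] && echo yes || echo no; }
boots()    { [ $((2 * $2)) -gt "$1" ] && echo yes || echo no; }

echo "2 total, 1 left: up=$(stays_up 2 1) boot=$(boots 2 1)"  # up=yes boot=no
echo "3 total, 2 left: up=$(stays_up 3 2) boot=$(boots 3 2)"  # up=yes boot=yes
echo "3 total, 1 left: up=$(stays_up 3 1) boot=$(boots 3 1)"  # up=no  boot=no
```

Which matches what I'm seeing: the two-replica boxes run fine, but any inconsistency would leave them unbootable until a replica is deleted by hand.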

Based on the man pages, I'd call three the minimum and five preferred.

man page from Sun

To answer my original questions: a system will continue to function with one or two replicas; however, it's set up to fail as soon as there's an inconsistency.

Quote:
All replicas on a system are the same, they are not associated with the specific volumes/mirrors or anything like that.
Yep, I knew that. It was the number that bothered me most, plus trying to distinguish "required" from "suggested" and "highly suggested" in the man pages.

Quote:
If you are going to replace disks, I would first delete the state database replicas from that disk (metadb -d /dev/dsk/cXtXdXsX) so that you don't end up with less than a majority available on boot.
Technically, if a disk is dead, you can't remove the replicas from that disk (it won't respond). Each replica also knows where the other replicas are, so when you run metadb, it's pulling the info from one of the available replicas. When you run metadb -f -d c1t1d0s7 (for example), you're removing the pointers from one of the replicas, and the change is then distributed to the others.

Lastly, to clear errored entries from the replica databases, you have to delete the bad entries. Because you can't add just a single one back, the only way to replace one is to delete all the replicas on that slice and then recreate them.
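In command form, the delete-and-recreate dance for one errored slice looks roughly like this (hypothetical slice name, echoed as a dry run):

```shell
# Clearing errored replicas: there's no in-place repair for a single
# bad replica, so wipe the slice's replicas and re-add a fresh set.
fix=$(
  echo "metadb -d -f c1t1d0s7"   # drop every replica on the slice
  echo "metadb -a -c 3 c1t1d0s7" # recreate three fresh ones
)
printf '%s\n' "$fix"
```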

Thanks Phil. I appreciate the answer.

Carl
