metadevices: how to test metadb (how to corrupt replicas/understanding replicas)


 
# 1  
Old 07-28-2010

Hi all,

I recently started exploring Solaris 10.

I am testing metadevices now.
I have been reading about the state databases in Chapter 6, "State Database (Overview)", of the Solaris Volume Manager Administration Guide (Sun Microsystems).

So I created 3 replicas on each of 2 slices (6 in total: c1t1d0s3 and c1t1d0s5).
This is the setup:
c1t1d0s3 -> 3 replicas
c1t1d0s4 -> d1
c1t1d0s5 -> 3 replicas
c1t1d0s6 -> d2
d10 -> mirror of d1 and d2

After that I mounted d10 as /export/home and added it to /etc/vfstab.
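For reference, this is roughly the sequence of commands behind the setup above (a sketch only; the device names are from my layout and the exact options may differ on your system):

metadb -a -f -c 3 c1t1d0s3 c1t1d0s5    # 3 state database replicas on each slice (-f for the initial set)
metainit d1 1 1 c1t1d0s4               # first submirror (1 stripe, 1 component)
metainit d2 1 1 c1t1d0s6               # second submirror
metainit d10 -m d1                     # one-way mirror on d1
metattach d10 d2                       # attach d2 as the second submirror
newfs /dev/md/rdsk/d10                 # create the UFS filesystem
mount /dev/md/dsk/d10 /export/home     # mount it (plus the matching line in /etc/vfstab)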

Now I want to test what happens when a state database gets corrupted, so I ran dd if=/dev/zero of=/dev/rdsk/c1t1d0s3.

First of all: is this a good way to test this?
When I type metadb -i, all state databases seem OK. Does this command represent the current state of the metadbs? I would think not.
Is there a way to see the current state? Are the state databases only read when mounting a filesystem? If they are only read at mount time, what is the difference between losing half of the replicas and losing half+1, since either way the filesystem can no longer be mounted?
/export/home is still mounted and I can still create files on it. Whenever I unmount it, I cannot remount the partition. This is normal, right?
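For what it is worth, these are the commands I use to look at the state (standard SVM status commands; nothing exotic):

metadb -i        # list the replicas and their status flags
metastat d10     # state of the mirror and its submirrors
metastat -p      # compact, md.tab-style summary of the whole configuration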

When I reboot, the system does not go to multiuser mode, which is normal according to the aforementioned guide. At that point the replicas are shown as corrupt.
After deleting the replicas on c1t1d0s3 I still can't mount the metadevice; after a reboot it does work. Is this normal? (I get messages about stale replicas.)
Stale database: does this mean it is "corrupt" / "invalid" /...?

The second setup I tried:
c1t1d0s3 -> 4 replicas
c1t1d0s4 -> d1
c1t1d0s5 -> 3 replicas
c1t1d0s6 -> d2
d10 -> mirror of d1 and d2

Note the 4 replicas on c1t1d0s3.

So now when I corrupt the 4 replicas on c1t1d0s3, the system should panic, since that is half+1 of the replicas.
After running the dd command the machine does not panic, even though that is what the aforementioned guide says should happen. I am confused here.
What am I doing wrong?

To recover from invalid/corrupt/... state replicas: should I just remove the old state replicas, create new ones, and restart the machine?
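(For my own notes, the recovery sequence I have in mind looks roughly like this, assuming the bad replicas are the ones on c1t1d0s3; a sketch, not a tested procedure:)

metadb -i                  # identify the replicas flagged with errors
metadb -d c1t1d0s3         # delete the errored replicas (-f is needed only for the last remaining ones)
metadb -a -c 3 c1t1d0s3    # recreate 3 replicas on the repaired slice
init 6                     # reboot so the new replica set is read cleanly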

From what I have read earlier, I should create 3 replicas on each drive if I have 2 drives. I wonder why 3 and not 2. Whether I use 3 or 2 replicas doesn't seem to matter, since in both situations, when I lose half of the replicas, I will not be able to boot/mount. Or is it because, after deleting the invalid/corrupt replicas, you would be left with 2 while there should be a minimum of 3?
EDIT: "For a system with two to four drives: put two replicas on each drive." So seems I have misread.
However, I wonder what the difference is, as you still need to reboot after creating new replicas on the drive that failed earlier; otherwise you get a message about stale replicas.

I hope someone can give me some answers!

Kind regards

Last edited by deadeyes; 07-29-2010 at 05:07 AM..
# 2  
Old 07-29-2010
Your state database shows no issue because the command below has not actually executed:

dd if=/dev/zero of=/dev/md/rdsk/c1t1d0s3

In the path above, md is only used for the virtual (metadevice) filesystem, and the command above should have returned an error message. Second point: for the volume manager to run, the minimum number of functioning state databases needed is more than 50% + 1. In your case, if all of the metadb replicas are on one partition and you corrupt the metadb, your entire filesystem is going to get corrupted.

You can use metadb -f for deletion and addition of state replicas.
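A quick sketch of where -f matters (slice name taken from the original post; just illustrative):

metadb -a -f -c 3 c1t1d0s3    # -f is required when creating the very first replicas on a system
metadb -d -f c1t1d0s3         # -f is required when deleting the last remaining replicas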
# 3  
Old 07-29-2010
Quote:
Originally Posted by kumarmani
Your state database shows no issue because the command below has not actually executed:

dd if=/dev/zero of=/dev/md/rdsk/c1t1d0s3
Sorry, that was a typo. I used /dev/rdsk/c1t1d0s3, and that returned no error.

Quote:
Originally Posted by kumarmani
Second point: for the volume manager to run, the minimum number of functioning state databases needed is more than 50% + 1. In your case, if all of the metadb replicas are on one partition and you corrupt the metadb, your entire filesystem is going to get corrupted.
Do note that the metadbs are on 2 different slices (I wanted to "simulate" having 2 hard drives). So in the first case only 50% of the replicas get damaged, and in the second case half+1 get damaged.
Also, the docs say that 50% of the replicas is enough to keep the system running, while half+1 is necessary to get it booting again.
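To make that concrete with the numbers from my first setup (my own arithmetic, based on how I read the guide):

6 replicas in total, so half = 3.
Running: at least half must stay good, so losing the 3 replicas on c1t1d0s3 leaves exactly half and the system keeps running.
Booting to multiuser: at least half + 1 = 4 good replicas are needed, so after losing a whole slice of 3 the system will not come up cleanly until the stale replicas are deleted.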

Quote:
Originally Posted by kumarmani
you can use metadb -f for deletion and addition of state replica.
The tools themselves are not the problem.
I want to test how the number of available replicas affects the system.
But after destroying half+1 of the replicas the system did not panic as the docs say it should. It almost seems as if it doesn't matter whether 5 replicas are available or only 1, because the only problems I have seen happen when I reboot the system or unmount the filesystem (see the example with 7 replicas where I overwrote 4 of them).

Thanks for your response!
# 4  
Old 07-29-2010
I do not have a system where I can test this at the moment, but I am sure there are gurus here who can help!