I lost my system volume in a power outage, but fortunately I had a dual boot and could boot into an older opensolaris version, and my 7-drive raidz2 pool was still fine. I even scrubbed it; no errors. However, the older OS has some smb problems, so I wanted to upgrade to opensolaris11. I accidentally chose the auto install option and it overwrote one of my data drives. This is really important data, but fortunately I have a raidz2, so how bad can it be?... BAD! Booting back into the earlier solaris version I cannot access my pool. This is what zpool status shows:
zpool status
pool: brick3
state: UNAVAIL
scrub: none requested
config:
I tried to export and re-import it, and I get the following message:
I wanted to update this thread in case anyone is stupid enough to have this same problem. In desperation I installed the new opensolaris11, on the right drive this time, and then I was able to import my pool in a degraded but functional state. Since this is a raidz2 pool I should be able to restore just fine. I don't know what opensolaris11 did to my pool, but then again I don't really care either if I can have my data back. Whew, that was scary. I vow to back up everything really important. zfs is great, but you can still lose everything in the blink of an eye.
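For anyone following along, the round trip looked roughly like this (pool name brick3 from the status output above; the -f on import is my assumption, since the pool was last touched by a different OS install):

```shell
# Export the pool under the old OS, then boot the new install and
# force the import; -f is typically needed when the pool was last
# in use by another system.
zpool export brick3
zpool import -f brick3

# The pool should come up DEGRADED (one overwritten drive) but usable.
zpool status brick3
```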
I thought I was OK, but zpool scrub hangs forever at 20% across multiple cold boots, and importing from the oracle solaris 11 live environment hangs forever... What do I do now? Older versions of opensolaris will not import the pool anymore...
---------- Post updated at 07:36 AM ---------- Previous update was at 12:48 AM ----------
Well it's 4AM and now I am getting mad. I think this whole bloody mess is caused by Oracle's 'great new solaris 11' package. My advice, don't touch this piece.
To recap, I wanted to try the best newest Solaris release for my super critical file server, so I downloaded from Oracle what I thought was the right iso. I booted, and just as the boot menu came up I got a phone call. When I came back the thing had automatically installed a new version of solaris over one of the drives in the zpool. (BAD #1 ).
Since I have a raidz2 I wasn't too worried ( at first ) and I booted into my original 2009 opensolaris. However, I got the errors shown in the above posts. I exported and could not re-import my pool. Oracle had done something to that drive, breaking my entire pool even though it is a raidz2 and should be tolerant of 2 drive failures at the same time. ( BAD #2 ).
Since I could do nothing with my pool under my old OS, I tried on the new Oracle solaris, and could indeed import my pool in a degraded state because of the one overwritten drive. Fine, I wanted to scrub everything first ( I don't know if this was wise or not ), so I did zpool scrub, which eventually hung forever at 11%. All access to the drives was similarly hung. Rebooting the machine did not change this situation, which seemed increasingly dire ( BAD #3 ).
I finally got out of this problem by unplugging drives to fault the pool and rebooting in single user mode. Eventually I was able to stop the scrub with the "zpool scrub -s" command, in single user mode. And when I rebooted, I could access my pool again. My first priority at this point was to back up all data immediately. I began to copy off my most important stuff, but unfortunately, before I could copy off even a fraction, the file system hung again ( BAD #4 ).
Googling around I found most causes for hanging zpool commands are related to hardware failure, so going on a hunch, I figured Oracle phased out or screwed up the drivers for my disks. I still could not import my pool in the old opensolaris OS, because of whatever the Oracle install wrote on that drive. So I booted in Oracle solaris, in single user mode, and did "zpool offline <pool> <drive>" and it worked! Then I rebooted into good old opensolaris, and imported my pool. It worked!!!!
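In other words, the sequence that finally worked for me was roughly the following (c7t1d0 is a placeholder for whichever drive the installer overwrote; substitute your own device name):

```shell
# Booted to single-user mode under the Oracle release:
zpool scrub -s brick3          # cancel the hung scrub
zpool offline brick3 c7t1d0    # take the trashed drive out of the pool

# Then reboot into the old opensolaris and import; with the bad drive
# offline, the old OS accepts the pool again.
zpool import brick3
zpool status brick3            # DEGRADED, but the data is reachable
```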
So at this point it is now more like 5AM and I have backed up most of my critical data, way more than I could before anyway. It appears at this point that I was correct, and the drivers for either my motherboard or my hard drive controller card were broken by the Oracle release in a way that let it silently trash my zpool. I have a SIIG 2-drive SATA card and an MSI n1996 motherboard; not sure which the problem is with, but whichever it is, it works fine in opensolaris 2009 and previous versions.
I just want to warn people who are not real Solaris experts away from even trying this Oracle package. Personally I am migrating to fbsd as soon as I can...
Hi guys, I appreciate any help in this regard, we have lost sensitive data in the company.
One box with 2 mirrored disks and a 3ware controller handling 13 disks in a raidz2 pool. Suddenly the box restarted and sat at "Reading ZFS config" for hours.
Unplugging the disks one by one, we isolated the disk... (3 Replies)
Hey all!
I was hoping someone knew anything about this one...
I know with Solaris Volume Manager the default state database replica size is 8192 blocks (approximately 4 MB).
Now I know you can increase this amount, but is there any point?
The reason I am asking is that I've set up mirroring on... (2 Replies)
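For what it's worth, the replica length can be set at creation time with the -l option; the slice name below is just an example:

```shell
# -c 3 puts three replicas on the slice; -l sets the length in blocks
# (8192 blocks is already the default, shown here explicitly).
# -f is needed when creating the very first replicas on a system.
metadb -a -f -c 3 -l 8192 c0t0d0s7
metadb -i    # verify count, length, and status flags
```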
Hi all,
I recently started exploring Solaris 10.
I am testing metadevices now.
I have been reading about the state databases here: 6.State Database (Overview) (Solaris Volume Manager Administration Guide) - Sun Microsystems
So I created 3 metadbs on each of 2 slices (6 in total; c1t1d0s3... (3 Replies)
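A setup like the one described (three replicas on each of two slices, six in total) would be created along these lines; the slice names are examples:

```shell
# Three replicas per slice on two different disks gives six total,
# so the configuration can survive the loss of one disk.
metadb -a -f -c 3 c1t1d0s3
metadb -a -c 3 c1t2d0s3
metadb -i    # list all replicas with their status flags
```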
Hi,
When I execute one of my shell scripts I am getting the below mentioned error message. This application takes 2 input files, which have record counts of 26463 and 1178046.
exec(2): insufficient swap or memory available.
exec(2): insufficient swap or memory available.
exec(2): insufficient swap... (3 Replies)
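exec(2) failing with that message usually means the system could not reserve enough virtual memory (swap) for the new process. A quick way to check the headroom on Solaris, assuming shell access:

```shell
# List configured swap devices and their free space
swap -l
# Summary of allocated / reserved / used / available virtual memory
swap -s
```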
Hi,
when navigating application web pages (PeopleSoft applications), the users receive:
INSUFFICIENT AVAILABLE MEMORY.
I issued vmstat on the UNIX server (which hosts the web server, the application server, and the DB). Here is the result:
:vmstat 2 10
System configuration: lcpu=4 mem=30720MB... (8 Replies)
Good morning,
I have Solstice disk suite installed on my server.
One disk broke, so I substituted it.
A replica was present on this disk.
I deleted and then recreated it with the commands metadb -d and metadb -a.
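The delete/recreate cycle for replicas on a replaced disk looks roughly like this (the slice name is an example, and the replica count should match what was there before):

```shell
# Remove the stale replicas that lived on the failed disk...
metadb -d c1t2d0s7
# ...and recreate them on the replacement disk's slice.
metadb -a -c 2 c1t2d0s7
metadb -i    # check the flags column for errors
```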
Now when I inquire for the status of replicas I see this:
stp11# metadb -i
... (2 Replies)
First I would like to thank this forum for assisting me in setting up my 1st sunbox.
Could not have done it if it had not been for you guys and google :D
I have mirrored my box and have SUCCESSFULLY tested booting from both the rootdisk and the rootmirror.
I am now looking at configuring... (2 Replies)
We have an application running on Win2K, and this application FTPs files to HP-UX using ftpdc as the user ID. The files are created on HP-UX with the following permissions:
-rw-r----- 1 ftpdc users 968321 Apr 12 22:57 aaaa.txt
There is a job that runs on HP-UX trying to modify this file using the... (7 Replies)
Hello,
We are using Solstice Disk Suite on Solaris 2.7.
We want to add two striped volumes with six disks.
On each disk, we take a slice and create the stripe.
What I want to know:
Is it necessary to add two replicas on the same slice of the new disks, as was done before on the others... (1 Reply)