Solaris The Solaris Operating System, usually known simply as Solaris, is a Unix-based operating system introduced by Sun Microsystems. The Solaris OS is now owned by Oracle.

Bad disk, how to replace ?

    #1  
1 Week Ago, posted by solaris_1977 (Registered User)
Join Date: Mar 2011
Last Activity: 22 June 2018, 11:18 AM EDT
Posts: 464
Thanks: 57
Thanked 4 Times in 4 Posts
Bad disk, how to replace ?

Hello,

I see hard and transport errors on all disks in the treso pool, and it looks like there is some data corruption too. I want to take a backup before I reboot and replace the disk. Right now there are no free slots on the server, so one option is to break the mirror and remove a second disk (I need two disks because the data is 400 GB). I have two spare disks, which I would insert into those slots, mount, and copy the data to.
Can somebody help me understand whether the setup below shows that I can detach disks without disturbing the data and the mounts?


Code:
pool: treso
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 0h42m with 0 errors on Thu Mar 24 12:11:13 2016
config:

        NAME        STATE     READ WRITE CKSUM
        zones2      DEGRADED    17     0     0
          raidz1    DEGRADED    17     0     0
            c1t4d0  ONLINE       0     0     0
            c1t5d0  DEGRADED    35     0     0  too many errors
            c1t6d0  ONLINE       0     0     0
            c1t8d0  FAULTED      2     0     0  too many errors

errors: 4 data errors, use '-v' for a list
#

Thanks
    #2  
1 Week Ago, posted by Peasant (Forum Advisor, Registered User)
Join Date: Mar 2011
Last Activity: 23 June 2018, 5:58 AM EDT
Posts: 1,189
Thanks: 32
Thanked 363 Times in 313 Posts
In the current configuration, there is little you can do.
The reason is that your configuration (RAIDZ1) tolerates one disk failure, which has already happened.

The other disk has almost failed, but the pool is still accessible.
When the degraded disk fails (which should happen soon enough), you will lose all the data in the zpool.

The course of action should be:
  1. Take a backup using zfs send / receive, or copy the data.
  2. zpool offline the FAULTED disk from the pool.
  3. Unconfigure the offlined disk using cfgadm.
  4. Insert a new working drive in the same slot, and configure it using cfgadm.
  5. Issue a zpool online / replace against the replaced disk.
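
A rough dry-run sketch of the steps above, for reference only. The pool and disk names are taken from the status output in post #1. The snapshot name, the `backup` target pool and the cfgadm attachment point `c1::dsk/c1t8d0` are assumptions made for illustration (check `cfgadm -al` for the real Ap_Id on your server). By default it only prints what it would run.

```shell
#!/bin/sh
# Dry-run sketch of the five steps above; prints commands instead of running them.
# POOL and DISK come from the status output in post #1.
# SNAP, BACKUP_POOL and AP_ID are hypothetical placeholders.
POOL=zones2
DISK=c1t8d0
SNAP=before-replace
BACKUP_POOL=backup
AP_ID=c1::dsk/c1t8d0
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: recursive snapshot, then replicate it to another pool.
backup_pool() {
    run zfs snapshot -r "$POOL@$SNAP" || return 1
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ zfs send -R $POOL@$SNAP | zfs receive -F $BACKUP_POOL/$POOL"
    else
        zfs send -R "$POOL@$SNAP" | zfs receive -F "$BACKUP_POOL/$POOL"
    fi
}

# Steps 2 to 5: offline, unconfigure, swap the drive, configure, replace.
replace_disk() {
    run zpool offline "$POOL" "$DISK" || return 1
    run cfgadm -c unconfigure "$AP_ID" || return 1
    if [ "$DRY_RUN" = "1" ]; then
        echo "# swap the physical drive in the same slot now"
    else
        printf 'Swap the drive in the same slot, then press Enter: '
        read _answer
    fi
    run cfgadm -c configure "$AP_ID" || return 1
    run zpool replace "$POOL" "$DISK"
}

backup_pool
replace_disk
```

Review the printed commands first; setting DRY_RUN=0 would execute them for real, so only do that once the names match your system and the backup has been verified.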

https://docs.oracle.com/cd/E19253-01...cet/index.html

Regards
Peasant.

Last edited by rbatte1; 1 Week Ago at 06:14 AM.. Reason: Formatted numbered list with LIST=1 tags
    #3  
1 Week Ago, posted by solaris_1977
I took the backup, destroyed the pool, replaced the disks and created a new pool, zones3.
Now, instead of using raidz1, I just want to mirror zones3. With the configuration below, if one disk fails, data will be lost. I have two new disks: c1t4d0 and c1t6d0.


Code:
# zpool status zones3
  pool: zones3
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zones3      ONLINE       0     0     0
          c1t9d0    ONLINE       0     0     0
          c1t10d0   ONLINE       0     0     0

errors: No known data errors
#

Is this the correct command to run?


Code:
zpool zones3 mirror c1t4d0 c1t6d0

    #4  
1 Week Ago, posted by Peasant
Take the following example, where I'm using files, but it works the same with real devices.
This layout will tolerate one to two device failures (two only if they are in different mirrors).

If two devices fail in the same top-level vdev (mirror-N), you will lose data.

I would strongly suggest using an odd number of disks and keeping one hot spare in the pool.
In your configuration, get one more disk if you really love your data.



Code:
[root@gimmick ~]# ls -dl /zones/test/disk*
-rw------T   1 root     root     104857600 Jun 16 02:48 /zones/test/disk0
-rw------T   1 root     root     104857600 Jun 16 02:48 /zones/test/disk1
-rw------T   1 root     root     104857600 Jun 16 02:48 /zones/test/disk2
-rw------T   1 root     root     104857600 Jun 16 02:48 /zones/test/disk3
[root@gimmick ~]# 

[root@gimmick ~]# zpool status testpool

  pool: testpool
 state: ONLINE
  scan: none requested
config:

	NAME                 STATE     READ WRITE CKSUM
	testpool             ONLINE       0     0     0
	  /zones/test/disk1  ONLINE       0     0     0
	  /zones/test/disk0  ONLINE       0     0     0

errors: No known data errors
[root@gimmick ~]# zpool attach testpool /zones/test/disk0 /zones/test/disk2
[root@gimmick ~]# zpool attach testpool /zones/test/disk1 /zones/test/disk3
[root@gimmick ~]# zpool status testpool
  pool: testpool
 state: ONLINE
  scan: resilvered 49K in 0h0m with 0 errors on Sat Jun 16 02:48:41 2018
config:

	NAME                   STATE     READ WRITE CKSUM
	testpool               ONLINE       0     0     0
	  mirror-0             ONLINE       0     0     0
	    /zones/test/disk1  ONLINE       0     0     0
	    /zones/test/disk3  ONLINE       0     0     0
	  mirror-1             ONLINE       0     0     0
	    /zones/test/disk0  ONLINE       0     0     0
	    /zones/test/disk2  ONLINE       0     0     0

errors: No known data errors

[root@gimmick  ~]#
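
For anyone who wants to repeat the test above, here is a sketch of how such a file-backed pool could be built. The post does not show these steps, so the `mkfile` and `zpool create` lines are an assumption based on the 100 MB file sizes and the first status output. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch: build a throwaway file-backed pool and turn its two-device stripe
# into two mirrors. The mkfile and zpool create steps are assumed, not from the post.
DIR=/zones/test
POOL=testpool
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_demo() {
    # Four 100 MB backing files, matching the ls output above.
    for n in 0 1 2 3; do
        run mkfile 100m "$DIR/disk$n" || return 1
    done
    # Two-device stripe, same device order as the first status output.
    run zpool create "$POOL" "$DIR/disk1" "$DIR/disk0" || return 1
    # Attaching a new device to each existing one turns each into a two-way mirror.
    run zpool attach "$POOL" "$DIR/disk0" "$DIR/disk2" || return 1
    run zpool attach "$POOL" "$DIR/disk1" "$DIR/disk3" || return 1
    run zpool status "$POOL"
}

build_demo
```

A hot spare, as suggested above, would be added with `zpool add <pool> spare <device>`. The test pool can be removed afterwards with `zpool destroy testpool`.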

Hope that helps
Regards
Peasant.
    #5  
1 Week Ago, posted by solaris_1977
Going through your example, can I run the commands below online, without interruption?
Code:
zpool attach zones3 c1t9d0 c1t4d0
zpool attach zones3 c1t10d0 c1t6d0
    #6  
1 Week Ago, posted by Peasant
Yes.

The only thing you should notice is increased read/write activity until the resilvering is done.
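
One way to watch for the end of the resilver is to poll `zpool status`. The sketch below assumes the status text contains the phrase `resilver in progress` while it runs (the exact wording may differ between ZFS versions) and uses the pool name zones3 from post #3. The 60-second interval and the `WAIT` switch are arbitrary choices for illustration.

```shell
#!/bin/sh
# Sketch: wait until `zpool status` no longer reports a running resilver.
POOL=${POOL:-zones3}
INTERVAL=${INTERVAL:-60}

# Reads `zpool status` text on stdin; succeeds while a resilver is running.
# Assumes the phrase "resilver in progress" appears in the scrub/scan line.
resilver_running() {
    grep -q 'resilver in progress'
}

# Poll the pool until the resilver is no longer reported, then show the final status.
wait_for_resilver() {
    while zpool status "$POOL" | resilver_running; do
        sleep "$INTERVAL"
    done
    zpool status "$POOL"
}

# Only start polling when explicitly asked to, by setting WAIT=1.
if [ "${WAIT:-0}" = "1" ]; then
    wait_for_resilver
fi
```

Run it with WAIT=1 after the `zpool attach` commands; it returns once the pool no longer reports a resilver in progress, and the final status output should then be checked for errors.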

Regards
Peasant.