How to manually re-attach AIX LVs to a mirror?

Tags

aix, lv, lvm, mirror

01-14-2019

While trying to fix a stale LV problem I ran rmlvcopy <lv> 1 <primary disk>, which left the original OS disk without any LV copies other than the stale LV.
Both disks seem operational, but lsvg rootvg shows 1 stale PV.

The end goal is to re-attach the LVs to hdisk1, and then attempt a reboot off hdisk1 to sync things up again.

Code:

# lsvg -l rootvg
rootvg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
hd5                 boot       1       1       1    closed/syncd  N/A
hd6                 paging     4       4       1    open/syncd    N/A
hd8                 jfs2log    1       1       1    open/syncd    N/A
hd4                 jfs2       60      60      1    open/syncd    /
hd2                 jfs2       40      80      2    open/stale    /usr   <== not sure why the lv is in stale mode!
hd9var              jfs2       16      16      1    open/syncd    /var
hd3                 jfs2       20      20      1    open/syncd    /tmp
hd1                 jfs2       40      40      1    open/syncd    /home
hd10opt             jfs2       40      40      1    open/syncd    /opt
hd11admin           jfs2       1       1       1    open/syncd    /admin
livedump            jfs2       1       1       1    open/syncd    /var/adm/ras/livedump
lvol1               jfs2       60      60      1    open/syncd    /usr/sys/inst.images

Code:

# lslv -m hd2          <== stale LV
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0222 hdisk1            0509 hdisk0            
0002  0229 hdisk1            0510 hdisk0            
0003  0230 hdisk1            0511 hdisk0            
0004  0231 hdisk1            0512 hdisk0            
0005  0232 hdisk1            0513 hdisk0            

# lslv -m hd1
hd1:/home
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0585 hdisk0            
0002  0586 hdisk0            
0003  0587 hdisk0            
0004  0588 hdisk0            
0005  0589 hdisk0

mrmurdock

01-14-2019

Quote:

Originally Posted by mrmurdock

While trying to fix a stale LV problem I ran rmlvcopy <lv> 1 <primary disk>, which left the original OS disk without any LV copies other than the stale LV.
Both disks seem operational, but lsvg rootvg shows 1 stale PV.

The end goal is to re-attach the LVs to hdisk1, and then attempt a reboot off hdisk1 to sync things up again.

Let us first establish what "stale" means here. Bear with me if this is old news for you: when you have a mirrored LV (strictly speaking there are only mirrored LVs; a mirrored VG just means that all its LVs are mirrored) each LP (logical partition) is represented by two different PPs (physical partitions). An LV is considered stale if any of its LPs is not represented by two (or three, depending on the number of mirrors) up-to-date PPs.
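The LP-to-PP mapping described here can be checked mechanically from lslv -m. What follows is only a minimal sketch, assuming the column layout of the lslv -m listings posted earlier in this thread; the function name under_mirrored_lps is made up for illustration and is not an AIX command. It counts copies per LP and says nothing about whether an existing copy is stale.

Code:

```shell
# Reads `lslv -m <lv>` output on stdin and prints every LP that has fewer
# than $1 physical copies. Each data line is: LP  PP1 PV1 [PP2 PV2 [PP3 PV3]]
# (hypothetical helper, not part of AIX)
under_mirrored_lps() {
    awk -v want="$1" '$1 ~ /^[0-9]+$/ && (NF - 1) / 2 < want { print $1 }'
}

# Lines copied from the `lslv -m hd1` output posted above:
hd1_map='hd1:/home
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0585 hdisk0
0002  0586 hdisk0'

printf '%s\n' "$hd1_map" | under_mirrored_lps 2
```

On the AIX host itself this would presumably be run as lslv -m hd1 | under_mirrored_lps 2.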

While the mirroring is being recreated (that happens in the background) all the LVs that are not completely mirrored yet are marked "stale" too. Check for a process named syncvg in the process list. If it is there you just need to wait. You can also check the output of lsvg rootvg to see if the number of stale PPs decreases.
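A small sketch of how the stale LVs could be picked out of lsvg -l output, assuming the column layout shown in the first post. The function name stale_lvs is invented for this example; the sample lines are copied from the thread.

Code:

```shell
# Reads `lsvg -l <vg>` output on stdin and prints the name of every LV whose
# LV STATE column contains "stale". The first two lines (VG name and column
# header) are skipped. (hypothetical helper, not part of AIX)
stale_lvs() {
    awk 'NR > 2 && $6 ~ /stale/ { print $1 }'
}

# Lines copied from the `lsvg -l rootvg` output posted above:
lsvg_out='rootvg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
hd5                 boot       1       1       1    closed/syncd  N/A
hd2                 jfs2       40      80      2    open/stale    /usr
hd1                 jfs2       40      40      1    open/syncd    /home'

printf '%s\n' "$lsvg_out" | stale_lvs
```

On the live system the input would come from lsvg -l rootvg, and ps -ef | grep '[s]yncvg' shows whether a resynchronisation is still running.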

Furthermore, your OS disk not only contains VG information but is also set up to be booted from. Whenever you alter (the disks of) your rootvg you need to reestablish the boot code by using the bosboot command - this puts the boot code onto the disk and thus makes it bootable. You may also need to alter the bootlist by (re-)creating it with the bootlist command. I just wanted to say this up front because it is easily forgotten.
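The two steps can be wrapped so that they are never forgotten. This is a sketch only: the run and refresh_boot helpers and the DRY_RUN switch are made up for illustration, and by default the script merely prints the commands instead of executing them. The disk names are the ones from this thread.

Code:

```shell
# DRY_RUN=1 (the default here) only prints the commands.
# Set DRY_RUN=0 on the AIX host to really execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Rewrite the boot image on every disk given and put the disks into the
# normal-mode bootlist in the order given. (hypothetical helper)
refresh_boot() {
    for d in "$@"; do
        run bosboot -ad "/dev/$d"
    done
    run bootlist -m normal "$@"
    run bootlist -m normal -o      # display the resulting bootlist
}

refresh_boot hdisk0
```

Printing first and executing second is a deliberate choice here: on a rootvg it is cheap to read the commands once more before they touch the boot disk.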

Back to remedying your situation: the first thing you should do is to make absolutely sure you have a valid, working and installable backup, preferably in the form of an mksysb image, ideally on your NIM server. However far from ideal your current situation is: take the time to create such an image before you try anything else. Whenever you do non-trivial tasks to your rootvg you run a non-zero chance of ending up with a non-working system. With an image you can at least get back to where you were. If you know your trade you can run mkszfile before running mksysb and then edit the resulting file to create a non-mirrored backup image. Normally the image will be restored the same way the system was installed when the image was taken, with all the mirrors, etc. in place. It may be preferable to have the image taken in an unmirrored fashion so that it restores without a mirror onto one single disk, and only then do a mirrorvg manually. Again: don't forget bosboot and bootlist afterwards.

Which brings us to your disk: you have probably isolated the culprit to the one LV you still have on it after you removed all the other ones. Do an unmirrorvg to empty the disk completely and a reducevg to get it out of the VG. Your VG should now be in an unmirrored but otherwise healthy state. If you want you could now test the disk extensively and eventually reuse it, but i wouldn't. What you save on the cost of a single disk is simply not worth the effort it takes to reinstall a system that crashed because of a failing disk, not to mention the cost of the downtime of the service provided by the system itself. Get a new disk, put it in, do an extendvg and finally a mirrorvg. After you have issued the mirrorvg command it takes some time until the mirrors are resynchronized. Until then the LVs are still shown as "stale".
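Written out as a command sequence, the procedure looks roughly like the plan below. This is a sketch that only prints the commands; plan_disk_swap is a made-up helper, hdiskN stands for whatever name the new disk gets, and the bosboot and bootlist lines are added from the reminder given earlier in this post.

Code:

```shell
# Print the command sequence for swapping one mirror disk of a VG.
# $1 = VG, $2 = disk to remove, $3 = new disk, $4 = disk that stays.
# Nothing is executed. (hypothetical helper, not part of AIX)
plan_disk_swap() {
    vg=$1; old=$2; new=$3; keep=$4
    cat <<EOF
unmirrorvg $vg $old
reducevg $vg $old
# (replace the physical disk here)
extendvg $vg $new
mirrorvg $vg $new
bosboot -ad /dev/$new
bootlist -m normal $keep $new
EOF
}

plan_disk_swap rootvg hdisk1 hdiskN hdisk0
```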

To speed things up (provided you have enough RAM, because this takes some of it) you can do what i usually do:

Code:

mirrorvg -s rootvg hdiskN     # mirrors but does NOT start to synchronise
syncvg -P 32 -v rootvg &      # synchronises in background, 32 LPs in parallel

Notice that 32 is the maximum. Use less if you do not have enough RAM: the amount needed is roughly the PP size times the number of parallel LPs. You can also set the number of parallel tasks in advance by putting the following line into /etc/environment:

Code:

NUM_PARALLEL_LPS=NN

This will also affect HACMP/PowerHA commands, unlike the same setting in root's profile, which is ignored. Also notice that activatevg and varyonvg will (re-)start the synchronisation process too if the VG has stale partitions.
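To pick a value for NUM_PARALLEL_LPS, the rule of thumb above (RAM needed = PP size times the number of parallel LPs) can be turned into a small calculation. This is a sketch under that assumption only; max_parallel_lps is a made-up helper and the 2048 MB budget is an arbitrary example figure. The 256 MB PP size is the one reported by lspv for the disks in this thread.

Code:

```shell
# Largest number of parallel LPs (between 1 and 32) whose buffers fit into
# a RAM budget, assuming RAM needed = PP size * number of parallel LPs.
# $1 = RAM budget in MB, $2 = PP size in MB. (hypothetical helper)
max_parallel_lps() {
    budget_mb=$1; pp_mb=$2
    n=$((budget_mb / pp_mb))
    if [ "$n" -gt 32 ]; then n=32; fi   # 32 is the documented maximum
    if [ "$n" -lt 1 ]; then n=1; fi     # at least one LP at a time
    echo "$n"
}

# Example: 2048 MB to spare, 256 MB PPs
echo "NUM_PARALLEL_LPS=$(max_parallel_lps 2048 256)"
```

The printed line has the form expected in /etc/environment; the same number can be passed to syncvg -P instead.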

I hope this helps.

bakunin

01-14-2019

Thank you.
I may not be in as bad a shape as I think I am. lslv -l hd2 shows hdisk1 with 0% in the IN BAND column, which from the man pages sounds like the OS is not writing to the LV anymore.

Code:

# lslv -l hd2
hd2:/usr
PV                COPIES        IN BAND       DISTRIBUTION  
hdisk1            040:000:000   0%            001:039:000:000:000 
hdisk0            040:000:000   100%          000:000:040:000:000

Code:

# synclvodm rootvg

returns with no errors.
It almost seems like I could just pull hdisk1 out and be OK at this point. It's a gut-wrenching decision (I probably won't do it though). I have re-run bosboot -ad /dev/hdisk0 and made sure my bootlist lists hdisk0 first. If there were OS trouble, I would expect the system to be choking and dying by now on any filesystem access, OS command, or access to hdisk0; however, it is still running fine (running DB2 and Informix development DBs).

mrmurdock

01-14-2019

Quote:

Originally Posted by mrmurdock

I may not be in as bad a shape as I think I am. lslv -l hd2 shows hdisk1 with 0% in the IN BAND column, which from the man pages sounds like the OS is not writing to the LV anymore.

Code:

# lslv -l hd2
hd2:/usr
PV                COPIES        IN BAND       DISTRIBUTION  
hdisk1            040:000:000   0%            001:039:000:000:000 
hdisk0            040:000:000   100%          000:000:040:000:000

Sorry, but: no. "In band" means something completely different and has nothing to do with your problem. When you create LVs they are placed on the disk one after the other, packed tightly so that there is no room between them for extension. Like this, where a, b, c... are the PPs of various LVs and X are free PPs:

Code:

aaaaabbccccccXXXXXXXX.....

Now, when you extend or shrink LVs, over time you end up in a situation where this strict succession is broken up, like this:

Code:

aaaccbbccccccaacbaXXXX.....

The initial situation is what is meant by "in band 100%": all the LVs are physically placed in one piece and the PPs are in the order of ascending LPs. Once your disk becomes more and more disorganised you can rectify this with the reorgvg command, which moves the PPs around until they are in order again. More precisely, IN BAND is the percentage of the LV's PPs on that PV which lie in the disk region requested by the LV's intra-physical allocation policy. In your case the "in band 100%" comes from all PPs assigned to hd2 being placed on the "center" part of hdisk0, whereas on hdisk1 39 of the 40 are placed on "outer middle" and one on "outer edge". Therefore the "in band" indicator shows 0% there. But again, this has nothing to do with your problem.
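The DISTRIBUTION column can be decoded into those region names. This is a small sketch, assuming the five colon-separated counters are outer edge, outer middle, center, inner middle and inner edge in that order, and that the layout matches the lslv -l output quoted above; lv_regions is a made-up name.

Code:

```shell
# Reads `lslv -l <lv>` output on stdin and prints, per PV, how many PPs of
# the LV sit in each disk region. Regions with a count of 0 are omitted.
# (hypothetical helper, not part of AIX)
lv_regions() {
    awk 'NR > 2 && NF >= 4 {
        split("outer_edge outer_middle center inner_middle inner_edge", name, " ")
        n = split($4, count, ":")
        line = $1 ":"
        for (i = 1; i <= n; i++)
            if (count[i] + 0 > 0)
                line = line " " name[i] "=" (count[i] + 0)
        print line
    }'
}

# Lines copied from the `lslv -l hd2` output quoted above:
lslv_out='hd2:/usr
PV                COPIES        IN BAND       DISTRIBUTION
hdisk1            040:000:000   0%            001:039:000:000:000
hdisk0            040:000:000   100%          000:000:040:000:000'

printf '%s\n' "$lslv_out" | lv_regions
```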

Quote:

Originally Posted by mrmurdock

Code:

# synclvodm rootvg

returns with no errors.

This just means that the information in the ODM about the composition of the rootvg is accurate. This is a good thing but still does not help your problem.

Quote:

Originally Posted by mrmurdock

It almost seems like i could just pull hdisk1 out and be ok at this point.

DON'T!!

As i said before, the information about the VG is stored in the ODM, and if you simply remove the disk (without using the reducevg procedure i explained above) you end up with this information NOT being accurate any more. Prepare to repair the ODM manually in a rather tedious fashion afterwards if you do that. (Don't think you could put in another disk to make up for it: disks are identified by a unique "PVID" when they become part of a VG, so the system knows that this disk is not that disk.) Before you pull out the disk, remove it cleanly from the ODM, which is done by using the commands i explained above.

I hope this helps.

bakunin

01-15-2019

This morning (or maybe after a night's rest) lspv revealed the issue. lspv hdisk1 now shows PV STATE: missing, although plain lspv shows all the disks online. None of the AIX LVM commands are working on the disk (reducevg complains about the open hd2 LV, which is /usr, even if I use -f to force it). syncvg is not running in the background.
This is AIX 6.1 TL7 SP 10, 1415 build date. I have had to run odmget and odmdelete before on other boxes, so a little bit of tedious cleanup isn't all that bad. Unfortunately this is in a remote DC, so I have to rely on another pair of hands to pull the disk.

Code:

PHYSICAL VOLUME:    hdisk1                   VOLUME GROUP:     rootvg
PV IDENTIFIER:      00f649e07720beb9 VG IDENTIFIER     00f7382000004c000000015706543652
PV STATE:           missing                                    
STALE PARTITIONS:   38                       ALLOCATABLE:      yes
PP SIZE:            256 megabyte(s)          LOGICAL VOLUMES:  1
TOTAL PPs:          1117 (285952 megabytes)  VG DESCRIPTORS:   2
FREE PPs:           1077 (275712 megabytes)  HOT SPARE:        no
USED PPs:           40 (10240 megabytes)     MAX REQUEST:      1 megabyte
FREE DISTRIBUTION:  223..184..223..223..224                    
USED DISTRIBUTION:  01..39..00..00..00                         
MIRROR POOL:        None

And errpt (finally) shows:
Description
PV NO LONGER RELOCATING NEW BAD BLOCKS

Probable Causes
NON-MEDIA ERROR DURING SW RELOCATION

Failure Causes
DISK DRIVE
DISK DRIVE ELECTRONICS
STORAGE DEVICE CABLE

mrmurdock

01-15-2019

Quote:

Originally Posted by mrmurdock

(reducevg complains about the open hd2 lv, which is /usr, even if I use -f to force it).

That was to be expected. I repeat:

Quote:

Do an unmirrorvg to completely make the disk empty and a reducevg to get it out of the VG.

You cannot use reducevg on a disk which has not been emptied before. Since you still have an LV occupying space on the PV (even if it is only a mirror copy) you cannot remove the disk from the VG. You either have to remove the mirror copy on this disk first or move it to another PV.
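Whether a PV is empty can be read from the USED PPs field of lspv before reducevg is even attempted. This is a rough sketch, assuming the field layout of the lspv hdisk1 output posted above; pv_used_pps and reducevg_possible are made-up helper names.

Code:

```shell
# Reads `lspv <hdisk>` output on stdin and prints the USED PPs count.
# (hypothetical helper, not part of AIX)
pv_used_pps() {
    awk '/^USED PPs:/ { print $3 }'
}

# Succeeds only if the PV described on stdin holds no PPs at all.
reducevg_possible() {
    used=$(pv_used_pps)
    [ "$used" = 0 ]
}

# Lines copied from the `lspv hdisk1` output posted above:
lspv_out='PV STATE:           missing
USED PPs:           40 (10240 megabytes)     MAX REQUEST:      1 megabyte'

if printf '%s\n' "$lspv_out" | reducevg_possible; then
    echo "hdisk1 is empty, reducevg can go ahead"
else
    echo "hdisk1 still holds PPs: remove the mirror copy or migrate it first"
fi
```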

If this is not due to a broken cable or controller (that would explain the "missing" status) the error message suggests that the disk was in its last throes anyway: when a disk is formatted (when included in a VG) a certain number of blocks is set aside to compensate for blocks going bad. They are used up over time. Once they are depleted (or nearly depleted) you usually see a series of TEMP hdisk errors (IIRC "hdisk error 3", usually stretched out over some days or weeks) before finally a PERM (IIRC "hdisk error 4") one in the errpt.

I hope this helps.

bakunin

01-15-2019

So migratepv -l hd2 hdisk1 hdisk6 (yes, I found a spare unused disk that is allocated, but I can delete the LV and VG on it). My only concern would be: since it cannot read the bad block to finish the mirror, is migratepv smart enough to move the stuck LV? I guess if migratepv can't, it will just error out.

mrmurdock
