[Howto] Update AIX in HACMP cluster-nodes | Unix Linux Forums | AIX

    #1  
03-09-2013
bakunin (Forum Staff)

As I have updated a lot of HACMP nodes lately, the question arose how to do it with minimal downtime. Of course it is easy to schedule a downtime and do the version update during it. In the best of all worlds you always get the downtime you need; unfortunately, we have yet to find this best of all worlds.

The following procedure is proven to work with AIX 5.3, 6.x and 7.x and the associated HACMP/PowerHA versions. It needs only one takeover, so the downtime ranges from under a minute to a few minutes, depending on the nature of your resource group(s).

Communication in HACMP happens via RSCT, and for a cluster to work the versions of the RSCT packages have to be in sync. Fortunately it is easy to update RSCT independently of the rest of the OS. This is what this procedure depends on. We will consider a dual-node cluster with an active and a standby system (rotating cluster), but the procedure can easily be adapted to other cluster architectures.
  • Stop the cluster manager on the standby node. This ends the cluster communication; the remaining node will be on its own.

  • Update the RSCT packages on both nodes. It does not matter that the communication path over the RSCT daemons is disrupted, because there is nobody to communicate with anyway.

  • Optional step: If you are of the well and truly paranoid type (like me), you can now restart the cluster manager on the standby node and do a cluster synchronization. I never experienced any problems when I tried this procedure in a test environment and skipped this step, but I still feel better doing it when working on a PROD system.

  • Stop the cluster manager on the standby system again (if you restarted it in the optional step) and update the rest of AIX and/or HACMP there. Because you made sure the RSCT daemons are already updated and at the same version, it won't do any harm if the versions of the other packages differ between the nodes.

  • Once the standby system has finished the update, restart cluster services on it and move the resource group to the standby system. This takeover will be your downtime.

  • Now shut down cluster services on the remaining node (the formerly active one) and update it. After the update has finished, restart cluster services and do a cluster synchronization. You are finished.
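
To make the ordering of the steps above easier to follow, here is a small Python sketch that only simulates them; it does not run any AIX or HACMP commands. The names `Node`, `check_invariant` and `rolling_update` and the version strings are hypothetical, chosen purely for illustration. The one rule it encodes is the one the procedure relies on: two nodes may only both be in the cluster while their RSCT levels match.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    rsct: str                 # RSCT level (hypothetical version string)
    aix: str                  # level of the rest of AIX/HACMP (hypothetical)
    cluster_up: bool = True   # is the cluster manager running on this node?
    holds_rg: bool = False    # does this node currently hold the resource group?


def check_invariant(a: Node, b: Node) -> None:
    """Both nodes may be in the cluster only if their RSCT levels match."""
    if a.cluster_up and b.cluster_up and a.rsct != b.rsct:
        raise RuntimeError("RSCT levels differ while both nodes are in the cluster")


def rolling_update(active: Node, standby: Node, new_rsct: str, new_aix: str) -> int:
    """Simulate the steps in order and return the number of takeovers used."""
    takeovers = 0

    # Step 1: stop the cluster manager on the standby node.
    standby.cluster_up = False

    # Step 2: update RSCT on both nodes while they cannot see each other.
    active.rsct = new_rsct
    check_invariant(active, standby)
    standby.rsct = new_rsct

    # Step 3 (optional): restart the standby, synchronize, then stop it again.
    standby.cluster_up = True
    check_invariant(active, standby)
    standby.cluster_up = False

    # Step 4: update the rest of AIX/HACMP on the standby node.
    standby.aix = new_aix

    # Step 5: restart cluster services on the standby and move the resource group.
    standby.cluster_up = True
    check_invariant(active, standby)
    active.holds_rg, standby.holds_rg = False, True
    takeovers += 1

    # Step 6: stop cluster services on the formerly active node, update, restart.
    active.cluster_up = False
    active.aix = new_aix
    active.cluster_up = True
    check_invariant(active, standby)

    return takeovers


# Example run with made-up version strings.
node_a = Node("nodeA", rsct="old", aix="old", holds_rg=True)
node_b = Node("nodeB", rsct="old", aix="old")
print("takeovers needed:", rolling_update(node_a, node_b, "new", "new"))
```

Between step 4 and step 6 the AIX levels of the two nodes differ, and the sketch deliberately does not treat that as an error; only an RSCT mismatch between two running cluster nodes does.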

I hope this helps.

bakunin
    #2  
03-11-2013
DGPickett (Forum Advisor)
So, is the HACMP as a whole never down, just degraded to fewer nodes? It seems like with an HA cluster, you can update hosts in rotation and return them to the pool, so only one host is down at a time.
    #3  
03-11-2013
bakunin (Forum Staff)
Quote:
Originally posted by DGPickett
So, is the HACMP as a whole never down, just degraded to fewer nodes?

Yes, that is exactly the point.

Quote:
Originally posted by DGPickett
It seems like with an HA cluster, you can update hosts in rotation and return them to the pool, so only one host is down at a time.

Yes and no. The point is that the HA communication is done via RSCT, and the versions of the RSCT packages have to be consistent throughout the cluster at all times. This is why you have to split the cluster into single nodes at one point (precisely the point where you update RSCT). During this phase communication would not be possible, but as each node is on its own at this time, it does not notice this inability to communicate.
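
One way to check this consistency requirement before letting nodes rejoin is to compare the RSCT fileset levels collected from each node (for instance from `lslpp -L` output). The following is a minimal Python sketch, assuming the levels have already been parsed into dictionaries of fileset name to level; the function name `rsct_mismatches` and the sample levels are made up for illustration.

```python
def rsct_mismatches(levels_a, levels_b):
    """Return {fileset: (level_on_a, level_on_b)} for rsct.* filesets that differ.

    A fileset missing on one node is reported with None for that node.
    Filesets outside rsct.* are ignored, since only RSCT has to match.
    """
    names = {n for n in set(levels_a) | set(levels_b) if n.startswith("rsct.")}
    return {
        n: (levels_a.get(n), levels_b.get(n))
        for n in sorted(names)
        if levels_a.get(n) != levels_b.get(n)
    }


# Hypothetical levels, not taken from a real system.
node_a = {"rsct.core.rmc": "3.1.4.0", "rsct.basic.rte": "3.1.4.0", "bos.rte": "6.1.8.0"}
node_b = {"rsct.core.rmc": "3.1.5.0", "rsct.basic.rte": "3.1.4.0", "bos.rte": "6.1.9.0"}

for fileset, (a, b) in rsct_mismatches(node_a, node_b).items():
    print(f"{fileset}: {a} on node A, {b} on node B")
```

An empty result would mean the RSCT levels agree, even if other filesets such as `bos.rte` are at different levels on the two nodes.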

I hope this helps.

bakunin
    #4  
03-12-2013
DGPickett (Forum Advisor)
It is sad that HA version n+ cannot discover and talk to version n as well as, when available, version n+. Backward compatibility has been a pretty common theme in the industry for many decades. Were they sloppy in their requirements? Is there no message version in the messaging?
    #5  
03-12-2013
MichaelFelt
Quote:
Originally posted by bakunin
The following procedure is proven to work with AIX 5.3, 6.x and 7.x and the associated HACMP/PowerHA versions. It needs only one takeover, so the downtime ranges from under a minute to a few minutes, depending on the nature of your resource group(s).

Communication in HACMP happens via RSCT, and for a cluster to work the versions of the RSCT packages have to be in sync. Fortunately it is easy to update RSCT independently of the rest of the OS.
... snip ...
I hope this helps.

bakunin

Looks good. However, have you also verified this with an update to SystemMirror (aka PowerHA v7)? As I understand it, SystemMirror is not using (only?) RSCT, but is using CAA (Cluster Aware AIX) for communication, topology and heartbeats. I do not do much with SystemMirror, so I am asking anyone who knows, just to be sure nobody gets surprised when working with or updating to SystemMirror.