

hacmp in a 7 node configuration ?


 
# 1  
Old 08-28-2009
hacmp in a 7 node configuration ?

Hi Guys,

I have to design a multinode hacmp cluster and am not sure if the design I am thinking of makes any sense.
I have to make an environment that currently resides on 5 nodes more resilient, but I have the constraint of only having 4 frames. In addition, the business doesn't want to pay for more LPARs than they have to. So I would like to go for a 5 active + 2 spare cluster with the following setup:

Quote:
F1: A1 + A2
F2: A3 + S1
F3: A4 + S2
F4: A5

A1 could failover to S1
A2 could failover to S2
A3 could failover to S2
A4 could failover to S1
A5 could failover to S1 or S2 (this is the most important environment).
A1 needs more resources than the others, so I would give S1 the same amount of resources as A1; all the others need about half as much, so S2 would get that amount.
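To sanity-check the takeover mapping and the spare sizing, a small sketch (node names are the ones from the layout above; the resource units are illustrative assumptions of mine, not actual figures):

```python
# Hypothetical model of the proposed 5+2 layout: which spare(s) each
# active node may fail over to, and roughly how big each LPAR is
# (A1 = 2 units, all others = 1 unit -- illustrative numbers only).
failover = {
    "A1": ["S1"],
    "A2": ["S2"],
    "A3": ["S2"],
    "A4": ["S1"],
    "A5": ["S1", "S2"],  # most important: two takeover candidates
}
size = {"A1": 2, "A2": 1, "A3": 1, "A4": 1, "A5": 1}

def worst_case_load(spare):
    """Resources a spare needs if every node that can land on it fails."""
    return sum(size[a] for a, spares in failover.items() if spare in spares)

print(worst_case_load("S1"))  # A1 + A4 + A5 -> 4 units
print(worst_case_load("S2"))  # A2 + A3 + A5 -> 3 units
```

Note the sketch highlights that sizing S1 like A1 covers a single failure only: if several actives land on the same spare at once, the spare would need the worst-case sum.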

Do I have to make all storage (about 3-5 TB per LPAR) visible to all nodes, or just to the nodes that are intended to take over the resources?

Does this sound sensible, or am I forgetting anything in this setup? Does anyone run a similar setup and can share their experiences?

I appreciate your comments.

Kind regards
zxmaus

Last edited by zxmaus; 08-28-2009 at 04:41 AM..
# 2  
Old 08-28-2009
Hm, not sure if HACMP will allow this kind of configuration. But to be honest, with clusters I would try to keep things as simple as possible, so maybe have one backup node for each of them. I know that this is not a great help, but I would stay with two-node clusters for each of the applications. No idea what kind of applications these are, whether HACMP + Oracle RAC is an option, etc., or whether you can consolidate any of the apps/prod nodes to free up more LPARs as backup nodes.
# 3  
Old 08-28-2009
How many locations are involved (servers and SAN)?
# 4  
Old 08-28-2009
Hi Shockneck,

everything is in the same datacentre and will be served by the same SAN infrastructure - EMC Symmetrix with PowerPath - RAID5 on the SAN side, dedicated adapter pairs on the LPAR side - no VIO except for management purposes.

According to SAN Engineering it is no problem to make the disks visible on each required node - even visibility on all 7 nodes is not a problem according to them.

This proposal is for one site only - we will have similar setups for UAT and COB in another datacentre in another country - with database replication between PROD and COB.

Kind regards
zxmaus
# 5  
Old 08-28-2009
Quote:
Do I have to make all storage (about 3-5 TB per LPAR) visible to all nodes, or just to the nodes that are intended to take over the resources?
You'll have to make the storage visible to the prod nodes and their backup nodes, of course, so they can take over in case of failure.
So zoning and masking should ensure the following:

S1 can see disks of A1, A4, A5
S2 can see disks of A2, A3, A5.
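Those visibility sets fall out mechanically from the failover map. A sketch (node names from the thread; the derivation logic is mine):

```python
# Which active nodes list each spare as a takeover candidate,
# per the layout proposed earlier in the thread.
failover = {"A1": ["S1"], "A2": ["S2"], "A3": ["S2"],
            "A4": ["S1"], "A5": ["S1", "S2"]}

def required_visibility(spare):
    """Disk sets a spare must be zoned to see: one per active node
    that names this spare as a takeover candidate."""
    return sorted(a for a, spares in failover.items() if spare in spares)

print(required_visibility("S1"))  # ['A1', 'A4', 'A5']
print(required_visibility("S2"))  # ['A2', 'A3', 'A5']
```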

But I don't know if a node can be in more than one cluster, which would be the case for S1 and S2. I guess Shockneck will tell us :)

Last edited by zaxxon; 08-28-2009 at 10:17 AM..
# 6  
Old 08-28-2009
Hi Zaxxon,

all these nodes are planned to be in the same cluster - I will have one more cluster as a DR solution and one more cluster as UAT - all of them will have 7 LPARs each when I am done :)

I have just spoken to the SAN guys - they don't see any problems making all 7 nodes see the same disks ... one problem less :)

Kind regards
Nicki
# 7  
Old 08-30-2009
Quote:
Originally Posted by zxmaus
[...]
I have to design a multinode hacmp cluster [...] I have to make an environment that currently resides on 5 nodes more resilient but I have the constraint of only having 4 frames. [...] So I would like to go for a 5 active + 2 spare cluster with the following setup:
Code:
F1: A1 + A2
F2: A3 + S1
F3: A4 + S2
F4: A5

A1 could failover to S1
A2 could failover to S2
A3 could failover to S2
A4 could failover to S1
A5 could failover to S1 or S2 (this is the most important environment).

A1 has higher need in resources so I would give S1 the same amount of resources A1 has and all others need about half as much resources, so S2 would have their amount.

Do I have to make all storage (about 3-5 TB per LPAR) visible to all nodes, or just to the nodes that are intended to take over the resources?
[...]
Current HACMP versions support clusters of up to 32 nodes, so a combination of five active and two passive nodes is supported. However, from my point of view, setting up this cluster is not the problem - operating it is. HACMP clusters work great in a well-defined environment that is thoroughly tested before going live and not changed afterwards. Even in a stable environment, however, there is administration work to do. Naturally it takes more effort to keep all software and microcode on the same (latest) level across seven nodes. Different levels within a cluster are supported only for a short time during a node upgrade (i.e. one or two days). It will also be more complicated to develop a sensible test scenario, because many different problems can be combined. For the same reason it will be more complicated to bring the cluster back to normal operation during disaster recovery.
So, to keep daily operation and disaster recovery simple, a cluster with fewer nodes is preferable. For the same reason many clusters use a clear active-passive design, as Zaxxon hinted. This is just what one ought to be aware of; I don't mean to hold you back. Personally I'd happily run such a seven-node cluster under one condition: only trained HACMP admins have the root password. If your DBAs are allowed root on the cluster nodes - forget it.

Now to the technical details. A cluster node cannot be a member of two different clusters. The Resource Group definitions control which nodes are used by which application. So within the cluster your A1 Resource Group would use nodes A1 and S1, and A5's RG would use A5, S1 and S2, in that order. While you can define an RG's nodes in any order you like, keep in mind that in a split-brain condition nodes are sorted alphanumerically by name to decide which node to power off. If your node names are like A1, S1 and so on, that should not turn out to be a problem.
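The naming point can be illustrated with a one-liner (a sketch of the ordering principle only; which end of the sort order is powered off is a detail of the HACMP version, so this just shows that the active nodes sort ahead of the spares with this naming scheme):

```python
# With the naming scheme from the thread, the active nodes (A*) sort
# alphanumerically before the spare nodes (S*), which is what makes
# the split-brain tie-break predictable here.
nodes = ["S1", "A1", "A5", "S2", "A2"]
print(sorted(nodes))  # ['A1', 'A2', 'A5', 'S1', 'S2']
```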

I don't know offhand whether it is possible to use zoning to restrict access to shared VG disks to certain nodes only. Cluster LVM operations usually require a PV to be visible and accessible on every cluster node. If that prerequisite applies only to the nodes belonging to the Resource Group you could use zones; otherwise all of the cluster's shared disks need to be visible on all cluster nodes. Even if zoning were possible, it would lead to duplicated hdisk numbers with different content behind them, which IMHO would be a source of confusion. To avoid that, I'd assign the disks one after the other to the seven nodes, so that the same LUN gets the same hdisk number on every node. All hdiskpower devices then have their reserve_lock set to no. (In December 08 I discovered a nasty bug in EMC PowerPath that should have been fixed by March or April 09: the box said that the reserve was off while in reality it was on. Your SAN colleagues are probably aware of that bug, but make sure you use the latest PowerPath fix level anyway!) Use Enhanced Concurrent Mode VGs and the cluster stays in control of who accesses which disk.

You are going to have non-TCP/IP networks. These are point-to-point connections, and you need to think about whether you want a heartbeat ring (over all frames) or one or more stars (from frame to frame). Probably you are going to use heartbeat over disk and thus disk heartbeat devices. If you do, you can use the data ECM VG for the heartbeat, but you might consider dedicated (very small, e.g. 1 PP) disk devices instead. While those disks (LUNs) live on the same EMC box, dedicated devices make it easier to see what is going on in the cluster, plus the heartbeat keeps working even while the Resource Group is offline. If you intend to use RS232 heartbeat you will very likely end up with a heartbeat ring.
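The ring-versus-star trade-off can be sketched as link lists (node names from the thread; the link-building logic is my illustration, not an HACMP tool):

```python
# Hypothetical sketch: point-to-point non-IP heartbeat links for the
# seven nodes, as a ring versus a star centred on one node.
nodes = ["A1", "A2", "A3", "S1", "A4", "S2", "A5"]

def ring_links(ns):
    """Each node heartbeats with its neighbour; the last closes the ring."""
    return [(ns[i], ns[(i + 1) % len(ns)]) for i in range(len(ns))]

def star_links(ns, hub):
    """Every node heartbeats with a single hub node."""
    return [(hub, n) for n in ns if n != hub]

print(len(ring_links(nodes)))        # 7 links, survives one broken link
print(len(star_links(nodes, "S1")))  # 6 links, but the hub is a single point
```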

These comments just for the points you mentioned so far. There is probably much more to think about and problems might arise during implementation.

