Cluster node not starting


 
# 1  
Old 06-08-2013
Cluster node not starting

Setting up HACMP 6.1 on a two-node cluster. One node works fine and starts properly, reaching the STABLE state (VGs varied on, FS mounted, service IP aliased). However, the other node is always stuck in the ST_JOINING state. It takes forever, and you can't stop the cluster or recover from script failure either. I can't see any errors in hacmp.out.

Here's the latest error I see in clstrmgr.debug (this is from the console, so I just typed it here):

Code:
getPriorityOverride: Returning 0 for the nodehandle:2
getPriorityOverrideSecondary: Returning 0 for the nodehandle:2
rm_CreateAllPolMsg: node NODE02 has group RG1 in node 1
getPriorityOverride: Returning 0 for the nodehandle:4
getPriorityOverrideSecondary: Returning 0 for the nodehandle:4
Before Sending: Message Length is 4232   NumResStates:2NumPols:2  numSSitePols:0  join_data_valid:0
rm_ProcessnPhaseCb: Voting to CONTINUE my join w/msg.seq_no1 packet_count:

---------- Post updated at 03:52 AM ---------- Previous update was at 03:47 AM ----------

One more thing to add: I can start the cluster on either node as long as no other node has been started. That is, I can start the cluster and the RG on either node1 or node2, but if I start it on node1, node2 won't come up via clstart and shows ST_JOINING forever. Thus, I cannot fail over to the other node, since that requires the other node to be in a stable state.

Last edited by Scrutinizer; 06-08-2013 at 06:22 AM.. Reason: code tags
# 2  
Old 06-09-2013
I can't be sure of the reason without further information, but to me this looks like a communication problem. Check all the cluster networks (see "cllsif") for connectivity and make sure the disk heartbeat works as expected.

Another possible reason which comes to mind is the VG: make sure it is varied on in "enhanced concurrent" mode. Maybe there are disk reservations left over somehow: issue a "varyonvg -b -u" to break disk reservations.
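A rough sketch of the VG check, in case it helps. The function name is made up, "datavg" is only a placeholder VG name, and the "Concurrent: Enhanced-Capable" field is my assumption about how "lsvg" labels such a VG, so compare it against the real output on your system first.

```shell
# Succeeds if `lsvg <vgname>` output (read from stdin) marks the volume
# group as enhanced-concurrent capable.
# Assumption: lsvg prints a "Concurrent: Enhanced-Capable" field for such VGs.
is_enhanced_concurrent() {
    grep -q 'Concurrent:[[:space:]]*Enhanced-Capable'
}

# Hypothetical usage on an AIX node ("datavg" is a placeholder VG name):
#   lsvg datavg | is_enhanced_concurrent && echo "enhanced-concurrent capable"
#   varyonvg -b -u datavg   # -b breaks reservations, -u leaves the disks unlocked
```

The AIX commands are left as comments on purpose, so nothing touches the disks until you have checked the VG name yourself.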

I hope this helps.

bakunin
# 3  
Old 06-09-2013
It feels like the synchronization was "forced". As bakunin mentions, it is most likely a communication problem.

Basic test: start all nodes, but do not start any resource groups. I suspect you will not be able to start all nodes.
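To watch the state on each node while running that test, something like the sketch below might do. The function name is made up, and it assumes the "lssrc -ls clstrmgrES" output contains a line of the form "Current state: ST_...", so adjust the pattern if yours looks different.

```shell
# Print the cluster manager state (e.g. ST_STABLE, ST_JOINING) found in
# `lssrc -ls clstrmgrES` output supplied on stdin.
# Assumption: the output has a line of the form "Current state: ST_...".
cluster_state() {
    sed -n 's/^Current state:[[:space:]]*\([A-Z0-9_]*\).*/\1/p' | head -n 1
}

# On each cluster node one might run (not executed here):
#   lssrc -ls clstrmgrES | cluster_state
```

If both nodes report ST_STABLE with no resource groups online, the join itself works and the RG configuration is the next thing to look at.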