Cluster failure reason


 
# 1  
Old 01-24-2014
I'd take a different starting point. Instead of looking for possible failures, I'd define which resources have to be up and running before you can say that the cluster node is OK. If you keep in mind that the final goal of a cluster is to guarantee the availability of a service, not to detect errors, your script may look a bit different, while the list of resources is pretty much what XrAy wrote.
# 2  
Old 01-24-2014
Yes, exactly. There are more possible failure modes than one brain can imagine, but only a short, finite list of things your cluster is supposed to provide and of resources it needs to run.
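
To make the idea concrete, here is a minimal sketch in Python of a "what must be up" check rather than a "what could fail" check. The resource names and the filesystem path are hypothetical placeholders; substitute the resources your own cluster really depends on.

```python
import shutil
from typing import Callable, Dict, List


def filesystem_has_space(path: str, min_free_bytes: int) -> bool:
    """True if `path` is accessible and has at least `min_free_bytes` free."""
    try:
        return shutil.disk_usage(path).free >= min_free_bytes
    except OSError:
        # A missing or unreadable mount point counts as "resource not OK".
        return False


def failed_resources(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check and return the names of the resources that are not OK."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            # A check that blows up is treated the same as a failed check.
            ok = False
        if not ok:
            failed.append(name)
    return failed


# Hypothetical resource list: the node is OK only if every entry passes.
REQUIRED = {
    "root filesystem": lambda: filesystem_has_space("/", 1),
}

if __name__ == "__main__":
    down = failed_resources(REQUIRED)
    print("node OK" if not down else "node NOT OK: " + ", ".join(down))
```

The point of the design is that the list stays short and explicit: adding a resource means adding one entry, and anything not on the list is by definition not part of the service.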
# 3  
Old 01-24-2014
Make sure you are pinging the persistent IP and NOT the service IP, because the service IP moves between the nodes, whereas the persistent IP is bound to one node.

You can check the cluster services and the cluster state, and write a wrapper script that sends an email if any of them goes south.
And of course, take into consideration all the valuable suggestions given by the other forum members.
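
A rough sketch of such a wrapper in Python follows. Everything specific in it is an assumption, not something from this thread: the IP address and mail addresses are placeholders, and it assumes that `lssrc -ls clstrmgrES` prints a `Current state:` line with `ST_STABLE` meaning healthy, which you should verify on your own PowerHA level.

```python
import re
import smtplib
import subprocess
from email.message import EmailMessage
from typing import List, Optional

PERSISTENT_IP = "192.0.2.10"            # hypothetical persistent IP of the node
ALERT_TO = "admin@example.com"          # hypothetical recipient
ALERT_FROM = "cluster-check@example.com"  # hypothetical sender


def ping_ok(address: str) -> bool:
    """Send one ICMP echo request; True if the ping command exits with 0."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", address],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def parse_cluster_state(lssrc_output: str) -> Optional[str]:
    """Extract the value of a 'Current state:' line (assumed output format)."""
    match = re.search(r"Current state:\s*(\S+)", lssrc_output)
    return match.group(1) if match else None


def cluster_state() -> Optional[str]:
    """Ask the cluster manager subsystem for its state; None if unavailable."""
    try:
        result = subprocess.run(
            ["lssrc", "-ls", "clstrmgrES"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return parse_cluster_state(result.stdout)


def collect_problems(ping_result: bool, state: Optional[str]) -> List[str]:
    """Turn the raw check results into human-readable problem lines."""
    problems = []
    if not ping_result:
        problems.append(f"persistent IP {PERSISTENT_IP} does not answer ping")
    if state != "ST_STABLE":
        problems.append(f"cluster manager state is {state or 'unknown'}")
    return problems


def build_alert(problems: List[str]) -> EmailMessage:
    """Build the alert mail; sending is kept separate so this can be tested."""
    msg = EmailMessage()
    msg["Subject"] = f"Cluster check: {len(problems)} problem(s)"
    msg["From"] = ALERT_FROM
    msg["To"] = ALERT_TO
    msg.set_content("\n".join(problems))
    return msg


def main() -> None:
    problems = collect_problems(ping_ok(PERSISTENT_IP), cluster_state())
    if problems:
        # Assumes a mail transfer agent is listening on localhost.
        with smtplib.SMTP("localhost") as smtp:
            smtp.send_message(build_alert(problems))
```

Calling `main()` from cron every few minutes would be one way to run it. Keeping the parsing and the mail building apart from the commands themselves lets you test the logic on a machine that has no cluster at all.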
# 4  
Old 01-24-2014
The backbone of high availability is the nodes in a cluster checking each other, so you should look into heartbeats. They are usually implemented with shared disk space, such as a rather small concurrent-access VG: the nodes write a few bits there, and from the freshness of those bits each node can decide who is still up and alive.
In addition there is heartbeating via network interfaces. Some setups even use, or used to use, serial interfaces and the like.
This is an important part of HACMP/PowerHA and other cluster technologies.
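
To illustrate the freshness idea only, here is a toy sketch in Python that uses an ordinary file in place of the shared disk. This is not how HACMP/PowerHA implements disk heartbeating; the file layout, the function names and the timeout are all made up for the example.

```python
import os
import tempfile
import time
from typing import Optional, Tuple


def write_heartbeat(path: str, node: str, now: Optional[float] = None) -> None:
    """Record '<node> <timestamp>' in the shared location."""
    stamp = time.time() if now is None else now
    tmp = f"{path}.{node}.tmp"
    with open(tmp, "w") as fh:
        fh.write(f"{node} {stamp}\n")
    # Rename into place so a reader never sees a half-written record.
    os.replace(tmp, path)


def read_heartbeat(path: str) -> Tuple[str, float]:
    """Return (node name, timestamp) from the heartbeat record."""
    with open(path) as fh:
        node, stamp = fh.read().split()
    return node, float(stamp)


def peer_alive(path: str, max_age: float, now: Optional[float] = None) -> bool:
    """A peer counts as alive if its last heartbeat is at most max_age seconds old."""
    current = time.time() if now is None else now
    try:
        _, stamp = read_heartbeat(path)
    except (OSError, ValueError):
        # No record, or a garbled one: treat the peer as not alive.
        return False
    return current - stamp <= max_age
```

For example, if node A calls `write_heartbeat(path, "nodeA")` every second, node B can call `peer_alive(path, max_age=10)` and treat A as down once the record is more than ten seconds stale. A real cluster also has to deal with clock differences and with both nodes losing sight of each other at once, which this sketch ignores.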

Have a look here:
Heartbeating in HACMP - AIX 6.1 Information Center