Full Discussion: Cluster failure reason
Operating Systems > AIX: Post 302885312 by zaxxon, Friday 24th of January 2014, 02:29:35 PM
The backbone of high availability is the nodes in a cluster checking on each other, so you should look into heartbeats. Disk heartbeats are usually implemented over a shared piece of disk space, such as a rather small concurrently accessible VG, into which each node writes a few bits; from the freshness of those writes the nodes can decide who is still up and alive. A sketch of the idea follows below.
Additionally there is heartbeating via network interfaces, and some setups even use (or used) serial interfaces.
This is an important part of HACMP/PowerHA and other cluster technologies.

Have a look here:
Heartbeating in HACMP - AIX 6.1 Information Center
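To make the disk-heartbeat idea concrete, here is a minimal sketch in Perl. It is illustrative only, not HACMP/PowerHA code: the device path, slot layout, and staleness threshold are invented for the example. Each node periodically stamps its own slot on the shared disk with the current time, and a peer whose stamp stops getting fresher is presumed down.

Code:
#!/usr/bin/perl
# Toy disk heartbeat: each node owns a fixed 512-byte slot on a shared
# device and stamps it with the current epoch time; peers are judged by
# the freshness of their stamps. Illustrative only, NOT how HACMP/PowerHA
# implements its heartbeat internally.
use strict;
use warnings;
use Fcntl qw(O_RDWR);

my $device    = '/dev/hbdisk';  # hypothetical shared LUN / concurrent VG LV
my $slot_size = 512;            # one sector per node
my $stale     = 10;             # seconds without an update => presumed down
my $my_slot   = 0;              # this node's slot number
my @peers     = (1, 2);         # slot numbers of the other nodes

sysopen(my $fh, $device, O_RDWR) or die "cannot open $device: $!";

# Write our own heartbeat: the current time, space-padded to the slot size.
sysseek($fh, $my_slot * $slot_size, 0) or die "seek: $!";
my $wrote = syswrite($fh, pack("A$slot_size", time()));
die "write: $!" unless defined $wrote && $wrote == $slot_size;

# Read each peer's slot and judge liveness by the stamp's freshness.
for my $slot (@peers) {
    sysseek($fh, $slot * $slot_size, 0) or die "seek: $!";
    sysread($fh, my $buf, $slot_size) or die "read: $!";
    my ($stamp) = $buf =~ /^(\d+)/;
    if (defined $stamp && time() - $stamp <= $stale) {
        print "node in slot $slot is alive\n";
    } else {
        print "node in slot $slot looks dead (stale or empty slot)\n";
    }
}
close($fh);

In a real cluster this runs as a permanent loop on every node, and the disk path is evaluated alongside network (and historically serial) heartbeats before a node is declared failed, which is why losing a single heartbeat path should not by itself trigger a takeover.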

8 More Discussions You Might Find Interesting

1. High Performance Computing

Building a Solaris Cluster Express cluster in a VirtualBox on OpenSolaris

Provides a description of how to set up a Solaris Cluster Express cluster in a VirtualBox on OpenSolaris. More... (0 Replies)
Discussion started by: Linux Bot

2. High Performance Computing

Sun Cluster vs. Veritas Cluster

Dear all, can anyone explain the pros and cons of Sun Cluster and Veritas Cluster? Any comparison chart is highly appreciated. Regards, RAA (4 Replies)
Discussion started by: RAA

3. Solaris

Sun Cluster 3.2.2 Apache HA failure, or kludge?

Hi folks, season's greetings. Hope you had a good festive season. I've got two related problems on the same Sun Cluster 3.2.2 / Apache 2.0.63 cluster: clsetup error: ERROR: Failed to get connection to node localhost SunOS... (0 Replies)
Discussion started by: cluster

4. Solaris

Sun Cluster and Veritas Cluster question.

Yesterday my customer told me to expect a VCS upgrade to happen in the future. He also plans to stop using HDS and move to EMC. I am thinking how to migrate to a Sun Cluster setup instead. My plan is as follows: leave the existing VCS intact as a fallback plan, then install and build Sun Cluster on... (5 Replies)
Discussion started by: sparcguy

5. UNIX for Dummies Questions & Answers

Boot-up failure of SCO UNIX after power failure

Hi, the power went out. The next day SCO UNIX won't boot up, error code 303. Any help appreciated as we are clueless. (11 Replies)
Discussion started by: fredthayer

6. Solaris

Sun Cluster 4.0 - zone cluster failover doubt

Hello experts - I am planning to install a Sun Cluster 4.0 zone cluster failover; a few basic doubts. (1) Where should I install the cluster software binaries? (the global zone, or the container zone where I am planning to install the zone failover) (2) Or should I perform the installation on... (0 Replies)
Discussion started by: NVA

7. Red Hat

Problem in Red Hat cluster node during network failure or hang mode

Hi, we have many Red Hat Linux servers with a cluster facility for the availability of services like HTTPD/MySQL. We face issues related to power disturbance/fluctuation or network failure. There are two cluster nodes configured in... (0 Replies)
Discussion started by: hirenkmistry

8. Shell Programming and Scripting

How to get the reason for a ping failure using Perl's Net::Ping->new("icmp")?

Hi, I am using Perl to ping a list of nodes, with the script below:
Code:
$p = Net::Ping->new("icmp");
if ($p->ping($host,1)){
    print "$host is alive.\n";
} else {
    print "$host is unreachable.\n";
}
$p->close();
... (4 Replies)
Discussion started by: tavanagh
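On that last question: Net::Ping's ping() only reports up or down; it does not expose a failure reason. A common workaround, sketched below under the assumption that a system ping(1) binary is available (its flags and output format vary by platform), is to fall back to the external ping and keep its output and exit status as the diagnostic:

Code:
#!/usr/bin/perl
# Sketch: when Net::Ping says a host is down, shell out to ping(1) and
# capture its output/exit status as a human-readable "reason".
use strict;
use warnings;
use Net::Ping;

my $host = shift @ARGV or die "usage: $0 host\n";

my $p = Net::Ping->new("icmp");   # icmp mode requires root privileges
if ($p->ping($host, 1)) {
    print "$host is alive.\n";
} else {
    my $out = `ping -c 1 $host 2>&1`;   # -c is Linux/AIX style; adjust per OS
    my $rc  = $? >> 8;
    print "$host is unreachable (ping exit code $rc):\n$out";
}
$p->close();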
VOTEQUORUM_OVERVIEW(8)				    Corosync Cluster Engine Programmer's Manual 			    VOTEQUORUM_OVERVIEW(8)

NAME
       votequorum_overview - Votequorum Library Overview

OVERVIEW
       The votequorum library is delivered with the corosync project. It is the external interface to the vote-based quorum service. This
       service is optionally loaded into all nodes in a corosync cluster to avoid split-brain situations. It does this by having a number
       of votes assigned to each system in the cluster and ensuring that cluster operations are allowed to proceed only when a majority
       of the votes are present.

       The library provides a mechanism to:

       * Query the quorum status
       * Get a list of nodes known to the quorum service
       * Receive notifications of quorum state changes
       * Change the number of votes assigned to a node
       * Change the number of expected votes for a cluster to be quorate
       * Connect an additional quorum device to allow small clusters to remain quorate during node outages

       votequorum reads its configuration from the objdb. The following keys are read when it starts up:

       * quorum.expected_votes
       * quorum.votes
       * quorum.quorumdev_poll
       * quorum.disallowed
       * quorum.two_node

       Most of those values can be changed while corosync is running, with the following exceptions: quorum.disallowed cannot be changed,
       and two_node cannot be set on the fly, though it can be cleared, i.e. you can start with two nodes in the cluster and add a third
       without rebooting all the nodes.

BUGS
       This software is not yet production, so there may still be some bugs.

SEE ALSO
       corosync-quorumtool(8), votequorum_initialize(3), votequorum_finalize(3), votequorum_fd_get(3), votequorum_dispatch(3),
       votequorum_context_get(3), votequorum_context_set(3)

corosync                                                        2009-01-26                                        VOTEQUORUM_OVERVIEW(8)
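To illustrate the majority rule the overview describes, here is a small Perl sketch of the arithmetic only. The threshold floor(expected_votes / 2) + 1 is the usual strict-majority rule; corosync's internal calculation, in particular the two_node special case, may differ in detail.

Code:
#!/usr/bin/perl
# Majority-vote arithmetic behind a vote-based quorum service,
# illustrative only: a partition may proceed only if it holds a strict
# majority of the expected votes, so two halves of a split cluster can
# never both believe they are quorate.
use strict;
use warnings;

sub quorum_threshold {
    my ($expected_votes) = @_;
    return int($expected_votes / 2) + 1;   # strict majority
}

for my $expected (2, 3, 5) {
    my $q = quorum_threshold($expected);
    for my $present (1 .. $expected) {
        printf "expected=%d present=%d => %s\n",
            $expected, $present,
            $present >= $q ? "quorate" : "NOT quorate";
    }
}

This arithmetic is also why the quorum.two_node key exists: under a strict majority a two-node cluster loses quorum as soon as either node dies, so two-node mode relaxes the rule and leans on fencing to avoid split-brain. corosync-quorumtool(8), listed under SEE ALSO, reports the live values from a running cluster.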