Operating Systems AIX Crash dump and Panic message : RSCT Dead Man Switch Timeout for HACMP; halting non-responsive node Post 303042416 by zxmaus on Tuesday 24th of December 2019 01:08:46 AM
You lost all heartbeats from node 1 to node 2 - that's the reason for the crash. This can happen when your system is simply too busy, but since you should have both heartbeat on disk and heartbeat via network, you would expect there to be enough time to send at least one heartbeat every couple of seconds. Your cluster heartbeat settings might be too tight - giving the heartbeat more time might help prevent this issue in the future.
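To make the "too tight" point concrete, here is a rough sketch of the arithmetic, not the real HACMP/RSCT tunables. The names `interval_s` and `missed_allowed` and all the numbers are made up for illustration; check your own cluster's failure detection settings for the actual values and the exact formula.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatSettings:
    """Hypothetical tuning knobs; these are NOT actual HACMP/RSCT parameter names."""
    interval_s: float     # seconds between heartbeats
    missed_allowed: int   # consecutive misses tolerated before the peer is declared dead


def detection_time_s(settings: HeartbeatSettings) -> float:
    """Approximate silence, in seconds, tolerated before a node is declared dead."""
    return settings.interval_s * settings.missed_allowed


def survives_stall(settings: HeartbeatSettings, stall_s: float) -> bool:
    """True if a stall of stall_s seconds stays inside the detection window."""
    return stall_s < detection_time_s(settings)


# Illustrative values only (assumptions, not recommendations).
tight = HeartbeatSettings(interval_s=1.0, missed_allowed=5)
relaxed = HeartbeatSettings(interval_s=2.0, missed_allowed=15)

if __name__ == "__main__":
    stall = 12.0  # seconds the busy node fails to send anything
    for name, cfg in (("tight", tight), ("relaxed", relaxed)):
        print(name, detection_time_s(cfg), survives_stall(cfg, stall))
```

With settings like `tight`, a node that is too busy to heartbeat for 12 seconds is already outside the window and gets halted; with `relaxed` the same stall is tolerated, at the price of slower detection of a genuinely dead node.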
Just out of curiosity - using GPFS, HACMP and RAC on the same systems looks to me like a completely unnecessary setup, as you are essentially running 3 different cluster products on a system when RAC alone would suffice. Why?
CL_STATUS(1)                      User commands                      CL_STATUS(1)

NAME
       cl_status - Check status of the High-Availability Linux (Linux-HA) subsystem

SYNOPSIS
       cl_status sub-command options parameters

DESCRIPTION
       cl_status is used to check the status of the High-Availability Linux subsystem.

SUPPORTED SUB-COMMANDS
       hbstatus
           Indicate if heartbeat is running on the local system.

       listnodes
           List the nodes in the cluster.

       nodetype ping|normal
           List the nodes of the given type.

           Note: Ping nodes are obsolete in Pacemaker cluster, having been
           replaced with the pingd resource agent.

       listhblinks node
           List the network interfaces used as heartbeat links. node should be
           specified as listed in the ha.cf(5) file for the cluster.

       hblinkstatus node link
           Show the status of a heartbeat link. node should be specified as
           listed in the ha.cf(5) file for the cluster. link should be as per
           the output of the listhblinks subcommand.

       clientstatus node client [timeout]
           Show the status of heartbeat clients. node and client should be
           specified as listed in the ha.cf(5) file for the cluster. Timeout is
           in milliseconds, the default is 100ms.

       rscstatus
           Show the status of cluster resources. Status will be one of: local,
           foreign, all or none.

           Note: This option is deprecated, it is obsolete in Pacemaker clusters.

       parameter -p parameter
           Retrieve the value of cluster parameters. The parameters may be one
           of the following: apiauth, auto_failback, baud, debug, debugfile,
           deadping, deadtime, hbversion, hopfudge, initdead, keepalive,
           logfacility, logfile, msgfmt, nice_failback, node, normalpoll,
           stonith, udpport, warntime, watchdog.

           Note: Some of these options are deprecated; see ha.cf(5).

OPTIONS
       The following options are supported by heartbeat:

       -m
           Make the output more human readable. The default output should be
           easier for scripts to parse. Available with all commands.

       -p
           List only 'ping' nodes. Available with listnodes sub-command.

           Note: Ping nodes are obsolete in Pacemaker cluster, having been
           replaced with the pingd resource agent.

       -n
           List only 'normal' nodes. Available with listnodes sub-command.

SEE ALSO
       heartbeat(8), ha.cf(5), authkeys(5)

AUTHORS
       Alan Robertson <alanr@unix.sh>
           cl_status

       Juan Pedro Paredes Caballero <juampe@retemail.es>
           man page

       Simon Horman <horms@verge.net.au>
           man page

       Florian Haas <florian.haas@linbit.com>
           man page

Heartbeat 3.0.5                    24 Nov 2009                       CL_STATUS(1)
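
The parameter list above includes the timing values warntime and deadtime. As a rough illustration of how two such thresholds relate, here is a small sketch that classifies a peer by how long it has been silent. The function name `classify_node`, the numbers, and the exact boundary handling (>= rather than >) are assumptions made for this example; the authoritative semantics are in ha.cf(5).

```python
def classify_node(seconds_since_last_heartbeat: float,
                  warntime: float,
                  deadtime: float) -> str:
    """Classify a peer as "ok", "late" or "dead" from its heartbeat silence.

    warntime and deadtime borrow the parameter names from the man page;
    the threshold logic here is an illustrative assumption.
    """
    if not 0 < warntime < deadtime:
        raise ValueError("expected 0 < warntime < deadtime")
    if seconds_since_last_heartbeat >= deadtime:
        return "dead"   # silence long enough to declare the node dead
    if seconds_since_last_heartbeat >= warntime:
        return "late"   # heartbeat overdue, worth a warning
    return "ok"


if __name__ == "__main__":
    # Hypothetical settings: warn after 10 s of silence, declare dead after 30 s.
    for silence in (2.0, 12.0, 45.0):
        print(silence, classify_node(silence, warntime=10.0, deadtime=30.0))
```

If "late" warnings show up regularly in the logs while the node is merely busy, that is a hint the dead threshold is too close to normal load-induced delays.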