12-19-2019
Crash dump and Panic message : RSCT Dead Man Switch Timeout for HACMP; halting non-responsive node
Dear all,
I have two AIX systems:
- Model: P770
- OS version: AIX 6.1
- Patch level: 6100-07-04-1216
- HA version: HACMP v6.1.0.8
- Hosts: A, B
Last Wednesday, my B system suddenly went down with a crash dump. About one minute later, the A system went down with a crash dump as well. I checked the dump of the A system using the kdb command and found the following:
PANIC MESSAGES:
RSCT Dead Man Switch Timeout for HACMP; halting non-responsive node
So I concluded that this message is the reason the A system went down. Is my judgment correct?
I also looked up what the dead man switch is, but I did not understand it well.
Why did the dead man switch bring down the A system?
Could anybody please explain this to me?
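For reference, here is a sketch of the commands typically used on AIX to confirm this kind of panic (the command names are real AIX/HACMP tools, but the dump path shown is only a common default and output varies by level, so treat this as an assumption rather than a verified procedure):

```shell
# Open the system dump with kdb (dump path is a typical default; adjust as needed)
kdb /var/adm/ras/vmcore.0 /unix
# At the kdb prompt, the 'stat' subcommand prints the panic string,
# e.g. the "RSCT Dead Man Switch Timeout for HACMP" line:
#   (0)> stat

# Check the AIX error log for entries around the crash time
errpt -a | more

# While the cluster is running, show Topology Services heartbeat status;
# the Dead Man Switch fires when Topology Services is starved long enough
# to miss its timeout
lssrc -ls topsvcs
```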
10 More Discussions You Might Find Interesting
1. UNIX for Dummies Questions & Answers
Help, what is the difference between a core dump and a panic dump? (1 Reply)
Discussion started by: aileen
2. HP-UX
Hi friends,
I know that when there is a crash, the memory image is put into /var/adm/crash. But if the system hangs and I have access to the console of that machine, how can I take the crash dump manually?
Thanks. (2 Replies)
Discussion started by: mxms755
3. Solaris
Can anyone help me with enabling crash dump on Solaris 5.5.1? (1 Reply)
Discussion started by: csreenivas
4. AIX
Hi Guys,
I have two nodes clustered. Each node is AIX 5.2, and they are clustered with HACMP 5.2. The mode of the cluster is Active/Passive, which means one node is the active node and has all resource groups on it, and the second node is standby.
Last Monday I noted that all resource groups have been... (2 Replies)
Discussion started by: aldowsary
5. Solaris
Hi,
I have a machine that crashed.
How can I enable the core dump file, and how can I find it? (4 Replies)
Discussion started by: lid-j-one
6. UNIX for Advanced & Expert Users
Hi.
I have started heartbeat on two Red Hat servers, using eth0.
Before I start heartbeat, the two servers can ping each other.
Once I start heartbeat, both servers become active, as each warns that the other node is dead.
The servers are also no longer able to ping each other. After stopping... (1 Reply)
Discussion started by: amrita garg
7. AIX
Hi Guys,
I have to design a multinode HACMP cluster and am not sure if the design I am thinking of makes any sense.
I have to make an environment that currently resides on 5 nodes more resilient, but I have the constraint of only having 4 frames. In addition, the business doesn't want to pay for... (7 Replies)
Discussion started by: zxmaus
8. AIX
Hi,
I had an active/passive cluster. Node A went down and all resource groups moved to Node B.
Now we have brought Node A back up. What is the procedure to bring everything back to Node A?
Node A # lssrc -a | grep cl
clcomdES clcomdES 323782 active
clstrmgrES cluster... (9 Replies)
Discussion started by: samsungsamsung
9. HP-UX
Hi Experts,
I have configured an HP-UX ServiceGuard cluster, and it dumps a crash every time I reboot a cluster node. Can anyone please help me prevent these unnecessary crash dumps when rebooting an SG cluster node?
Thanks in advance.
Vaishey (2 Replies)
Discussion started by: Vaishey
10. OS X (Apple)
MacPro (2013) 12-Core, 64GB RAM (today's crash):
panic(cpu 2 caller 0xffffff7f8b333ad5): userspace watchdog timeout: no successful checkins from com.apple.WindowServer in 120 seconds
service: com.apple.logd, total successful checkins since load (318824 seconds ago): 31883, last successful... (3 Replies)
Discussion started by: Neo
LEARN ABOUT OSF1
expand_dump
expand_dump(8) System Manager's Manual expand_dump(8)
NAME
expand_dump - Produces a non-compressed kernel crash dump file
SYNOPSIS
/usr/sbin/expand_dump input-file output-file
DESCRIPTION
By default, kernel crash dump files (vmzcore.#) are compressed during the crash dump. Compressed core files can be examined by the latest
versions of debugging tools that have been recompiled to support compressed crash dump files. However, not all debugging tools may be
upgraded on a given system, or you may want to examine a crash dump from a remote system using an older version of a tool. The expand_dump
utility produces a file that can be read by tools that have not been upgraded to support compressed crash dump files. This non-compressed
version can also be read by any upgraded tool.
This utility can only be used with compressed crash dump files, and does not support any other form of compressed file. You cannot use
other decompression tools such as compress, gzip, or zip on a compressed crash dump file.
Note that the non-compressed file will require significantly more disk storage space, as compression ratios of up to 60:1 are possible. Check the
available disk space before running expand_dump, and estimate the size of the non-compressed file as follows: Run tests by halting your system
and forcing a crash as described in the Kernel Debugging manual. Use an upgraded debugger to determine the value of the variable dumpsize.
Multiply this value by the 8 KB page size to approximate the required disk space of the non-compressed crash dump. Alternatively, run
expand_dump with the output file directed to /dev/null, noting the size of the file that is printed when expand_dump completes its task.
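The size estimate described above can be sketched as a small shell calculation (the dumpsize value here is purely illustrative; obtain the real one from an upgraded debugger as the manual describes):

```shell
# Hypothetical value of the kernel variable dumpsize (in 8 KB pages),
# as reported by the debugger; substitute the real value for your system.
dumpsize=131072

# Approximate non-compressed dump size: dumpsize multiplied by the 8 KB page size.
bytes=$((dumpsize * 8192))
echo "Estimated non-compressed dump size: $bytes bytes (~$((bytes / 1024 / 1024)) MB)"
```

Compare the result against the free space reported by df for the target filesystem before running expand_dump.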
RETURN VALUES
Successful completion of the decompression.
The user did not supply the correct number of command line arguments.
The input file could not be read.
The input file is not a compressed dump, or is corrupted.
The output file could not be created or opened for writing and truncated.
There was some problem writing to the output file (probably a full disk).
The input file is not formatted consistently. It is probably corrupted.
The input file could not be correctly decompressed. It is probably corrupted.
EXAMPLES
expand_dump vmzcore.4 vmcore.4
SEE ALSO
Commands: dbx(1), kdbx(8), ladebug(1), savecore(8)
Kernel Debugging
System Administration