Can anyone shed some light on this HACMP failover?
Operating Systems, AIX. Post 302096203 by Wez on Tuesday 14th of November 2006 11:05:34 AM

Hello All,

Here is a snippet from our cluster.log. I was wondering if anyone could shed some light on what may have caused the failover.

The first two lines indicate a possible memory issue, which I am currently looking into.

Quote:

Nov 7 16:30:21 server_01 grpsvcs[16000]: (Recorded using libct_ffdc.a cv 2):::Error ID: 6xYcC4/BO8I3/c2C/4Im5t....................:::Reference ID: :::Template ID: 463a893d:::Details File: :::Location: RSCT,pgsd.C,1.51,195 :::GS_ERROR_ER Internal logic error in Group Services daemon DIAGNOSTIC EXPLANATION Memory allocation failed. Please check the memory availability.
Nov 7 16:30:21 server_01 grpsvcs[16000]: (Recorded using libct_ffdc.a cv 2):::Error ID: 6xYcC4/BO8I3/Ysc/4Im5t....................:::Reference ID: :::Template ID: 463a893d:::Details File: :::Location: RSCT,pgsd.C,1.51,195 :::GS_ERROR_ER Internal logic error in Group Services daemon DIAGNOSTIC EXPLANATION Memory allocation failed. Please check the memory availability.
Nov 7 16:32:10 server_01 clstrmgrES[17318]: Tue Nov 7 16:32:10 SendInfoBcast: ha_gs_send_message() failed rc=1
Nov 7 16:32:10 server_01 clstrmgrES[17318]: Tue Nov 7 16:32:10 clstrmgr on node 1 is exiting with code 4
Nov 7 16:32:10 server_01 haemd[16528]: LPP=PSSP,Fn=emd_gsi.c,SID=1.4.1.33,L#=1361, haemd: 2521-032 Cannot dispatch group services (1).
Nov 7 16:32:11 server_01 clsmuxpdES[17574]: clRGInfoGetRGHandle() failed, error: : The system call does not exist on this system.
Nov 7 16:32:11 server_01 clsmuxpdES[17574]: Error from ha_em_receive_response(): EMAPI error number 10 EMAPI error message 2521-649 An attempt to receive a command response was unsuccessful; read() detected end-of-file; connection with Event Manager lost. : The system call does not exist on this system.
Nov 7 16:32:11 server_01 clsmuxpdES[17574]: Event Manager API Disconnected:: The system call does not exist on this system.
Nov 7 16:32:11 server_01 snmpd[14998]: NOTICE: SMUX packet from (127.0.0.1+32771+1)
Nov 7 16:32:11 server_01 snmpd[14998]: NOTICE: SMUX trap: (6 10) (127.0.0.1+32771+1)
Nov 7 16:32:11 server_01 snmpd[14998]: NOTICE: SMUX packet from (127.0.0.1+32771+1)
Nov 7 16:32:11 server_01 snmpd[14998]: NOTICE: SMUX trap: (6 11) (127.0.0.1+32771+1)
Nov 7 16:32:12 server_01 snmpd[14998]: NOTICE: SMUX packet from (127.0.0.1+32771+1)
Nov 7 16:32:12 server_01 snmpd[14998]: NOTICE: SMUX trap: (6 15) (127.0.0.1+32771+1)
Nov 7 16:32:12 server_01 HACMP for AIX: clexit.rc : Unexpected termination of clstrmgrES.
Nov 7 16:32:12 server_01 HACMP for AIX: clexit.rc : Halting system immediately!!!
Nov 7 17:29:19 server_01 RMCdaemon[11610]: (Recorded using libct_ffdc.a cv 2):::Error ID: 6eKora0TF9I3/6V2/4Im5t....................:::Reference ID: :::Template ID: a6df45aa:::Details File: :::Location: RSCT,rmcd.c,1.34,196 :::RMCD_INFO_0_ST The daemon is started.
Nov 7 17:29:19 server_01 ctcasd[11870]: (Recorded using libct_ffdc.a cv 2):::Error ID: 6YzeY.1TF9I3/UeV/4Im5t....................:::Reference ID: :::Template ID: c092afe4:::Details File: :::Location: rsct.core.sec,ctcas_main.c,1.13,295 :::ctcasd Daemon Started
Thanks.
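
In case it helps anyone reading the excerpt, here is a rough Python sketch for lining the entries up by time and daemon. It assumes the usual syslog layout seen in the quote (month, day, time, host, tag with an optional [pid], then the message). The year is not in the log, so 2006 is assumed, and the names parse_line and find_gaps and the 5-minute threshold are made up for illustration. The sample lines are copied from the quote above.

```python
import re
from datetime import datetime, timedelta
from typing import List, NamedTuple, Optional, Tuple

MONTHS = {name: num for num, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# month day hh:mm:ss host tag[pid]: message   (the [pid] part is optional)
LINE_RE = re.compile(
    r"^(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2})\s+"
    r"(?P<host>\S+)\s+"
    r"(?P<tag>[^:\[]+)(?:\[(?P<pid>\d+)\])?:\s*(?P<msg>.*)$"
)


class Event(NamedTuple):
    when: datetime
    host: str
    tag: str
    pid: Optional[int]
    msg: str


def parse_line(line: str, year: int = 2006) -> Optional[Event]:
    """Parse one log line; return None if it does not match the layout.

    The year is an assumption, since syslog timestamps omit it.
    """
    m = LINE_RE.match(line.strip())
    if m is None or m["mon"] not in MONTHS:
        return None
    when = datetime(year, MONTHS[m["mon"]], int(m["day"]),
                    int(m["h"]), int(m["m"]), int(m["s"]))
    pid = int(m["pid"]) if m["pid"] else None
    return Event(when, m["host"], m["tag"].strip(), pid, m["msg"])


def find_gaps(events: List[Event],
              threshold: timedelta = timedelta(minutes=5)
              ) -> List[Tuple[Event, Event]]:
    """Return (before, after) pairs where the log was silent longer than threshold."""
    return [(a, b) for a, b in zip(events, events[1:])
            if b.when - a.when > threshold]


SAMPLE = """\
Nov 7 16:32:10 server_01 clstrmgrES[17318]: Tue Nov 7 16:32:10 SendInfoBcast: ha_gs_send_message() failed rc=1
Nov 7 16:32:10 server_01 clstrmgrES[17318]: Tue Nov 7 16:32:10 clstrmgr on node 1 is exiting with code 4
Nov 7 16:32:12 server_01 HACMP for AIX: clexit.rc : Halting system immediately!!!
Nov 7 17:29:19 server_01 ctcasd[11870]: (Recorded using libct_ffdc.a cv 2):::Error ID: 6YzeY.1TF9I3/UeV/4Im5t....................:::Reference ID: :::Template ID: c092afe4:::Details File: :::Location: rsct.core.sec,ctcas_main.c,1.13,295 :::ctcasd Daemon Started
"""

events = [e for e in map(parse_line, SAMPLE.splitlines()) if e is not None]
gaps = find_gaps(events)

for before, after in gaps:
    # A long silence after a halt message most likely brackets the reboot.
    print(f"{before.when:%H:%M:%S} {before.tag} -> "
          f"{after.when:%H:%M:%S} {after.tag} ({after.when - before.when})")
```

Run against the full cluster.log, the same loop should make it easier to see which daemon logged first before clstrmgrES exited, and where the node went quiet before it came back up.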
 

scds_fm_sleep(3HA)					 Sun Cluster HA and Data Services					scds_fm_sleep(3HA)

NAME
     scds_fm_sleep - wait for a message on a fault monitor control socket

SYNOPSIS
     cc [flags...] -I /usr/cluster/include file -L /usr/cluster/lib -l dsdev
     #include <rgm/libdsdev.h>

     scha_err_t scds_fm_sleep(scds_handle_t handle, time_t timeout);

DESCRIPTION
     The scds_fm_sleep() function waits for a data service application process tree that is running under control of the process monitor facility to die. If no such death occurs within the specified timeout period, the function returns SCHA_ERR_NOERR.

     If a data service application process tree death occurs, scds_fm_sleep() records SCDS_COMPLETE_FAILURE in the failure history and either restarts the process tree or fails it over according to the algorithm described in the scds_fm_action(3HA) man page. If a failover attempt is unsuccessful, a restart of the application is attempted. If an attempted restart fails, the function returns SCHA_ERR_INTERNAL.

     Note that if the failure history causes this function to do a failover, and the failover attempt succeeds, scds_fm_sleep() never returns.

PARAMETERS
     The following parameters are supported:

     handle     The handle returned from scds_initialize(3HA).

     timeout    The timeout period measured in seconds.

RETURN VALUES
     The scds_fm_sleep() function returns the following:

     0          The function succeeded.

     nonzero    The function failed.

ERRORS
     SCHA_ERR_NOERR       Indicates that the process tree has not died.

     SCHA_ERR_INTERNAL    Indicates that the data service application process tree has died and failed to restart.

     Other values         Indicate the function failed. See scha_calls(3HA) for the meaning of failure codes.

FILES
     /usr/cluster/include/rgm/libdsdev.h    Include file

     /usr/cluster/lib/libdsdev.so           Library

ATTRIBUTES
     See attributes(5) for descriptions of the following attributes:

     +-----------------------------+-----------------------------+
     |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
     +-----------------------------+-----------------------------+
     |Availability                 |SUNWscdev                    |
     +-----------------------------+-----------------------------+
     |Interface Stability          |Evolving                     |
     +-----------------------------+-----------------------------+

SEE ALSO
     scha_calls(3HA), scds_fm_action(3HA), scds_initialize(3HA), attributes(5)

Sun Cluster 3.2                    7 Sep 2007                    scds_fm_sleep(3HA)