

Linux heartbeat on redhat 4:node dead


 
# 1  
Old 04-23-2009

Hi.
I have started heartbeat on two Red Hat servers, using eth0 on both.
Before I start heartbeat, the two servers can ping each other.
Once I start heartbeat, both servers become active, as they both log warnings that the other node is dead.
The servers can also no longer ping each other. After I stop heartbeat, ping works again.

My configuration files are below.
For server1 (192.168.10.43; its router is 192.168.10.5):
haresources
-----------
ems1 192.168.20.163/24/etho0 netc
ha.cf
------
# /etc/ha.d/ha.cf

# File to write debug messages to
debugfile /var/log/ha-debug

# File to write log messages to
logfile /var/log/ha-log

# Facility to use for syslog()/logger
logfacility local0

# Heartbeat timers
# Refer to the heartbeat FAQ for how to use these timers.
keepalive 2
deadtime 30
warntime 10
initdead 30

# UDP port used for bcast/ucast communication!
udpport 694

# Interface to broadcast heartbeats over
bcast eth0

# Specify the eth1's IP Address of the other machine EMS server
ucast eth0 192.168.20.162

# Enable automatic failback.
auto_failback on

# Node name in the cluster. Node name must match uname -n
node ems1
node ems2

# Enter a reliable IP address. For ex IP address of the router.
ping 192.168.10.5

# Less common options.
apiauth ipfail uid=hacluster
apiauth ccm uid=hacluster
apiauth ping gid=haclient uid=root
apiauth default gid=haclient
msgfmt netstring
For server2 (192.168.20.162; its router is 192.168.20.1):
haresources
-----------
ems1 192.168.20.163/24/eth0 netc
ha.cf
------
# /etc/ha.d/ha.cf

# File to write debug messages to
debugfile /var/log/ha-debug

# File to write log messages to
logfile /var/log/ha-log

# Facility to use for syslog()/logger
logfacility local0

# Heartbeat timers
# Refer to the heartbeat FAQ for how to use these timers.
keepalive 2
deadtime 30
warntime 10
initdead 30

# UDP port used for bcast/ucast communication!
udpport 694

# Interface to broadcast heartbeats over
bcast eth0

# Specify the eth1's IP Address of the other machine EMS server
ucast eth0 192.168.10.43

# Enable automatic failback.
auto_failback on

# Node name in the cluster. Node name must match uname -n
node ems1
node ems2

# Enter a reliable IP address. For ex IP address of the router.
ping 192.168.20.1

# Less common options.
apiauth ipfail uid=hacluster
apiauth ccm uid=hacluster
apiauth ping gid=haclient uid=root
apiauth default gid=haclient
msgfmt netstring
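
One thing that stands out in the two files is that the servers are on different 192.168.x networks, while the shared address in haresources is 192.168.20.163/24. The minimal Python sketch below only checks which network each address falls in. It is not part of the heartbeat setup, and the /24 mask for the two server addresses is an assumption, since only the shared address is posted with a mask. If the assumption holds, the result may be worth considering, because broadcast heartbeats (bcast eth0) are normally not forwarded by routers, and the shared address would not belong to server1's own network.

```python
import ipaddress

# Assumption: both servers use a /24 mask; the post does not show their masks.
ASSUMED_PREFIX = 24

nodes = {
    "server1": ipaddress.ip_interface(f"192.168.10.43/{ASSUMED_PREFIX}"),
    "server2": ipaddress.ip_interface(f"192.168.20.162/{ASSUMED_PREFIX}"),
}

# Shared (virtual) address exactly as given in haresources.
vip = ipaddress.ip_interface("192.168.20.163/24")

# True only if both servers sit in the same network.
same_subnet = nodes["server1"].network == nodes["server2"].network
print("servers share a subnet:", same_subnet)

# Report, per server, whether the shared address lies inside its network.
for name, iface in nodes.items():
    print(name, iface.network, "contains shared IP:", vip.ip in iface.network)
```

Under the /24 assumption this reports that the servers do not share a subnet and that the shared address lies only in server2's network.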

And here are the logs for server1:
heartbeat: 2009/04/23_04:52:28 info: Configuration validated. Starting heartbeat 1.2.3.cvs.20050404
heartbeat: 2009/04/23_04:52:28 info: heartbeat: version 1.2.3.cvs.20050404
heartbeat: 2009/04/23_04:52:28 info: Heartbeat generation: 16
heartbeat: 2009/04/23_04:52:28 info: UDP Broadcast heartbeat started on port 694 (694) interface eth0
heartbeat: 2009/04/23_04:52:28 info: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
heartbeat: 2009/04/23_04:52:28 info: ucast: bound send socket to device: eth0
heartbeat: 2009/04/23_04:52:28 info: ucast: bound receive socket to device: eth0
heartbeat: 2009/04/23_04:52:28 info: ucast: started on port 694 interface eth0 to 192.168.20.162
heartbeat: 2009/04/23_04:52:28 info: ping heartbeat started.
heartbeat: 2009/04/23_04:52:28 info: pid 18143 locked in memory.
heartbeat: 2009/04/23_04:52:28 info: Local status now set to: 'up'
heartbeat: 2009/04/23_04:52:29 info: pid 18146 locked in memory.
heartbeat: 2009/04/23_04:52:29 info: pid 18152 locked in memory.
heartbeat: 2009/04/23_04:52:29 info: pid 18148 locked in memory.
heartbeat: 2009/04/23_04:52:29 info: pid 18151 locked in memory.
heartbeat: 2009/04/23_04:52:29 info: pid 18150 locked in memory.
heartbeat: 2009/04/23_04:52:29 info: pid 18147 locked in memory.
heartbeat: 2009/04/23_04:52:29 info: Link ems1:eth0 up.
heartbeat: 2009/04/23_04:52:29 info: pid 18149 locked in memory.
heartbeat: 2009/04/23_04:52:31 info: Link ems2:eth0 up.
heartbeat: 2009/04/23_04:52:31 info: Status update for node ems2: status up
heartbeat: 2009/04/23_04:52:31 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2009/04/23_04:53:00 WARN: node 192.168.10.5: is dead
heartbeat: 2009/04/23_04:53:00 info: Local status now set to: 'active'
heartbeat: 2009/04/23_04:53:00 info: Status update for node ems2: status active
heartbeat: 2009/04/23_04:53:00 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2009/04/23_04:53:00 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2009/04/23_04:53:12 info: local resource transition completed.
heartbeat: 2009/04/23_04:53:12 info: Initial resource acquisition complete (T_RESOURCES(us))
heartbeat: 2009/04/23_04:53:12 info: remote resource transition completed.
heartbeat: 2009/04/23_04:53:12 info: Local Resource acquisition completed.
heartbeat: 2009/04/23_04:53:12 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
heartbeat: 2009/04/23_04:53:12 received ip-request-resp 192.168.20.163/24/eth0/192.168.20.255 OK yes
heartbeat: 2009/04/23_04:53:12 info: Acquiring resource group: ems1 192.168.20.163/24/eth0/192.168.20.255 netc
heartbeat: 2009/04/23_04:53:13 info: Running /etc/ha.d/resource.d/IPaddr 192.168.20.163/24/eth0/192.168.20.255 start
heartbeat: 2009/04/23_04:53:13 info: /sbin/ifconfig eth0:0 192.168.20.163 netmask 255.255.255.0 broadcast 192.168.20.255
heartbeat: 2009/04/23_04:53:13 info: Sending Gratuitous Arp for 192.168.20.163 on eth0:0 [eth0]
heartbeat: 2009/04/23_04:53:13 /usr/lib/heartbeat/send_arp -i 1010 -r 5 -p /var/lib/heartbeat/rsctmp/send_arp/send_arp-192.168.20.163 eth0 192.168.20.163 auto 192.168.20.163 ffffffffffff
heartbeat: 2009/04/23_04:53:13 info: Running /etc/init.d/netc start
heartbeat: 2009/04/23_04:53:46 ERROR: Both machines own our resources!
heartbeat: 2009/04/23_04:53:47 ERROR: Both machines own our resources!

I also don't understand why I am getting these errors at the end: ERROR: Both machines own our resources!

Thanks,
Amrita
# 2  
Old 04-23-2009
Hi,
Here are the logs for server2 as well:
heartbeat: 2009/04/23_03:54:47 info: Configuration validated. Starting heartbeat 1.2.3.cvs.20050404
heartbeat: 2009/04/23_03:54:47 info: heartbeat: version 1.2.3.cvs.20050404
heartbeat: 2009/04/23_03:54:47 info: Heartbeat generation: 14
heartbeat: 2009/04/23_03:54:47 info: UDP Broadcast heartbeat started on port 694 (694) interface eth0
heartbeat: 2009/04/23_03:54:47 info: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
heartbeat: 2009/04/23_03:54:47 info: ucast: bound send socket to device: eth0
heartbeat: 2009/04/23_03:54:47 info: ucast: bound receive socket to device: eth0
heartbeat: 2009/04/23_03:54:47 info: ucast: started on port 694 interface eth0 to 192.168.10.43
heartbeat: 2009/04/23_03:54:47 info: ping heartbeat started.
heartbeat: 2009/04/23_03:54:47 info: pid 13477 locked in memory.
heartbeat: 2009/04/23_03:54:47 info: Local status now set to: 'up'
heartbeat: 2009/04/23_03:54:48 info: pid 13480 locked in memory.
heartbeat: 2009/04/23_03:54:48 info: pid 13483 locked in memory.
heartbeat: 2009/04/23_03:54:48 info: pid 13482 locked in memory.
heartbeat: 2009/04/23_03:54:48 info: pid 13484 locked in memory.
heartbeat: 2009/04/23_03:54:48 info: pid 13481 locked in memory.
heartbeat: 2009/04/23_03:54:48 info: Link ems1:eth0 up.
heartbeat: 2009/04/23_03:54:48 info: Status update for node ems1: status up
heartbeat: 2009/04/23_03:54:48 info: Link ems2:eth0 up.
heartbeat: 2009/04/23_03:54:48 info: pid 13485 locked in memory.
heartbeat: 2009/04/23_03:54:48 info: pid 13486 locked in memory.
heartbeat: 2009/04/23_03:54:48 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2009/04/23_03:55:17 WARN: node 192.168.20.1: is dead
heartbeat: 2009/04/23_03:55:17 info: Local status now set to: 'active'
heartbeat: 2009/04/23_03:55:17 info: Status update for node ems1: status active
heartbeat: 2009/04/23_03:55:17 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2009/04/23_03:55:17 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2009/04/23_03:55:29 info: local resource transition completed.
heartbeat: 2009/04/23_03:55:29 info: Initial resource acquisition complete (T_RESOURCES(us))
heartbeat: 2009/04/23_03:55:29 info: remote resource transition completed.
heartbeat: 2009/04/23_03:55:29 info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys ems2] to acquire.
heartbeat: 2009/04/23_03:56:03 WARN: node ems1: is dead
heartbeat: 2009/04/23_03:56:03 WARN: No STONITH device configured.
heartbeat: 2009/04/23_03:56:03 WARN: Shared disks are not protected.
heartbeat: 2009/04/23_03:56:03 info: Resources being acquired from ems1.
heartbeat: 2009/04/23_03:56:03 info: Link ems1:eth0 dead.
heartbeat: 2009/04/23_03:56:03 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2009/04/23_03:56:03 info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys ems2] to acquire.
heartbeat: 2009/04/23_03:56:03 info: Taking over resource group 192.168.20.163/24/eth0
heartbeat: 2009/04/23_03:56:03 info: Acquiring resource group: ems1 192.168.20.163/24/eth0 netc
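
To relate the warnings in logs like these to the timers in ha.cf (deadtime 30, initdead 30), it can help to see how many seconds after startup each WARN or ERROR line appears. The sketch below is a hypothetical helper, not a heartbeat tool: the function name and the regular expression are illustrative and assume the line format shown in the logs above. The sample contains three lines copied from the server2 log.

```python
import re
from datetime import datetime

# Matches lines such as: "heartbeat: 2009/04/23_03:55:17 WARN: node ems1: is dead"
LINE_RE = re.compile(
    r"^heartbeat: (\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2}) (\w+): (.*)$"
)

def problems_with_elapsed(log_text):
    """Return (seconds_since_first_log_line, level, message) for WARN/ERROR lines."""
    start = None
    found = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines without a "level:" field
        ts = datetime.strptime(m.group(1), "%Y/%m/%d_%H:%M:%S")
        if start is None:
            start = ts  # first parsed line is treated as startup
        if m.group(2) in ("WARN", "ERROR"):
            elapsed = int((ts - start).total_seconds())
            found.append((elapsed, m.group(2), m.group(3)))
    return found

SAMPLE = """\
heartbeat: 2009/04/23_03:54:47 info: Configuration validated. Starting heartbeat 1.2.3.cvs.20050404
heartbeat: 2009/04/23_03:55:17 WARN: node 192.168.20.1: is dead
heartbeat: 2009/04/23_03:56:03 WARN: node ems1: is dead
"""

for elapsed, level, msg in problems_with_elapsed(SAMPLE):
    print(f"+{elapsed}s {level}: {msg}")
```

For these three lines, the ping node is reported dead 30 seconds after startup and ems1 76 seconds after startup.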


So both servers are taking over the resources.
Can someone please help me understand why the servers stop seeing each other after heartbeat starts?
Thanks,
Amrita