The UNIX and Linux Forums  

  #5 (permalink)  
Old 06-26-2005
TioTony, Forum Advisor
Bit Pusher
  
 

Join Date: Oct 2001
Location: Southern California
Posts: 332
I had a similar problem, but it was RHAS 2.1 and RHAS 3.0 on the same system. 2.1 would just die every once in a while, but 3.0 was fine. It was caused by the hangcheck timer (or watchdog/softdog). Heavy disk I/O was causing the timer to fail to check in, which would cause the system to reboot. Because it was a hard reboot, syslogd wouldn't have time to write to the logs under /var/log, so it took a while to figure out what the problem was. I only caught it once we had built the RHAS 2.1 cluster with the machine: the other node would have STONITH messages in its log. I guess the end of this story is to do an 'lsmod' and see if you have any similar modules installed that might react this way.
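
For what it's worth, here is a rough sketch of that 'lsmod' check. The function name find_watchdog_modules and the name patterns (hangcheck, softdog, wdt, watchdog) are my own guesses at what such modules tend to be called, not a complete list, so adjust them for your kernel.

```shell
#!/bin/sh
# Filter `lsmod`-style output for modules that look like watchdog timers.
# The patterns below are assumptions, not an exhaustive list.
find_watchdog_modules() {
    # Reads lsmod-formatted text on stdin, skips the header line,
    # and prints the name (first column) of every matching module.
    awk 'NR > 1 && tolower($1) ~ /hangcheck|softdog|wdt|watchdog/ { print $1 }'
}

# Run against the live module list when lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    lsmod | find_watchdog_modules
fi
```

If it prints nothing, no loaded module matched those patterns, which doesn't rule out a hardware watchdog or one compiled into the kernel.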