Production server rebooted unexpectedly

# 1  
Old 01-18-2011

I am trying to figure out what might be causing our production server to reboot unexpectedly over the last few months.

Is automatic reboot enabled somewhere? I have checked that the server is not set to reboot automatically on a kernel panic, but are there any other parameters that I am missing?
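For reference, here is a minimal sketch of how the panic setting could be checked from the shell. The function name `panic_reboot_status` and its optional file argument are made up for illustration; `/proc/sys/kernel/panic` is the usual location of the `kernel.panic` sysctl, where 0 means the kernel stays halted after a panic and a positive number is the delay in seconds before it reboots itself.

```shell
# Hypothetical helper: report what kernel.panic is set to.
# The optional argument exists only so the check can be pointed at another file.
panic_reboot_status() {
    file=${1:-/proc/sys/kernel/panic}
    if ! secs=$(cat "$file" 2>/dev/null); then
        echo "cannot read $file"
        return 1
    fi
    if [ "$secs" -gt 0 ] 2>/dev/null; then
        # Positive value: the kernel reboots this many seconds after a panic.
        echo "auto-reboot ${secs}s after a panic"
    elif [ "$secs" = "0" ]; then
        # Zero: the kernel stays halted after a panic.
        echo "no auto-reboot on panic"
    else
        echo "kernel.panic=$secs (unexpected value)"
    fi
}
```

Calling `panic_reboot_status` with no argument reads the live setting. It only covers the panic timeout; `last -x reboot shutdown` may also be worth a look, since a reboot entry with no shutdown entry before it usually points to an unclean restart.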


-bash-2.05b$ uname -a
Linux PD1011 2.4.21-53.ELhugemem #1 SMP Wed Nov 14 03:46:17 EST 2007 i686 i686 i386 GNU/Linux
-bash-2.05b$ cat /proc/version
Linux version 2.4.21-53.ELhugemem (brewbuilder@hs20-bc1-7.build.redhat.com) (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-58)) #1 SMP Wed Nov 14 03:46:17 EST 2007
-bash-2.05b$

-bash-2.05b$ cat /etc/redhat-release
Red Hat Enterprise Linux AS release 3 (Taroon Update 9)

/var/log/messages

messages.4:Dec 19 04:02:02 rc1011 syslogd 1.4.1: restart.
messages.3:Dec 26 04:02:03 rc1011 syslogd 1.4.1: restart.
messages.2:Jan 2 04:02:02 rc1011 syslogd 1.4.1: restart.
messages.1:Jan 9 04:02:03 rc1011 syslogd 1.4.1: restart.
messages.1:Jan 10 03:03:14 rc1011 syslogd 1.4.1: restart.
messages.1:Jan 10 09:40:23 rc1011 syslogd 1.4.1: restart.
messages.1:Jan 12 09:49:45 rc1011 syslogd 1.4.1: restart.
messages.1:Jan 13 05:15:44 rc1011 syslogd 1.4.1: restart.
messages.1:Jan 13 21:48:33 rc1011 syslogd 1.4.1: restart.
messages:Jan 16 04:02:03 rc1011 syslogd 1.4.1: restart.
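The 04:02 entries above all fall on Sundays, which matches the weekly log rotation that cron normally triggers on a default RHEL setup (an assumption about this server's cron and logrotate configuration, not something the logs confirm). A rough sketch for filtering those out, so that only the restarts that probably came from a reboot remain; `unexpected_restarts` is a hypothetical helper name:

```shell
# Hypothetical helper: keep syslogd restart lines that are NOT in the
# 04:02 window where the routine log rotation is assumed to run.
# Input: lines in the same format as the grep output above.
unexpected_restarts() {
    grep 'syslogd .*restart' | grep -v ' 04:02:[0-9][0-9] '
}
```

Usage would be something like `grep 'syslogd .*restart' /var/log/messages* | unexpected_restarts`. With the listing above, that leaves the two Jan 10 entries, the Jan 12 entry and the two Jan 13 entries.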

Detailed logs from before each reboot:

1st unexpected reboot of 2011
Jan 10 02:35:06 PD1011 cfengine:PD1011[14910]: Packages installed: /usr/bin/yum -y install collectd nrpe blip-nagios-plugins-nrpe
Jan 10 02:36:29 PD1011 gpagentd[3386]: SELinux group policy is not applicable for this system...
Jan 10 02:50:02 PD1011 cfengine:PD1011[22545]: Installing package(s) using '/usr/bin/yum -y install collectd nrpe blip-nagios-plugins-nrpe'
Jan 10 02:50:06 PD1011 cfengine:PD1011[22545]: collectd is installed and is the latest version.
Jan 10 02:50:06 PD1011 cfengine:PD1011[22545]: nrpe is installed and is the latest version.
Jan 10 02:50:06 PD1011 cfengine:PD1011[22545]: blip-nagios-plugins-nrpe is installed and is the latest version.
Jan 10 02:50:06 PD1011 cfengine:PD1011[22545]: Gathering header information file(s) from server(s)
Jan 10 02:50:06 PD1011 cfengine:PD1011[22545]: Server: RHEL 3U8 AS
Jan 10 02:50:06 PD1011 cfengine:PD1011[22545]: Server: RHEL 3U8 AS - latest
Jan 10 02:50:06 PD1011 cfengine:PD1011[22545]: Server: GID
Jan 10 02:50:06 PD1011 cfengine:PD1011[22545]: Server: GID - latest
Jan 10 02:50:06 PD1011 cfengine:PD1011[22545]: Finding updated packages
Jan 10 02:50:06 PD1011 cfengine:PD1011[22545]: Downloading needed headers
Jan 10 02:50:06 PD1011 cfengine:PD1011[22545]: No actions to take
Jan 10 02:50:06 PD1011 cfengine:PD1011[22545]: Packages installed: /usr/bin/yum -y install collectd nrpe blip-nagios-plugins-nrpe
Jan 10 03:03:14 PD1011 syslogd 1.4.1: restart.
Jan 10 03:03:14 PD1011 syslog: syslogd startup succeeded
Jan 10 03:03:14 PD1011 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Jan 10 03:03:14 PD1011 syslog: klogd startup succeeded
Jan 10 03:03:14 PD1011 kernel: Linux version 2.4.21-53.ELhugemem (brewbuilder@hs20-bc1-7.build.redhat.com) (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-58)) #1 SMP Wed Nov 14 03:46:17 EST 2007
Jan 10 03:03:14 PD1011 kernel: BIOS-provided physical RAM map:
Jan 10 03:03:14 PD1011 kernel: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
Jan 10 03:03:14 PD1011 kernel: BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
Jan 10 03:03:14 PD1011 kernel: BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
Jan 10 03:03:14 PD1011 kernel: BIOS-e820: 0000000000100000 - 00000000dfff0000 (usable)
Jan 10 03:03:14 PD1011 kernel: BIOS-e820: 00000000dfff0000 - 00000000dffff000 (ACPI data)
Jan 10 03:03:14 PD1011 kernel: BIOS-e820: 00000000dffff000 - 00000000e0000000 (ACPI NVS)
Jan 10 03:03:14 PD1011 kernel: BIOS-e820: 00000000fec00000 - 00000000fec03000 (reserved)
Jan 10 03:03:14 PD1011 kernel: BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
Jan 10 03:03:14 PD1011 kernel: BIOS-e820: 00000000ffe00000 - 0000000100000000 (reserved)
Jan 10 03:03:14 PD1011 kernel: BIOS-e820: 0000000100000000 - 0000000220000000 (usable)
Jan 10 03:03:14 PD1011 kernel: 4768MB HIGHMEM available.
Jan 10 03:03:14 PD1011 kernel: 3936MB LOWMEM available.
Jan 10 03:03:14 PD1011 kernel: found SMP MP-table at 000ff780

2nd unexpected reboot of 2011
Jan 10 09:20:06 PD1011 cfengine:PD1011[5754]: Packages installed: /usr/bin/yum -y install collectd nrpe blip-nagios-plugins-nrpe
Jan 10 09:33:57 PD1011 gpagentd[3386]: SELinux group policy is not applicable for this system...
Jan 10 09:35:02 PD1011 cfengine:PD1011[13545]: Installing package(s) using '/usr/bin/yum -y install collectd nrpe blip-nagios-plugins-nrpe'
Jan 10 09:35:06 PD1011 cfengine:PD1011[13545]: collectd is installed and is the latest version.
Jan 10 09:35:06 PD1011 cfengine:PD1011[13545]: nrpe is installed and is the latest version.
Jan 10 09:35:06 PD1011 cfengine:PD1011[13545]: blip-nagios-plugins-nrpe is installed and is the latest version.
Jan 10 09:35:06 PD1011 cfengine:PD1011[13545]: Gathering header information file(s) from server(s)
Jan 10 09:35:06 PD1011 cfengine:PD1011[13545]: Server: RHEL 3U8 AS
Jan 10 09:35:06 PD1011 cfengine:PD1011[13545]: Server: RHEL 3U8 AS - latest
Jan 10 09:35:06 PD1011 cfengine:PD1011[13545]: Server: GID
Jan 10 09:35:06 PD1011 cfengine:PD1011[13545]: Server: GID - latest
Jan 10 09:35:06 PD1011 cfengine:PD1011[13545]: Finding updated packages
Jan 10 09:35:06 PD1011 cfengine:PD1011[13545]: Downloading needed headers
Jan 10 09:35:06 PD1011 cfengine:PD1011[13545]: No actions to take
Jan 10 09:35:06 PD1011 cfengine:PD1011[13545]: Packages installed: /usr/bin/yum -y install collectd nrpe blip-nagios-plugins-nrpe
Jan 10 09:40:23 PD1011 syslogd 1.4.1: restart.
Jan 10 09:40:23 PD1011 syslog: syslogd startup succeeded
Jan 10 09:40:23 PD1011 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Jan 10 09:40:23 PD1011 kernel: Linux version 2.4.21-53.ELhugemem (brewbuilder@hs20-bc1-7.build.redhat.com) (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-58)) #1 SMP Wed Nov 14 03:46:17 EST 2007
Jan 10 09:40:23 PD1011 kernel: BIOS-provided physical RAM map:
Jan 10 09:40:23 PD1011 kernel: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
Jan 10 09:40:23 PD1011 kernel: BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
Jan 10 09:40:23 PD1011 kernel: BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
Jan 10 09:40:23 PD1011 kernel: BIOS-e820: 0000000000100000 - 00000000dfff0000 (usable)
Jan 10 09:40:23 PD1011 kernel: BIOS-e820: 00000000dfff0000 - 00000000dffff000 (ACPI data)
Jan 10 09:40:23 PD1011 kernel: BIOS-e820: 00000000dffff000 - 00000000e0000000 (ACPI NVS)
Jan 10 09:40:23 PD1011 kernel: BIOS-e820: 00000000fec00000 - 00000000fec03000 (reserved)
Jan 10 09:40:23 PD1011 kernel: BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
Jan 10 09:40:23 PD1011 kernel: BIOS-e820: 00000000ffe00000 - 0000000100000000 (reserved)
Jan 10 09:40:23 PD1011 kernel: BIOS-e820: 0000000100000000 - 0000000220000000 (usable)
Jan 10 09:40:23 PD1011 kernel: 4768MB HIGHMEM available.
Jan 10 09:40:23 PD1011 kernel: 3936MB LOWMEM available.
Jan 10 09:40:23 PD1011 kernel: found SMP MP-table at 000ff780
# 2  
Old 01-18-2011
Bug

It might be a RAM problem. Reseat the RAM modules and check.
# 3  
Old 01-18-2011
I have checked the memory usage, which has been the same for the last year according to the statistics I collected through the Wily monitor.

---------- Post updated 01-19-11 at 07:03 AM ---------- Previous update was 01-18-11 at 07:29 PM ----------

Can someone please reply to this thread? The unexpected server reboots are hurting the business, and this is a production system.
# 4  
Old 01-20-2011
I see the line:
Code:
Packages installed: /usr/bin/yum -y install collectd nrpe blip-nagios-plugins-nrpe

before each server reboot. Did you check your Nagios installation?
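To pull out what was logged right before each boot instead of eyeballing it, something along these lines could help. `before_restart` is a hypothetical helper name, and the pattern assumes the sysklogd "restart." message format seen in the logs above.

```shell
# Hypothetical helper: for every syslogd restart line, also print the
# line that was logged immediately before it.
# Reads the files given as arguments, or stdin if none are given.
before_restart() {
    awk '
        /syslogd [0-9.]+: restart/ {
            if (prev != "") print "before: " prev
            print "restart: " $0
        }
        { prev = $0 }   # remember the current line for the next match
    ' "$@"
}
```

Usage would be `before_restart /var/log/messages.1`. On an orderly shutdown, syslogd normally logs that it is exiting before the next boot starts. In the excerpts posted above, the last entry before each restart is a routine cfengine line from several minutes earlier, which would be more consistent with a hard reset, power loss or panic than with a clean shutdown, assuming nothing was trimmed from the post.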
# 5  
Old 01-20-2011
Why is the nagios package being reinstalled?
Code:
Jan 10 02:35:06 PD1011 cfengine:PD1011[14910]: Packages installed: /usr/bin/yum -y install collectd nrpe blip-nagios-plugins-nrpe
....
....
Jan 10 09:20:06 PD1011 cfengine:PD1011[5754]: Packages installed: /usr/bin/yum -y install collectd nrpe blip-nagios-plugins-nrpe
