Check hung process and restart

 
# 8  
Old 05-03-2012
What does the status option do? Can you post the networker stop / start script?
# 9  
Old 05-03-2012
@Scrutinizer

Code:
[root@H99 bin]# /etc/rc.d/init.d/networker start

[root@H99 bin]# /etc/rc.d/init.d/networker status
+--o nsrexecd (10762)

[root@H99 bin]# ps auxw | grep nsrexecd
root     10762  0.1  0.0 219884  8436 ?        Ssl  14:26   0:00 /usr/sbin/nsrexecd
root     11002  0.0  0.0  62924   776 pts/3    S+   14:26   0:00 grep nsrexecd

[root@H99 bin]# ps auxw | grep db2vend
db2s12     807  0.0  0.0 28549844 57192 ?      S    12:10   0:01 db2vend (db2logmgr.meth125S12))
root     11302  0.0  0.0  62928   784 pts/3    S+   14:34   0:00 grep db2vend
db2s12   30835  0.0  0.0 292396 49596 ?        S    11:25   0:00 db2vend (PD Vendor Process - 1)  

[root@H99 bin]# /etc/rc.d/init.d/networker stop
[root@H99 bin]# /etc/rc.d/init.d/networker status
nsr_shutdown: There are currently no running NetWorker processes.

And here's the script:

Code:
[root@H99A100 bin]# more /etc/rc.d/init.d/networker
#! /bin/sh

# Copyright (c) 1990-2011, EMC Corporation 

# All rights reserved.

# chkconfig: 35 95 05
# description: EMC Networker. A backup and restoration software package.

### BEGIN INIT INFO
# Provides: networker
# Required-Start: syslog network
# Required-Stop: syslog network
# X-UnitedLinux-Should-Start: portmap
# Should-Start: portmap
# Default-Start: 3 5
# Default-Stop: 0 1 2 6
# Description: EMC Networker. A backup and restoration software package.
### END INIT INFO

case $1 in
    start)
        (echo 'starting NetWorker daemons:') > /dev/console
        LD_LIBRARY_PATH=/usr/lib/nsr/lib64:$LD_LIBRARY_PATH
        export LD_LIBRARY_PATH
        if [ -f /usr/sbin/nsrexecd ]; then
                if [ -f /usr/sbin/NetWorker.clustersvr ]; then
                        if [ -f /nsr.NetWorker.local -o \
                            -h /nsr.NetWorker.local ]; then
                                if [ -h /nsr ]; then
                                        rm -f /nsr
                                        ln -s /nsr.NetWorker.local /nsr
                                fi
                        fi
                fi
                (/usr/sbin/nsrexecd) 2>&1 | /usr/bin/tee /dev/console
                (echo ' nsrexecd') > /dev/console
        fi
        if [ -f /usr/sbin/lgtolmd ]; then
                (/usr/sbin/lgtolmd -p /nsr/lic -n 1) 2>&1 | \
                        /usr/bin/tee /dev/console
                (echo ' lgtolmd') > /dev/console
        fi
        if [ -f /usr/sbin/nsrd -a \
             ! -f /usr/sbin/NetWorker.clustersvr ]; then
                (/usr/sbin/nsrd) 2>&1 | /usr/bin/tee /dev/console
                (echo ' nsrd') > /dev/console
        fi
        ;;
    stop)
        (echo 'stopping NetWorker daemons:') > /dev/console
        if [ -f /usr/sbin/nsr_shutdown ]; then
                if [ -f /usr/sbin/NetWorker.clustersvr ]; then
                        (/usr/sbin/nsr_shutdown -q) 2>&1 | \
                                /usr/bin/tee /dev/console
                        (echo ' nsr_shutdown -q') > /dev/console
                else
                        (/usr/sbin/nsr_shutdown -q) 2>&1 | \
                                /usr/bin/tee /dev/console
                        (echo ' nsr_shutdown -q') > /dev/console
                fi
        fi
        ;;
    status)
        if [ -f /usr/sbin/nsr_shutdown ]; then
                /usr/sbin/nsr_shutdown -l
        fi
        ;;
    *)
        echo "usage: `basename $0` {start|stop|status}"
        ;;
esac


Last edited by vbe; 05-03-2012 at 10:50 AM..
# 10  
Old 05-03-2012
Quote:
Originally Posted by hedkandi
Is it a good idea to stop & start it at a regular interval instead?
But how do I check with command if the process is hung?
1. Depends on the operating characteristics of the program. Drop a stop/start script in cron.daily/ and go from there.

2. Again, how do you characterize when it is hung? That is, what symptoms indicate to you that it is hung?
# 11  
Old 05-03-2012
I meant the content of the start/stop script...
# 12  
Old 05-03-2012
Quote:
Originally Posted by Scrutinizer
I meant the content of the start/stop script...
I just added it...

---------- Post updated at 01:56 PM ---------- Previous update was at 01:39 PM ----------

Quote:
Originally Posted by otheus
1. Depends on the operating characteristics of the program. Drop a stop/start script in cron.daily/ and go from there.

2. Again, how do you characterize when it is hung? That is, what symptoms indicate to you that it is hung?
Well, going by the screenshot I attached earlier, the process had been running since the 1st of May. That didn't look right when we grepped for it, because we had expected it to run and complete on the 1st itself. That is why NetWorker was restarted today at around 11 am.

I'm with you on the cron job. The idea is appealing to me more and more, and I spoke to the backup chap about it, so we will definitely look into implementing it on the 3 servers.
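
The symptom described here (a process still running long after it should have finished) can be checked from a script rather than by eye. The following is only a minimal sketch under stated assumptions: the process name `db2vend` comes from the `ps` output earlier in the thread, but the one-day threshold `MAX_AGE` and the helper names `older_than` and `list_ages` are made up for illustration, and the `etimes` output column needs a reasonably recent procps `ps` (older systems only offer `etime`).

```shell
#!/bin/bash
# Sketch: report processes of a given name that have run longer than a limit.
# MAX_AGE and the helper names are illustrative, not taken from the thread.
PROCESS=${PROCESS:-db2vend}
MAX_AGE=${MAX_AGE:-86400}   # seconds; assumed threshold of one day

# Read "pid elapsed_seconds" lines on stdin and print only those
# whose elapsed time exceeds the limit given as the first argument.
older_than() {
  awk -v max="$1" '$2 > max { print $1, $2 }'
}

# Print "pid elapsed_seconds" for every process with the given command name.
# Assumption: this ps supports the "etimes" column.
list_ages() {
  ps -C "$1" -o pid=,etimes=
}

list_ages "$PROCESS" | older_than "$MAX_AGE" |
while read -r pid age; do
  echo "$PROCESS pid $pid has been running for $age seconds"
done
```

Age alone does not prove a hang, so a report like this is probably best treated as a prompt to look closer, not as an automatic trigger for a restart.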
# 13  
Old 05-03-2012
Code:
/etc/rc.d/init.d/networker status

appears to just list the processes. Probably the best thing to do in your script is to just issue a
Code:
/etc/rc.d/init.d/networker stop
/etc/rc.d/init.d/networker start

If that does not work, then perhaps there is a -f option to /usr/sbin/nsr_shutdown. Probably this is better than kill. Consult your manual and/or your Networker support organization.
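
A stop followed immediately by a start can race with a daemon that is slow to exit. As a rough sketch of the stop/start idea above, the script below waits for `nsrexecd` to disappear before starting again. The init script path and process name are from this thread; the 30-second grace period `WAIT_SECS`, the function names, and the `run` argument are assumptions for illustration only.

```shell
#!/bin/bash
# Sketch: stop NetWorker, wait for the daemon to exit, then start it again.
INIT=${INIT:-/etc/rc.d/init.d/networker}   # init script path from the thread
PROCESS=${PROCESS:-nsrexecd}
WAIT_SECS=${WAIT_SECS:-30}                 # assumed grace period in seconds

# Return 0 once no process named $1 remains, or 1 after about $2 seconds.
wait_for_exit() {
  remaining=$2
  while pgrep -x "$1" > /dev/null; do
    [ "$remaining" -le 0 ] && return 1
    sleep 1
    remaining=$((remaining - 1))
  done
  return 0
}

# Stop, confirm the daemon is gone, and only then start again.
restart() {
  "$INIT" stop
  if wait_for_exit "$PROCESS" "$WAIT_SECS"; then
    "$INIT" start
  else
    echo "$PROCESS still running after $WAIT_SECS seconds; not starting" >&2
    return 1
  fi
}

# Only act when explicitly asked, e.g. "restart-networker.sh run"
# (the script name is hypothetical).
if [ "${1:-}" = "run" ]; then
  restart
fi
```

If the daemon never exits within the grace period, the script deliberately does nothing further; whether to escalate is a question for the NetWorker manual or support, as noted above.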
# 14  
Old 05-03-2012
Hi @scrutinizer, otheus

Thank you so much for your help.

I am putting this in a cron job:

Code:
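#!/bin/bash
# Start NetWorker if nsrexecd is not running; log the outcome either way.
# Note: this detects a missing process only, not one that is running but hung.
STOPCMD='service networker stop'    # currently unused
STARTCMD='service networker start'
PROCESS='nsrexecd'

# pgrep -x matches the exact process name, so no "grep -v grep" is needed.
if pgrep -x "$PROCESS" > /dev/null
then
  echo "$(date) Process Networker is running" >>/var/log/messages
else
  echo "$(date) Process Networker not running and will be started" >>/var/log/messages
  $STARTCMD
fi
exit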
