[Solved] Unable to mailx new $pid for a script restart


 
# 1  
Old 06-06-2013

I'll try to make this brief:

I am trying to get the script below to run another script defined as BATNAM.
The script runs fine and does what I designed it to do, however...

I would like it to mailx the NEW $pid that was restarted.

This script is supposed to go in root's crontab and run every minute, checking whether $pid exists.

Lastly, should I try a different approach, maybe a "while" or "until" rather than "if/then/else"?

I have tried sleep 10, as it takes more than 7 seconds for the script to show up in ps -ef|grep myusername.

Could you please review the code below and make suggestions?
This is a simple syntax issue and I am unable to find it.

I am on HP-UX with /usr/bin/sh, and I have root if necessary.

thanks!
Code:
#!/usr/bin/sh -x
# set the locals
stty intr '^c'

# set the vars
BATDIR="/usr/script8/batch"             # batch dir
BATNAM="bat_fstsi61c.sh"                # batch process file
BATPF="bat_fstsi61.pf"                  # batch to grep
BATSVC="SIGTEST"                        # mail topic
BATPF="bat_fstsi61.pf"                  # batch to grep
SERVER="PHANTOM"                        # servername here
SDIR="/apps/sigmon/dvl/fst61"           # sigmon dir
SIN="$SDIR/in"                          # in dir
SPROC="$SDIR/proc"                      # proc dir
SLOG="/usr/script8/batch/LOGS"          # log dir
SMAIL="/usr/script8/batch/EMAILS"       # email log dir
SMAILER="petey"                         # person(s) to email

# export the vars
export BATDIR BATNAM BATSVC BATPF SERVER SDIR SIN \
SPROC SLOG SMAIL SMAILER

pid=`ps -ef|grep "$BATPF" |grep -v grep |awk -F" " '{print $2}'`
 echo $pid

if [ "$pid" = "" ]
 then
   /usr/bin/sh $BATDIR/$BATNAM
   sleep 10
   mailx -s "${SERVER} ${BATNAM} restarted" petey
   sleep 10
   echo "$pid" > $SLOG/"$BATNAM-restart-on--`date +%F-%T`"
 else
   echo "service is ok"
   pid=""
fi


Last edited by olyanderson; 06-06-2013 at 04:57 PM.. Reason: got it, thanks. ill use from now one, im new here
# 2  
Old 06-06-2013
The if block where you call mailx only runs when $pid is zero length, and nothing inside that block assigns pid again.
So anything you mail or log from there can only contain an "empty" pid. Help me here: what are you trying to do?
# 3  
Old 06-06-2013

Thanks for the response, I appreciate it.

Well, the first thing is, I check whether the other batch is running.

If it is NOT running, $pid is "" (empty), which means no process matches that grep; THEN restart the process and exit.

If it is running (either the original pid or a new one started moments ago by the restart branch above), THEN simply do nothing: echo "running", no log needed.

If it is NOT running: restart, grab the NEW $pid, email that new pid with mailx, AND echo the restart to the log file.

Everything is a go permissions-wise, and this script does restart the dead script. But it hangs; I want it to ALSO exit with a zero exit status and be rechecked via crontab every minute. This is an HA server and needs constant babying lol. It doesn't cry much, but when it does, people lose jobs.

Thanks for your help again, I appreciate it.
If you need any more input, just ask.
# 4  
Old 06-06-2013
So, does:
Code:
/usr/bin/sh $BATDIR/$BATNAM

just restart the process asynchronously and exit, or is it supposed to run forever? Does the trace of the script produced by sh -x show that the two sleeps, mailx, date, and echo are run when the batch is not running? Should:
Code:
   /usr/bin/sh $BATDIR/$BATNAM
   sleep 10
   mailx -s "${SERVER} ${BATNAM} restarted" petey
   sleep 10
   echo "$pid" > $SLOG/"$BATNAM-restart-on--`date +%F-%T`"

be changed to:
Code:
   /usr/bin/sh $BATDIR/$BATNAM&
   pid=$!
   mailx -s "${SERVER} ${BATNAM} restarted" petey
   echo "$pid" > $SLOG/"$BATNAM-restart-on--`date +%F-%T`"

Why does your script initialize BATPF twice?

Why bother setting pid to an empty string just before exiting when the service is running?
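
A minimal sketch of the `&` / `$!` idea suggested above, assuming the batch script itself is the long-running process (if it starts something else and returns, `$!` will not be the PID you are after). Here `sleep 30` is only a stand-in for `/usr/bin/sh $BATDIR/$BATNAM`, and the `status` variable is a name made up for the demo:

```shell
#!/bin/sh
# Stand-in for: /usr/bin/sh "$BATDIR/$BATNAM" &
sleep 30 &
pid=$!                      # PID of the most recently started background job

# kill -0 sends no signal; it only tests whether the process exists.
if kill -0 "$pid" 2>/dev/null; then
    status="running"
else
    status="not running"
fi
echo "restarted as PID $pid ($status)"

# Cleanup for the demo only; a real watchdog would leave the job running.
kill "$pid"
```

Because the job is started in the background, the calling script continues immediately, so it can mail the PID, write the log entry, and exit while the batch process keeps running.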

Last edited by Don Cragun; 06-06-2013 at 09:20 PM.. Reason: fix auto spell correction glitches
# 5  
Old 06-06-2013
I copied this script from elsewhere; I am learning. I am glad you gave some input, thanks. To answer your questions:

1. Why does your script initialize BATPF twice?
It doesn't; that was a typo and it will be removed. Thanks for noticing that, Don!

2. Yes, it does start a process: another script, a batch script actually, that is supposed to run and accept requests to a client-driven DB. We won't get into that.

3. Why bother setting pid to an empty string just before exiting when the service is running?
I didn't, I do not know what I am doing. I'm trying though.

The sleep is there in the hopes that it will produce a new $pid, that is one thing that is puzzling me the most.

Like I said before, the script runs fine (the one this one calls) and needs to be online 24/7. This script is supposed to CHECK whether ANY pid matches the grep for BATPF. It takes 10 seconds for a new pid to be produced. I am trying to wait/sleep/capture that new pid in the "if".

Is there a better way?

Don, I will try the steps you suggested. Looks like it will work. But remember, I need this script to end. Running it as a background process with & looks great, but will it still take 10 seconds for that new pid to become available? If I start the backend script, it takes a few minutes to produce a pid (the backend is a DB starting with a client), and it usually takes the same amount of time for ps and grep to show the pid is gone, i.e. for that connection to shut down gracefully. The DB is heavily transaction based.

thanks, hope that helps. pls ask anything more. thanks guys...
# 6  
Old 06-07-2013
Quote:
Originally Posted by olyanderson
I copied this script from elsewhere; I am learning. I am glad you gave some input, thanks. To answer your questions:

1. Why does your script initialize BATPF twice?
It doesn't; that was a typo and it will be removed. Thanks for noticing that, Don!
You're welcome.
Quote:
Originally Posted by olyanderson
2. Yes, it does start a process: another script, a batch script actually, that is supposed to run and accept requests to a client-driven DB. We won't get into that.
If you want to fix the problem, we need to get into that. The changes I suggested should work if the script starts the batch script and waits for it to complete. If you had shown us the trace output produced by /usr/bin/sh -x, we might be able to give you a definitive answer; without seeing that output or seeing what is in /usr/script8/batch/bat_fstsi61c.sh, we can only make wild guesses (like I did before).

If /usr/script8/batch/bat_fstsi61c.sh is the batch script and it doesn't return until it is killed, what I suggested should work. If /usr/script8/batch/bat_fstsi61c.sh asynchronously starts the batch script and returns without waiting for it to complete, what I suggested will not work. In that case you'll need to rerun the ps pipeline to reset pid after the batch process is restarted. You wait 20 seconds after /usr/script8/batch/bat_fstsi61c.sh returns (if it returns) before saving $pid in your log file, but you haven't reset pid so you know it has to be an empty string whether or not the batch script restarted successfully.
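
For the second case, the "rerun the ps pipeline" step could be sketched as a polling loop that gives up after a fixed number of attempts. The function names `pid_from_ps` and `wait_for_pid`, and the `PS_CMD` and `POLL_DELAY` variables, are hypothetical and introduced only for illustration; the pipeline itself is the one from the original script.

```shell
#!/bin/sh
# Print the PID column (field 2 of `ps -ef` output) for lines matching $1.
# Reads the ps listing on stdin so it can be tried with canned data.
pid_from_ps() {
    grep -- "$1" | grep -v grep | awk '{print $2}'
}

# Poll until a matching process shows up, or give up after $2 attempts.
# PS_CMD and POLL_DELAY are hypothetical knobs, not part of the original script.
# If several processes match, all of their PIDs are printed, one per line.
wait_for_pid() {
    tries=0
    while [ "$tries" -lt "$2" ]; do
        found=$(${PS_CMD:-ps -ef} | pid_from_ps "$1")
        if [ -n "$found" ]; then
            echo "$found"
            return 0
        fi
        tries=$((tries + 1))
        sleep "${POLL_DELAY:-5}"
    done
    return 1
}
```

In the restart branch this might be used as `pid=$(wait_for_pid "$BATPF" 12)`, which bounds the wait instead of relying on a single fixed `sleep 10`, and leaves a non-zero status to test if the process never appears.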
Quote:
Originally Posted by olyanderson
3. Why bother setting pid to an empty string just before exiting when the service is running?
I didn't, I do not know what I am doing. I'm trying though.

The sleep is there in the hopes that it will produce a new $pid, that is one thing that is puzzling me the most.

Like I said before, the script runs fine (the one this one calls) and needs to be online 24/7. This script is supposed to CHECK whether ANY pid matches the grep for BATPF. It takes 10 seconds for a new pid to be produced. I am trying to wait/sleep/capture that new pid in the "if".

Is there a better way?
There are two scripts. You are talking as though there is only one. Until you understand that there are two and how they interact, we're lost. If you actually mean that the script above is the contents of /usr/script8/batch/bat_fstsi61c.sh, then your description of what is going on is extremely confusing and could be rewritten in a much simpler fashion.
Quote:
Originally Posted by olyanderson
Don, I will try the steps you suggested. Looks like it will work. But remember, I need this script to end. Running it as a background process with & looks great, but will it still take 10 seconds for that new pid to become available? If I start the backend script, it takes a few minutes to produce a pid (the backend is a DB starting with a client), and it usually takes the same amount of time for ps and grep to show the pid is gone, i.e. for that connection to shut down gracefully. The DB is heavily transaction based.

thanks, hope that helps. pls ask anything more. thanks guys...
If the script isn't ending when it restarts the batch process, my guess still fits the data you're seeing. But until you show us the trace output you got from running this script or show us the contents of /usr/script8/batch/bat_fstsi61c.sh, we are just guessing.
# 7  
Old 06-07-2013

Well said, Don, thanks. I like your style.
Here is a rewrite:

I have a script that will not show the current pid.

The script below uses a simple if statement.

The if statement is supposed to restart another process (it does!).

Once the second script is restarted (it had stopped; this script restarts it),
it produces a pid.

It did NOT have a pid, thus it needed to be restarted.
The script below does that.

Problem: I cannot grep/grab/whatever you want to call it, and mailx that new pid.

That's my problem: I can NOT grab the new pid.

I am at work now, and I will try your suggestion:
pid=$!

***sh -x stdout BEFORE i kill pid to simulate it's not running***



Code:
$ sh -x sig-ps-restart7.sh
+ stty intr ^c
+ BATDIR=/usr/script8/batch
+ BATNAM=bat_fstsi61c.sh
+ BATPF=bat_fstsi61.pf
+ BATSVC=SIGTEST
+ BATPF=bat_fstsi61.pf
+ SERVER=PHANTOM
+ SDIR=/apps/sigmon/dvl/fst61
+ SIN=/apps/sigmon/dvl/fst61/in
+ SPROC=/apps/sigmon/dvl/fst61/proc
+ SLOG=/usr/script8/batch/LOGS
+ SMAIL=/usr/script8/batch/EMAILS
+ SMAILER=petey
+ export BATDIR BATNAM BATSVC BATPF SERVER SDIR SIN SPROC SLOG SMAIL SMAILER
+ + ps -ef
+ grep bat_fstsi61.pf
+ grep -v grep
+ awk -F  {print $2}
pid=20599
+ echo 20599
20599
+ [ 20599 =  ]
+ echo service is ok
service is ok
+ pid=

*** sh -x stdout AFTER i kill pid to simulate it's not running ***

First I kill the pid that is currently running:


Code:
kill 20599

Code:
$ sh -x sig-ps-restart7.sh
+ stty intr ^c
+ BATDIR=/usr/script8/batch
+ BATNAM=bat_fstsi61c.sh
+ BATPF=bat_fstsi61.pf
+ BATSVC=SIGTEST
+ BATPF=bat_fstsi61.pf
+ SERVER=PHANTOM
+ SDIR=/apps/sigmon/dvl/fst61
+ SIN=/apps/sigmon/dvl/fst61/in
+ SPROC=/apps/sigmon/dvl/fst61/proc
+ SLOG=/usr/script8/batch/LOGS
+ SMAIL=/usr/script8/batch/EMAILS
+ SMAILER=petey
+ export BATDIR BATNAM BATSVC BATPF SERVER SDIR SIN SPROC SLOG SMAIL SMAILER
+ + ps -ef
+ grep bat_fstsi61.pf
+ grep -v grep
+ awk -F  {print $2}
pid=
+ echo

+ [  =  ]
+ cd /usr/script8/batch
+ pid=3951
+ mailx -s 'PHANTOM' 'bat_fstsi61c.sh' PID '3951' was started petey
+ . bat_fstsi61c.sh
trunc'd......
+ exec /apps/dlc/bin/our-data-base-here -b -p /tmp/job003954 -pf /usr/script8/pf_files/bat_fstsi61.pf

it appears to work for the pid part thanks!!!

BUT... it now needs to close: here is the ps -ef|grep petey


Code:
  petey  3952  3944  0 09:29:49 pts/tt    0:00 mailx -s 'PHANTOM' 'bat_fstsi61c.sh' PID '3951' was started petey
  petey  3944 29979  0 09:29:49 pts/tt    0:00 sh -x sig-ps-restart7.sh
  petey 24556 24555  0  Jun  4  pts/tr    0:00 -sh
  petey  3951  3944  0 09:29:49 pts/tt    0:02 /apps/dlc/bin/our-data-base-here -b -p /tmp/job003954 -pf /usr/script8/pf_files/bat_fstsi61.pf

So now this job will be put into a cron job, IF we can make the mailx actually finish. It seems to be sitting there...

Sorry, I had to leave out the other script for security reasons. Hope you understand.
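
A likely reason for the mailx process sitting there: mailx reads the message body from standard input, so when it is started from a terminal with nothing piped or redirected into it, it waits for typed input until end-of-file. The trace above also shows `-s 'PHANTOM'` followed by separate words, which suggests only `PHANTOM` is taken as the subject and the remaining words are treated as recipients. A minimal sketch of one way to address both, assuming a hypothetical `notify_restart` helper and a hypothetical `MAILER` override (defaulting to mailx) so it can be tried without sending real mail:

```shell
#!/bin/sh
# Send a one-line restart notice.
# Arguments: $1 = server name, $2 = batch name, $3 = new PID, $4 = recipient.
# MAILER is a hypothetical override hook; by default the real mailx is used.
notify_restart() {
    # The body arrives on stdin through the pipe, so mailx does not wait
    # on the terminal; the whole subject is passed as ONE quoted argument.
    echo "$2 restarted on $1 with PID $3" |
        ${MAILER:-mailx} -s "$1 $2 PID $3 was started" "$4"
}
```

In the watchdog this might be called as `notify_restart "$SERVER" "$BATNAM" "$pid" "$SMAILER"`. Redirecting from `/dev/null` instead of piping a body is another option if an empty message is acceptable.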

Last edited by Don Cragun; 06-07-2013 at 02:58 PM.. Reason: Added missing newlines to code segments.