We've been having problems with a specific program in our nightly processing, so I whipped up a little script to monitor it and send an e-mail when it completes (whether it failed or not). My main constraint is that I cannot modify the binary or the script that calls it, since the developers probably wouldn't be too happy about that, so the monitor has to be stand-alone. My goal is a reliable, fairly simple, and above all very lightweight script (I don't want to use any more cycles than are absolutely necessary). Here's what I have running right now:
I want the other night operators to be able to run this during the weekend (that's why I wrote the instructions in the mail). Also, since none of them are very Unix-literate (we only have a few Unix servers around; we're mostly NT), I wanted to make it simple to run. Just type name_of_script, and it'll background itself...
My question is: can this be written to use even fewer resources?
Oh yeah, BTW, in case people are wondering:
This is a midrange HP-UX server in a key production environment...
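For readers following along without the original script, here is a minimal sketch of one low-overhead way to wait for a process to finish. It is not the poster's script: the function name `wait_for_pid`, the PID, the 60-second interval, and the mail address are all assumptions made for illustration. The idea is to poll with `kill -0`, which sends no signal and avoids a `ps | grep` pipeline on every pass. One caveat: `kill -0` also fails when the target belongs to another user and the monitor is not root, so in that situation the loop would end immediately.

```shell
#!/bin/sh
# Hypothetical helper: block until the given PID no longer exists.
# Usage: wait_for_pid PID [INTERVAL_SECONDS]
wait_for_pid() {
    pid=$1
    interval=${2:-60}    # assumed default: check once a minute
    # kill -0 delivers no signal; it only reports whether the PID exists
    # and whether we are allowed to signal it.
    while kill -0 "$pid" 2>/dev/null; do
        sleep "$interval"
    done
    return 0
}

# Example (PID 12345 and the address are placeholders), run in the background:
#   ( wait_for_pid 12345 60
#     echo "PID 12345 has exited" | mailx -s "job done" operator@example.com ) &
```

Nearly all of the script's time is spent in `sleep`, so a longer interval lowers the cost further, at the price of a later notification.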
Hmm... Well, the first script works, but when I created another script with a different name that watches a different process, I get this error every time I try to run it:
Then it hangs until I ^C...
I'm not trying to create any file expressly in the script. I also can't figure out that number... The very next command I ran was echo $$, and I got 24704 as my process number.
I also tried putting set -x at the top of the script, but all it points to is something in check_it.
I also tried adding a bunch of echo statements to see what's happening, but I can't see what's trying to write the file.
The best guess I have is mail trying to write a temp file. But in that case, why does one script work when an almost exact copy doesn't?
(btw, I have checked /var/tmp, and it doesn't contain any file named sh*)
It's probably something simple, but I'm too frazzled to figure it out right now...
Put the echo $$ inside the script. It's the script's pid that counts, not the login shell's pid.
That file is being created by the script itself, not the mail program. Is one script root and the other non-root? Check that /var/tmp is writable by everyone. Check that /var/tmp has free space and free inodes. Somehow that second script cannot write to /var/tmp. At least that would be my first guess.
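The checks suggested above can be sketched as a small helper placed in the failing script itself. The function name `check_tmpdir` and the probe file name are assumptions for illustration. The sketch reports the script's own PID and then tests whether a directory is actually writable by the user running it. Only `df -k` is used for space; the option for reporting free inodes differs between platforms, so it is left out.

```shell
#!/bin/sh
# Print the PID of this script (not the login shell), since that is the
# number that appears in the shell's temp file names.
echo "script PID: $$" >&2

# Hypothetical helper: verify that DIR can hold the shell's temp files.
# Usage: check_tmpdir DIR
check_tmpdir() {
    dir=$1
    probe="$dir/probe.$$"    # assumed scratch file name
    if [ ! -d "$dir" ]; then
        echo "$dir: not a directory"
        return 1
    fi
    # Try a real write; permission bits alone can be misleading.
    if ! touch "$probe" 2>/dev/null; then
        echo "$dir: cannot create files as $(id -un)"
        return 1
    fi
    rm -f "$probe"
    echo "$dir: writable by $(id -un)"
    df -k "$dir"             # free space in kilobytes
    return 0
}

# Example: check_tmpdir /var/tmp
```

Running it as each user who starts the monitor shows whether the root versus non-root difference matters.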
After a little more investigation, it looks like we're missing a patch or using a broken shell. I found this info for an HP-UX 10 system... We're using 11.00 though... I guess I have to check with the Unix admin.
PHCO_16063:
1) Posix shell removes heredoc temporary files
before they are read. When scripts like the
following are executed, we see messages like
"/tmp/sh3737.2: Cannot find or open the file."
I tried changing the shell to /usr/bin/ksh, and got a similar error, except it said it was in the bail_out function (above)... that pretty much limits it to mail or the heredoc problem...
I'm going to look into this some more, and try to figure out why one script works well, but another similar one doesn't (on a VERY consistent basis)...
(BTW, I can create files in /tmp and /var/tmp, and there is plenty of space and inodes...)
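If the here-document temp file really is the culprit, one possible workaround (a sketch, not a confirmed fix for this shell) is to build the mail body with plain `echo` statements and pipe it to the mailer, since a pipe needs no temp file. The function name `build_message`, the message layout, and the recipient address below are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write the mail body to stdout without a here-document.
# Usage: build_message STATUS PROCESS_NAME
build_message() {
    status=$1
    process=$2
    echo "Process:  $process"
    echo "Status:   $status"
    echo "Finished: $(date)"
}

# Example (process name and address are placeholders):
#   build_message "completed" "nightly_job" |
#       mailx -s "nightly_job finished" operator@example.com
```

If the error disappears with this form, that points at the here-document handling rather than at mail.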
Hi,
I need to grep for a pattern in the log file of a process and send a mail if the pattern is found. But I am not able to figure out how to detect when the process comes up. It is started several times a day, and each time it is started I need to perform this action. Please suggest something. (3 Replies)
Hi all,
Tearing my hair out..!
I have a requirement to monitor and restart a unix process via a simple watchdog script.
I have the following 3 scripts that don't work for me..
script 1 (only produces 1 output if process is up or not)... (4 Replies)
Hi,
I need help monitoring a process using a shell script
The sample output is below
oracle 32578 32577 0 Feb27 ? 00:06:47 java -cp .:lib/ant.jar:lib/ojdbc5.jar:lib/log4j-1.2.17.jar:/ORACLE_HOME/server/lib/wlfullclient.jar:/ORACLE_HOME/server/lib/weblogic.jar:Alerts.jar... (9 Replies)
Hi,
I have written a script to monitor a Process with the help of top command. This is my script.
======================
#!/bin/sh
DATE=`date +%Y%m%d%H%M%S`
HOME=/home/xmp/testing/xmp_report
RADIUS_PID=`xms -xmp sh pr | grep "RADIUS.iamsp02ldv" |awk '{ print $3 }'`
PSE_PID=`xms -xmp sh... (5 Replies)
Get an email notification from the system when a process from user XXXX takes longer than 15 minutes to run. Let me know the time estimate for this.
Hi, can anyone please tell me how to write a script to get an email notification from the system when a process from, as mentioned above, user xxxx takes... (1 Reply)
hi,
I need to change the code such that it becomes configurable to send email or sms or both.
At the moment the code works like sending both email and sms for any alert now want to change it to send email/sms as per my demand.
1. Like for a particular alert I only want email
2. If the alert... (2 Replies)
Hi,
I need to monitor the memory usage of a particular process continuously. As of now I am using the following command:
ps -fu <user name> -o pid,comm,vsz | grep <process_name> | grep -v grep
The output of this command gives me what i need except i want the output to keep getting updated... (3 Replies)
Hello all,
I would be happy if anyone could help me with a shell script that checks all the processes running on a Unix server and sends a mail if any of the processes is not running or has aborted.
Thanks in advance
Regards,
pradeep kulkarni.
(13 Replies)
hi all
I am running a script monitor using source command.
the shell script monitor is used to execute a pl/sql procedure.
when i do
source monitor
and then
ps -ef | grep <procedure name>
i do not get any info
but when i do
sh monitor
and then
ps -ef | grep <procedure name>
i... (8 Replies)
I would like to know if I can monitor whether a process is running.
I have one program which is running all the time, called oliba, but sometimes it goes down, and I have to launch it again.
Is there a way to monitor the pid of the program, and if the program goes down, to launch it again?
Can you give... (3 Replies)