Okay, this one is tricky and I'm not sure there is a nice way to do it, or indeed any way to do it. The main issue revolves around timing out a hung ssh. I am doing this by creating a wrapper script for the ssh.
My requirements are:
1. Definable timeout period.
2. If the timeout period completes and ssh is still running, then kill it.
3. Provide a return code as if the ssh had not been run from a wrapper script.
4. Multiple instances of this wrapper script can run at the same time.
By point 3 I mean, say the original ssh was
I want my wrapped ssh (currently called safe_ssh) to return the return code of the ls, not just whether or not the ssh completed.
Sounds simple, right? However, what I have found is that I can either kill the ssh or get a meaningful return code, but trying to do both is nigh impossible (at least with my level of scripting).
The status so far
I currently have a script that will time out the ssh but can only return whether I killed the ssh or whether it exited of its own accord.
The problem
I can get the return code of the ssh if I echo it into a temporary file from the background process and then read that file in the main process. For example something like:
However,
In this case, $ssh_pid is no longer the PID of the ssh itself but of the whole background script, meaning that I can no longer cleanly kill the ssh as I don't know its PID.
I need the file to have a unique file name in case there are multiple instances of the script running so that I can read the correct file from the main script. For this I was thinking of including ssh_pid in the file name.
I thought about echoing the ssh PID into the temp file as well, but this will not work: the steps of the script that add data to the temp file will not be executed until the ssh has completed, and of course it won't have completed if it has hung, which is exactly the situation in which we want to kill it.
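The PID problem described in the points above can be reproduced with a small sketch. This is not the poster's original code; `sh -c 'exit 3'` is a stand-in for the real ssh call, and the file name pattern passed to `mktemp` is an assumption made for illustration.

```shell
#!/bin/sh
# Illustration only: a background subshell records the command's exit status
# in a temp file, but $! then refers to the subshell, not to the command.

# mktemp gives each run its own file, so parallel instances do not collide.
rc_file=$(mktemp "${TMPDIR:-/tmp}/safe_ssh_rc.XXXXXX")

(
    sh -c 'exit 3'          # stand-in for: ssh host command
    echo $? > "$rc_file"    # only runs once the command has finished
) &
wrapper_pid=$!              # PID of the subshell, NOT of the command inside it

wait "$wrapper_pid"
status=$(cat "$rc_file")
rm -f "$rc_file"

echo "command exit status: $status (background PID was $wrapper_pid)"
```

Killing `$wrapper_pid` here would signal the subshell only; the command it started is a separate child process with a PID the main script never sees.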
I hope this vaguely makes sense. Sorry it is a bit convoluted. If you need any clarifications please do ask.
Thanks a lot
Robyn
---------- Post updated at 05:38 PM ---------- Previous update was at 04:41 PM ----------
Okay,
I think I have come up with an idea and it is as follows.
1. Set a sleep process running in the background (when this sleep completes, it reads a temporary file for the ssh PID and kills the ssh).
2. Start the ssh in the background.
3. Echo the PID of the background ssh to the temporary file.
4. Wait on the ssh.
5. Once the wait is complete, kill the sleep process if it still exists.
The temporary file will be named with the parent PID so all child processes can determine what its name is.
This way:
If the ssh finishes without hanging, the wait will provide the background process return code.
If the ssh hangs, there will have been plenty of time to write its PID to the temporary file, and hence the sleep process can kill it when the sleep completes.
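The plan above could be sketched roughly as follows. This is one possible reading of it, not tested against a real hung ssh; `safe_run` is a hypothetical name, and the command to run (ssh in the real wrapper) is passed as arguments after the timeout in seconds. The `trap` in the watchdog is an addition of this sketch, so that killing the watchdog also stops its `sleep`.

```shell
#!/bin/sh
# Hypothetical wrapper: safe_run <timeout_secs> <command> [args...]
safe_run() {
    timeout_secs=$1
    shift
    pid_file="${TMPDIR:-/tmp}/safe_run.$$.pid"   # named after the parent PID
    rm -f "$pid_file"

    # 1. Watchdog: sleep, then kill whatever PID is recorded in the file.
    (
        sleeper=
        # If the parent kills the watchdog early, also stop the sleep.
        trap '[ -n "$sleeper" ] && kill "$sleeper" 2>/dev/null; exit 0' TERM
        sleep "$timeout_secs" &
        sleeper=$!
        wait "$sleeper"
        [ -s "$pid_file" ] && kill "$(cat "$pid_file")" 2>/dev/null
    ) &
    watchdog_pid=$!

    # 2 and 3. Start the command in the background and record its PID.
    "$@" &
    cmd_pid=$!
    echo "$cmd_pid" > "$pid_file"

    # 4. Wait for the command; wait returns the command's own exit status.
    wait "$cmd_pid"
    rc=$?

    # 5. Stop the watchdog if it is still running, then clean up.
    kill "$watchdog_pid" 2>/dev/null
    wait "$watchdog_pid" 2>/dev/null
    rm -f "$pid_file"
    return "$rc"
}
```

Called as, for example, `safe_run 30 ssh somehost ls` (host name hypothetical), the function returns the status of the remote command when ssh finishes in time. When the watchdog kills the command with the default SIGTERM, bash reports the status as 143 (128 + 15).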
I think I can do most of this but wanted to run the idea past you in case there is some obvious flaw I haven't spotted.
ALSO: How do I get the PID of the running process? That is, say I call my script safe_ssh, how do I get the PID of safe_ssh from within safe_ssh? I assume it must be straightforward but I do not currently know.
Q: ALSO: How do I get the PID of the running process? That is, say I call my script safe_ssh, how do I get the PID of safe_ssh from within safe_ssh? I assume it must be straightforward but I do not currently know.
A: $$
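A short sketch of `$$` next to its counterpart `$!`, which gives the PID of the most recently started background command. The variable names are made up for illustration.

```shell
#!/bin/sh
# $$ expands to the PID of the shell running this script.
script_pid=$$

# $! expands to the PID of the most recently started background command.
sleep 1 &
child_pid=$!

echo "script PID: $script_pid, background child PID: $child_pid"
wait "$child_pid"
```

Inside a `( ... )` subshell, `$$` still expands to the PID of the original script, which is what makes it usable for naming a temporary file that both the script and its background helpers need to find.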
If I were writing this I wouldn't try to get the return code from ssh. It is quite hard. Why not redirect all the output from ssh to a log file, then examine the log file for errors? If there are errors, set the return code to non-zero.
---------- Post updated at 12:10 PM ---------- Previous update was at 11:44 AM ----------
Consider these examples:
job should not run for more than 5 seconds
command is: sleep 20
host wpgux00a_sw does not exist.
Consider this code:
Notice the change in logic. Sleep is not in the background. I don't kill the sleep. Simply sleep and then check if ssh is still running.
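The logic described here (foreground sleep, then check whether the command is still running) might look something like the sketch below. It is a reconstruction from the description, not the code from the reply; `run_with_limit` is a hypothetical name, and returning 124 on timeout is an assumption borrowed from the convention of GNU `timeout`.

```shell
#!/bin/sh
# Hypothetical helper: run_with_limit <limit_secs> <command> [args...]
run_with_limit() {
    limit_secs=$1
    shift

    "$@" &                  # start the command (ssh in the real case)
    cmd_pid=$!

    sleep "$limit_secs"     # foreground sleep: no watchdog to kill afterwards

    # kill -0 sends no signal; it only tests whether the PID still exists.
    if kill -0 "$cmd_pid" 2>/dev/null; then
        kill "$cmd_pid"
        wait "$cmd_pid" 2>/dev/null
        return 124          # assumed convention for "timed out"
    fi

    wait "$cmd_pid"         # already finished: collect its real exit status
}
```

The trade-off of this simpler shape is that the wrapper always takes the full limit to return, even when the command finishes almost immediately.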
Or in the remote .profile or .bashrc use TMOUT=n where n is the number of idle seconds before the process gets killed. You probably should set TMOUT as readonly, which is shell dependent.
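As a rough illustration, the lines added to the remote .profile or .bashrc might look like this in bash or ksh. The value 300 is an arbitrary example, and `readonly` is only one way to lock the variable.

```shell
# Exit an interactive shell after 300 idle seconds at the prompt
# (300 is an example value, not a recommendation).
TMOUT=300
# Stop the user from unsetting or changing the limit later in the session.
readonly TMOUT
```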
It was my belief that the timeout only works if the ssh has properly connected, not if it has got stuck somehow. If I am wrong, please do correct me.
@jim mcnamara thanks for the suggestion, but this is for a fix that will go out to multiple different systems and I need a fix that will be our code rather than just ours, and so I don't think changing the .profile file is a possibility.