Hi all,
I'm writing a script that chooses the best computer available in an open lab. The script works great, except that every now and then there is a dead computer in the lab that begins the ssh handshake but freezes after printing the following:
debug1: Offering public key:
When the script happens across such a computer, it hangs, waiting for the ssh command to finish.
I'm trying to come up with a way to implement a timeout that stops the local wait for the ssh command after a set time interval. Not having found an option in ssh itself that does this, I'm trying to accomplish the task by putting the ssh in the background, like this (I'm using bash):
ssh lab4-5 'uptime' > test_load &
If I enter the above line in a terminal, it runs in the background and prints the expected PID and "Done" messages. However, if it is part of a script file, no such messages are printed, nor does a "jobs" command indicate what the PID might be.
The closest I've come to a working script is the following:
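A note on the missing messages: in a non-interactive bash script, job control is off by default, so the "[1] PID" and "Done" notifications are not printed. The PID is still available through the special parameter $! right after the command is backgrounded. A minimal sketch, assuming bash, with "sleep" standing in for the ssh call so it can be run anywhere (the names bg_pid and bg_status are only illustrative):

```shell
#!/bin/bash
# "sleep 1" is a stand-in for: ssh lab4-5 'uptime' > test_load
sleep 1 > /dev/null &

# $! holds the PID of the most recently started background command.
bg_pid=$!
echo "background PID: $bg_pid"

# wait blocks until that process finishes and returns its exit status.
wait "$bg_pid"
bg_status=$?
echo "exit status: $bg_status"
```

With the PID in a variable, the script can later run "kill $bg_pid" instead of relying on a job spec such as %+.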
#!/bin/bash
timeout=2
tstart=$( date | sed -r -e 's/.*:.*:0?//' -e 's/ .*//' )
[ -e test1 ] && rm test1
{
    ssh lab4-5 'uptime' >| test2
    mv test2 test1
} &
while [ ! -e test1 ]; do
    tnow=$( date | sed -r -e 's/.*:.*:0?//' -e 's/ .*//' )
    lapse=$(( tnow - tstart )) && [ "$lapse" -lt "0" ] && lapse=$(( 60 + lapse ))
    echo "lapse: $lapse"
    [ "$lapse" -gt "$timeout" ] && kill %+ && echo "ssh killed"
    [ "$lapse" -gt "1" ] && jobs
done
cat test1
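For comparison, here is a sketch of the same watchdog idea built on $! and bash's SECONDS counter rather than on parsing the output of date. It is only an illustration under some assumptions: run_with_timeout is a hypothetical helper name, "sleep" stands in for the ssh call, the fractional "sleep 0.2" assumes a sleep that accepts fractions (GNU sleep does), and the return value 124 is borrowed from the convention of the GNU coreutils timeout command.

```shell
#!/bin/bash
# Hypothetical helper: run a command in the background and kill it
# if it is still running after a given number of seconds.
# Usage: run_with_timeout LIMIT command [args...]
run_with_timeout() {
    local limit=$1
    shift
    "$@" &                      # start the command in the background
    local pid=$!                # remember its PID
    local start=$SECONDS        # bash's built-in elapsed-seconds counter

    # kill -0 sends no signal; it only tests whether the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        if (( SECONDS - start >= limit )); then
            kill "$pid" 2>/dev/null     # give up on the hung command
            wait "$pid" 2>/dev/null     # reap it
            return 124                  # signal "timed out" to the caller
        fi
        sleep 0.2                       # avoid a busy loop
    done

    wait "$pid"                 # command finished: pass on its exit status
}

# Stand-ins for a hung host and a healthy host.
run_with_timeout 1 sleep 5
hung_status=$?
run_with_timeout 5 true
ok_status=$?
echo "hung: $hung_status, ok: $ok_status"
```

In the lab script the call might look like "run_with_timeout 2 ssh lab4-5 uptime > test_load", with a return value of 124 meaning the host should be skipped. Because the loop exits as soon as the timeout fires, it cannot spin forever waiting for an output file that a killed ssh will never create.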
One question I have is where the & operator sends its job messages (the PID and "Done" lines). Can these be captured by the script? Any suggestions on how to get this to work are greatly appreciated. If there is a better way to go about this, I'm open to that too.
Thanks