Thanks very much guys
That's the kind of stuff I have been looking for.
I'll certainly play around with the code posted. I wasn't aware of the select command; I assumed it was something low level in the kernel.
I still don't know what the file descriptors are, since the job hasn't failed while running under strace. But if it does fail, I should be able to discover where the problem is.
I don't know if the -n will help with this version of the harness. But if things get bad I'll try it.
The harness uses ssh multiplexing to attach the ssh connections to UNIX domain sockets and then echoes stdout from the remote server into a local report file in real time.
The new version I'm working on uses a messaging framework to send the report file, so it shouldn't have any issues like this.
I hope.
Thanks for the feedback, really helpful stuff.
---------- Post updated at 05:37 PM ---------- Previous update was at 09:50 AM ----------
That's really interesting MIG
I wasn't aware of that behaviour before.
Thanks
---------- Post updated 03-07-14 at 02:18 PM ---------- Previous update was 02-07-14 at 05:37 PM ----------
Thanks to all of you for your help.
The job eventually failed again, and I was able to trace the problem to a write to a named pipe that hung because the client process reading from the pipe had died.
I've put in a rather ugly fix that does the write in the background and then kills $! if it is still running.
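For anyone hitting the same thing, the workaround looks roughly like the following. This is a minimal sketch rather than the actual harness code: the function name write_with_timeout, its arguments and the 5 second default are all made up for illustration. It relies on the fact that opening a FIFO for writing blocks until something opens it for reading, so the write is pushed into the background and killed if it hasn't finished in time.

```shell
#!/bin/sh
# Hypothetical helper (not the real harness code):
#   write_with_timeout FIFO MESSAGE [SECONDS]
# Returns 0 if the write completed, non-zero if it failed or was killed.
write_with_timeout() {
    fifo=$1
    msg=$2
    secs=${3:-5}    # assumed default timeout, purely illustrative

    # Opening the FIFO for writing blocks until a reader opens it,
    # so do the write in a background subshell.
    ( printf '%s\n' "$msg" > "$fifo" ) &
    writer=$!

    waited=0
    # Poll once a second while the background writer is still running.
    while kill -0 "$writer" 2>/dev/null; do
        if [ "$waited" -ge "$secs" ]; then
            # Still blocked after the timeout: assume nobody is reading.
            kill "$writer" 2>/dev/null
            wait "$writer" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # Writer finished by itself; pass on its exit status.
    wait "$writer"
}
```

It would be called as something like `write_with_timeout /path/to/pipe "some text" 10 || echo "no reader on pipe" >&2`. Because it polls once a second, even a successful write takes about a second to be noticed, which is part of why I call it ugly.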
It would be better if I could test whether a process has the other end of the pipe open for reading, but I haven't found a way of doing so.
Does anyone know if that can be done?
Thanks again for all the helpful input