waitpid and grandchildren

# 1  
Old 05-15-2012

I'm attempting to write a daemon that will start, stop, and monitor processes across a network of servers, meaning that a daemon would start on each server, attempt to connect to siblings at regular intervals (if there are unconnected siblings), and start services as remote dependencies are resolved.

This has been working fairly well thus far... I've used OpenSSL's PKI support to authenticate via public/private key pairs signed by a trusted CA (so only authorized clients can start/stop/monitor remote processes), and I've been able to track the PID of processes even after they fork (using ptrace, I think similarly to the way the 'upstart' project works, though I haven't really had the chance to look at their code yet). Where I've hit a roadblock is process termination.

Basically, to follow a daemon's fork, my daemon will fork and exec the service daemon in question with a PTRACE_TRACEME, wait for the initial SIGTRAP stop, and set the PTRACE_O_TRACEFORK | PTRACE_O_TRACEVFORK | PTRACE_O_TRACECLONE options. It would then wait for another trap and determine the cause by checking whether one of these holds:
((status >> 16) & 0xffff) == PTRACE_EVENT_FORK
((status >> 16) & 0xffff) == PTRACE_EVENT_VFORK
((status >> 16) & 0xffff) == PTRACE_EVENT_CLONE

In those cases it gets the new PID from the child with a PTRACE_GETEVENTMSG. This has been working beautifully. Once I get the final PID of the daemon in question, I detach from it with a PTRACE_DETACH and let it run unhindered. All good and well... but...
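For anyone following along, the status check described above can be wrapped in a small helper. This is only a sketch under my own assumptions: the names ptrace_event_of and is_new_task_event are made up for illustration, and the extra WIFSTOPPED/SIGTRAP guard is something I added so that ordinary exits or signal stops are not misread as events.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Hypothetical helper: returns the PTRACE_EVENT_* code carried by a
 * waitpid() status, or 0 if the status is not a SIGTRAP stop at all. */
static int ptrace_event_of(int status)
{
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGTRAP)
        return 0;
    /* Same extraction as in the post: the event lives above bit 16. */
    return (status >> 16) & 0xffff;
}

/* Hypothetical helper: true for the three events that announce a new task. */
static int is_new_task_event(int event)
{
    return event == PTRACE_EVENT_FORK ||
           event == PTRACE_EVENT_VFORK ||
           event == PTRACE_EVENT_CLONE;
}
```

When is_new_task_event(ptrace_event_of(status)) is true, the new PID would then be fetched into an unsigned long with ptrace(PTRACE_GETEVENTMSG, pid, 0, &msg), as described in the post.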

My daemon selects on a signal file descriptor (created with signalfd), listening for SIGCHLD (which is supposed to be sent whenever a child terminates). This gets triggered, and I go into a series of waitpid(grandchild, &status, WNOHANG) calls on each monitored process to determine which one just terminated... but I get a "No child processes" error (ECHILD) whenever I wait on a grandchild. I assume this worked before because I was using ptrace, and ptrace was attached... once I detached, the service daemon's original process was the only parent left, which then died, and then I am guessing the init process became the parent... meaning I probably can't use this approach at all because init becomes the parent.
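The "No child processes" result can be told apart from "still running" in the polling loop. Below is a minimal sketch, assuming a per-process poll like the one described; the enum and the function name poll_process are hypothetical, not anything from the original daemon.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical result codes for one non-blocking poll of a monitored PID. */
enum poll_result { STILL_RUNNING, TERMINATED, NOT_OUR_CHILD, POLL_ERROR };

/* Polls one PID without blocking. NOT_OUR_CHILD corresponds to ECHILD,
 * i.e. the "No child processes" message: the PID is not (or is no longer)
 * a child of the caller, so the caller cannot reap it. */
static enum poll_result poll_process(pid_t pid, int *status)
{
    pid_t r = waitpid(pid, status, WNOHANG);

    if (r == 0)
        return STILL_RUNNING;          /* child exists, has not changed state */
    if (r == pid) {
        if (WIFEXITED(*status) || WIFSIGNALED(*status))
            return TERMINATED;         /* exited or killed by a signal */
        return STILL_RUNNING;          /* e.g. a ptrace stop, not a death */
    }
    if (r == -1 && errno == ECHILD)
        return NOT_OUR_CHILD;
    return POLL_ERROR;
}
```

A grandchild that has been reparented to init would come back as NOT_OUR_CHILD here, which matches the behaviour described in the post.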

I guess the bottom line... can I monitor a given grandchild process without becoming the init process (meaning I would need to write a replacement for init)? I simply want to be able to detect when a grandchild terminates, so I can propagate that state to sibling daemons across the network so they can react accordingly.

If I truly can't accomplish this without writing a replacement for init, is there some documentation somewhere on what exactly init needs to handle so I can write a proper replacement? I understand this is what the upstart project is doing, I just wish it had the capability to handle remote services.

EDIT: I've found this from Ubuntu about replacing init... they suggest starting from the source code for SysV init. I don't think I can do that... I'm pretty set on making my code BSD licensed, and IIRC, SysV is either GPL or some other form of incompatible license (please correct me if I'm wrong). I'd like to make sure this is as portable as possible, and I realize I may have to go about an alternative implementation of ptrace to follow forks if I want to be portable.

Last edited by kshots; 05-15-2012 at 06:15 PM.. Reason: Found more information
# 2  
Old 05-18-2012
Once I get the final PID of the daemon in question, I detach from it with a PTRACE_DETACH and let it run unhindered.
My understanding is that once you detach a process, you no longer get signals like SIGCHLD on its behalf.

But the idea of using ptrace for this kind of thing seems novel to me.

can I monitor a given grandchild process without becoming the init process
D. J. Bernstein's daemontools offers a solution: you leave a file descriptor open to the grandparent (so your monitoring program never truly detaches).
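A rough sketch of the open-descriptor idea, in the spirit of what I understand daemontools' fghack to do: the grandparent keeps the read end of a pipe, every descendant inherits the write end, and read() returns end-of-file only once the last descendant has exited. The function name and the delay parameter are made up for this demo, and the "daemon" is just a process that waits briefly.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 once every descendant holding the pipe's
 * write end has exited, -1 on error. */
static int wait_for_descendants(int grandchild_delay_ms)
{
    int fds[2];
    char buf[64];
    ssize_t n;
    pid_t child;

    if (pipe(fds) == -1)
        return -1;

    child = fork();
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        pid_t gc;
        close(fds[0]);
        gc = fork();
        if (gc == 0) {
            /* Grandchild: stands in for the daemon; it inherits fds[1]. */
            poll(NULL, 0, grandchild_delay_ms);   /* plain millisecond delay */
            _exit(0);
        }
        _exit(gc == -1 ? 1 : 0);   /* intermediate parent exits right away */
    }

    close(fds[1]);                 /* essential: otherwise EOF never arrives */
    waitpid(child, NULL, 0);       /* reap the direct child */

    for (;;) {
        n = read(fds[0], buf, sizeof buf);
        if (n > 0)
            continue;              /* discard anything written to the pipe */
        if (n == -1 && errno == EINTR)
            continue;
        break;                     /* n == 0 means the last writer is gone */
    }
    close(fds[0]);
    return n == 0 ? 0 : -1;
}
```

The obvious limitation is that a daemon which closes all of its inherited descriptors defeats this, and the pipe reports only that everything has exited, not an exit status.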
# 3  
Old 05-18-2012
Hmm... actually, I just found that a solution should present itself in the 3.4 kernel, when it comes out. I can't post the URL where I got this from (because apparently I need 5 posts for that), but here's a quote from the API changes page:
The PR_SET_CHILD_SUBREAPER prctl() operation allows a "service manager" process to mark itself as a sort of 'sub-init', able to stay as the parent for all orphaned processes created by the started services. All SIGCHLD signals will be delivered to the service manager. There is a corresponding PR_GET_CHILD_SUBREAPER prctl() operation. Planned users of this feature include D-Bus and systemd.
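Here is a minimal sketch of how the quoted prctl() operation might be used, assuming a Linux 3.4 or later kernel. The function name, the pipe used to report the grandchild's PID, and the exit code 42 are all made up for the demo; the fallback value 36 for PR_SET_CHILD_SUBREAPER is what current kernel headers define, in case older userspace headers lack it.

```c
#include <poll.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef PR_SET_CHILD_SUBREAPER
#define PR_SET_CHILD_SUBREAPER 36   /* value from the kernel's linux/prctl.h */
#endif

/* Hypothetical demo: orphans a grandchild and reaps it from the manager.
 * Returns the grandchild's exit code, or -1 on any failure. */
static int reap_orphaned_grandchild(void)
{
    int fds[2], status;
    pid_t child, grandchild = -1;
    ssize_t n;

    /* Mark this process as the reaper for its orphaned descendants. */
    if (prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) == -1)
        return -1;
    if (pipe(fds) == -1)
        return -1;

    child = fork();
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        pid_t gc;
        close(fds[0]);
        gc = fork();
        if (gc == 0) {
            close(fds[1]);
            poll(NULL, 0, 50);      /* pretend to run as a daemon for 50 ms */
            _exit(42);
        }
        /* Report the grandchild's PID, then exit and orphan it. */
        if (write(fds[1], &gc, sizeof gc) != (ssize_t)sizeof gc)
            _exit(1);
        _exit(0);
    }

    close(fds[1]);
    n = read(fds[0], &grandchild, sizeof grandchild);
    close(fds[0]);
    if (waitpid(child, NULL, 0) == -1)
        return -1;
    if (n != (ssize_t)sizeof grandchild || grandchild <= 0)
        return -1;

    /* The orphan was reparented to us instead of init, so this works. */
    if (waitpid(grandchild, &status, 0) != grandchild)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In a real service manager the PID would more likely come from the existing ptrace tracking than from a pipe, and the waitpid() would be driven by SIGCHLD rather than called synchronously.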
# 4  
Old 05-21-2012
That's very cool.
