Unix signal processing

Introduction

The signal() system call behaves inconsistently across Unix variants, is unreliable and is effectively deprecated: the Linux signal(2) man page advises using sigaction() instead. sigaction() is standardised and robust but also more complicated.

sigaction() is in a group of system and standard library calls that use or manipulate sets of signals (“sigsets”):

  • sigaction():  set or clear signal handler
  • sigprocmask(), sigsuspend(): block or unblock signals
  • sigaddset(), sigdelset(), …: manipulate sigsets

In this article I refer to this group of system and standard library calls as “the sigset ecosystem”.
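
To give a flavour of the sigset-manipulation calls, here is a minimal sketch. The helper names count_members() and sigset_demo() are made up for this illustration; sigemptyset(), sigaddset(), sigdelset() and sigismember() are the standard POSIX calls.

```c
#define _POSIX_C_SOURCE 200809L /* request POSIX declarations */
#include <signal.h>             /* for sigset_t, sigemptyset(), ... */

/*
 *  Hypothetical helper (not part of any library): count how many of
 *  the listed signals are members of a sigset.
 */

int count_members(
const sigset_t *set,
const int *sigs,
int nsigs)
{
    int i, count = 0;

    for (i = 0; i < nsigs; i++)
        if (sigismember(set, sigs[i]) == 1)
            count++;
    return(count);
}

/*
 *  Build a sigset containing SIGINT and SIGTERM, remove SIGTERM again
 *  and return how many of SIGINT, SIGTERM and SIGCHLD remain in it.
 */

int sigset_demo(
void)
{
    static const int sigs[] = { SIGINT, SIGTERM, SIGCHLD };
    sigset_t set;

    sigemptyset(&set);              /* set is now {} */
    sigaddset(&set, SIGINT);        /* set is now {SIGINT} */
    sigaddset(&set, SIGTERM);       /* set is now {SIGINT, SIGTERM} */
    sigdelset(&set, SIGTERM);       /* set is now {SIGINT} */
    return(count_members(&set, sigs, 3));
}
```

Calling sigset_demo() returns 1, because only SIGINT is left in the set. A sigset on its own does nothing; it only becomes useful when passed to calls such as sigprocmask() or sigaction().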

I learned about the sigset ecosystem during a recent project that launched a lot of child processes and I thought that other people, particularly those of the C/pre-Linux generation like myself, might benefit from what I learned.

This article covers:

  1. the fragmented history of signal()
  2. a sanity check: can you save source code from this page and compile it and run it?
  3. old-school signal processing in C: a series of examples leading from the familiar to the unreliable
  4. reliable signal processing in C
  5. reliable signal processing in Perl
  6. old-school signal processing in Python: issues with asynchrony
  7. reliable signal processing in Python
  8. some conclusions

There are a lot of signals (see signal(7) or run kill -l for a list) but this article is concerned with handling only a few of them:

  1. SIGINT: when the controlling terminal detects that <CTRL-C> was pressed, it sends this signal to every process in the foreground process group of that terminal
  2. SIGTERM: the default signal sent by the kill command; it should be interpreted by the receiving process as “please commit suicide but clean up before you do”
  3. SIGCHLD: when a child process exits (for whatever reason), then the kernel sends this signal to the parent process
  4. SIGALRM: when a process calls alarm(secs) then the kernel sends this signal to the same process secs seconds later
  5. SIGUSR1: one of two signals available to elicit user-defined behaviour in a process

Regarding my coding style:

  1. many of the example C programs below contain empty comment blocks like this:
    /* 
     *  Type and struct definitions
     */
    
    /*
     *  Global variables
     */

    As well as being orderly, these give diff or meld (or whatever diffing tool you use) more synchronisation points, so the tool can better align the contents of the files it compares; that makes it easier to spot the differences between one source file in this article and the next.

  2. in order to keep the example source codes as uncluttered as possible: return codes are rarely checked; there is no protection against buffer overflows; type casting is rarely made explicitly; functions and variables that could be static are generally global; signal handlers handle only one signal type (even though they could handle multiple types).
  3. shell sessions for compiling and running programs show: my shell prompt (lagane$), input in bold, output in roman (i.e. not bold); additional newlines may have been added in order to make output more readable.

Finally, if you see a mistake in this article, then please let me know. Thanks!

The fragmented history of signal()

The Linux signal(2) man page states:

In the original UNIX systems, when a handler that was established using signal() was invoked by the delivery of a signal, the disposition of the signal would be reset to SIG_DFL, and the system did not block delivery of further instances of the signal. …

System V also provides these semantics for signal(). This was bad because the signal might be delivered again before the handler had a chance to reestablish itself. Furthermore, rapid deliveries of the same signal could result in recursive invocations of the handler.

BSD improved on this situation, but unfortunately also changed the semantics of the existing signal() interface while doing so. On BSD, when a signal handler is invoked, the signal disposition is not reset, and further instances of the signal are blocked from being delivered while the handler is executing. Furthermore, certain blocking system calls are automatically restarted if interrupted by a signal handler (see signal(7)). The BSD semantics are equivalent to calling sigaction(2) with the following flags:

sa.sa_flags = SA_RESTART;

… The [Linux] kernel’s signal() system call provides System V semantics.

… the [Linux] signal() wrapper function [i.e. not the kernel’s signal() system call] does not invoke the kernel system call. Instead, it [supplies] BSD semantics.
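
The equivalence that the man page describes can be sketched as follows. install_bsd_style(), has_restart_flag() and noop_handler() are hypothetical names chosen for this illustration and are not part of any library; only sigaction() and SA_RESTART come from the man page.

```c
#define _POSIX_C_SOURCE 200809L /* request POSIX declarations */
#include <signal.h>             /* for sigaction(), SA_RESTART, ... */
#include <string.h>             /* for memset() */

void noop_handler(
int sig)
{
    (void) sig;                     /* deliberately does nothing */
}

/*
 *  Hypothetical helper: install a handler with BSD semantics, i.e. the
 *  disposition is not reset, the signal is blocked while its handler
 *  runs and interrupted system calls are restarted where possible.
 */

int install_bsd_style(
int sig,
void (*handler)(int))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);       /* block no additional signals */
    sa.sa_flags = SA_RESTART;       /* the flag quoted above */
    return(sigaction(sig, &sa, NULL));
}

/*
 *  Return non-zero if the current disposition of sig was established
 *  with SA_RESTART.
 */

int has_restart_flag(
int sig)
{
    struct sigaction sa;

    if (sigaction(sig, NULL, &sa) != 0)     /* query only */
        return(0);
    return((sa.sa_flags & SA_RESTART) != 0);
}
```

For example, after install_bsd_style(SIGUSR1, noop_handler) has returned 0, has_restart_flag(SIGUSR1) reports that the flag is set.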

If a system call is “automatically restarted”, it effectively becomes a wrapper to the real system call like this:

int system_call_x(...)
{
    int rc;

    /* keep retrying for as long as the real call is merely interrupted */
    while ((rc=the_real_system_call_x(...)) == ERROR && errno == EINTR)
        ;
    return(rc);
}

The points I wanted to illustrate with that pseudocode are:

  1. calls to such system calls do not return when interrupted
  2. but they may still be preempted by signal handlers, because they are “less atomic” than wrapper-less system calls (e.g. nanosleep())

Regarding which blocking system calls behave like this and when, the Linux signal(7) man page states:

If a signal handler is invoked while a system call or library function call is blocked, then either:

  • the call is automatically restarted after the signal handler returns; or
  • the call fails with the error EINTR.

Which of these two behaviors occurs depends on the interface and whether or not the signal handler was established using the SA_RESTART flag … The details vary across UNIX systems; below, the details for Linux.

If a blocked call to one of the following interfaces is interrupted by a signal handler, then the call will be automatically restarted after the signal handler returns if the SA_RESTART flag was used; otherwise the call will fail with the error EINTR:

  • read(2), readv(2), write(2), writev(2), and ioctl(2) calls on “slow” devices.

Related to that last point:

  • it also applies to some other system calls: e.g. accept()
  • it does not apply to the standard library call sleep() or its underlying system call nanosleep()
  • in the past it applied to the standard library call fgets() but that is no longer the case
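
For a handler established without SA_RESTART, the caller has to do the retrying itself. The following is a minimal sketch of that idea for read(); read_retry() and read_retry_demo() are names invented here, and the pipe merely provides something to read from.

```c
#define _POSIX_C_SOURCE 200809L /* request POSIX declarations */
#include <errno.h>              /* for errno, EINTR */
#include <unistd.h>             /* for read(), write(), pipe(), close() */

/*
 *  Hypothetical helper: behave like read() but try again if a signal
 *  handler interrupted the call before any data was transferred.
 */

ssize_t read_retry(
int fd,
void *buf,
size_t count)
{
    ssize_t rc;

    while ((rc=read(fd, buf, count)) == -1 && errno == EINTR)
        ;                           /* interrupted: just try again */
    return(rc);
}

/*
 *  Push three bytes through a pipe, read them back with read_retry()
 *  and return the number of bytes read (or -1 on failure).
 */

ssize_t read_retry_demo(
void)
{
    int fds[2];
    char buf[16];
    ssize_t rc;

    if (pipe(fds) == -1)
        return(-1);
    if (write(fds[1], "abc", 3) != 3)
        rc = -1;
    else
        rc = read_retry(fds[0], buf, sizeof(buf));
    close(fds[0]);
    close(fds[1]);
    return(rc);
}
```

Whether retrying is the right reaction depends on the program: the examples later in this article rely on a call being interrupted, so they must not retry.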

A sanity check

Before we launch into a series of examples, check that you can download a test program, compile it and run it.

  1. If you have Subversion installed then:
    1. Retrieve all the sources as follows:
      lagane$ svn co https://svn.pasta.freemyip.com/main/wordpress/trunk/computing/articles/unix-signal-processing
      lagane$ cd unix-signal-processing
      lagane$
    2. Note that, while following this article, you may ignore the instructions to copy and paste other source code into local files.
  2. If you do not have Subversion installed then:
    1. Copy and paste this source code into funcs.c:
      #include <stdarg.h>             /* for va_start(), va_end() */
      #include <stdio.h>              /* for sprintf(), stderr, ... */
      #include <stdlib.h>             /* for exit() */
      #include <time.h>               /* for clock_gettime() */
      #include "funcs.h"              /* for infomsg(), errormsg() */
      
      /*
       *  Macros
       */
      
      /*
       *  Global variables
       */
      
      /*
       *  Forward declarations
       */
      
      void real_fmessage(const char *, FILE *, char *, va_list);
      
      /*
       *  Functions
       */
      
      double doubletime(
      )
      {
          struct timespec now;
          static double start = 0.0;
      
          /* clock_gettime() has superseded gettimeofday() */
          clock_gettime(CLOCK_REALTIME, &now);
      
          if (start == 0.0)
              start = now.tv_sec + now.tv_nsec/1000000000.0;
      
          return(now.tv_sec + now.tv_nsec/1000000000.0 - start);
      }
      
      void real_infomsg(
      const char *func,
      char *fmt,
      ...)
      {
          va_list argp;
      
          va_start(argp, fmt);
          real_fmessage(func, stdout, fmt, argp);
          va_end(argp);
      }
      
      void real_errormsg(
      const char *func,
      char *fmt,
      ...)
      {
          va_list argp;
      
          va_start(argp, fmt);
          real_fmessage(func, stdout, fmt, argp);
          va_end(argp);
          exit(EXIT_FAILURE);
      }
      
      void real_finfomsg(
      const char *func,
      FILE *fp,
      char *fmt,
      ...)
      {
          va_list argp;
      
          va_start(argp, fmt);
          real_fmessage(func, fp, fmt, argp);
          va_end(argp);
      }
      
      void real_ferrormsg(
      const char *func,
      FILE *fp,
      char *fmt,
      ...)
      {
          va_list argp;
      
          va_start(argp, fmt);
          real_fmessage(func, fp, fmt, argp);
          va_end(argp);
          exit(EXIT_FAILURE);
      }
      
      void real_fmessage(
      const char *func,
      FILE *fp,
      char *fmt,
      va_list argp)
      {
          fprintf(fp, "%.06lf: %s: ", doubletime(), func);
          vfprintf(fp, fmt, argp);
          fprintf(fp, "\n");
      }
      
    2. Copy and paste this source code into funcs.h:
      #ifndef FUNCS_H         /* reinclusion protection */
      #include <stdio.h>      /* for sprintf(), stderr, ... */
      
      /*
       *  Macros
       */
      
      #define FUNCS_H         /* reinclusion protection */
      #define TRUE  (0 == 0)
      #define FALSE (0 != 0)
      #define infomsg(fmt, ...) real_infomsg(__func__, fmt, ##__VA_ARGS__)
      #define errormsg(fmt, ...) real_errormsg(__func__, fmt, ##__VA_ARGS__)
      #define finfomsg(fp, fmt, ...) real_finfomsg(__func__, fp, fmt, ##__VA_ARGS__)
      #define ferrormsg(fp, fmt, ...) real_ferrormsg(__func__, fp, fmt, ##__VA_ARGS__)
      
      /*
       *  Forward declarations
       */
      
      extern void real_errormsg(const char *, char *, ...);
      extern void real_infomsg(const char *, char *, ...);
      extern void real_ferrormsg(const char *, FILE *, char *, ...);
      extern void real_finfomsg(const char *, FILE *, char *, ...);
      
      #endif                  /* reinclusion protection */
      
    3. Copy and paste this source code into test.c:
      #include <stdio.h>              /* for sprintf(), stderr, ... */
      #include "funcs.h"              /* for infomsg(), errormsg() */
      
      /*
       *  Macros
       */
      
      /*
       *  Type and struct definitions
       */
      
      /*
       *  Global variables
       */
      
      /*
       *  Forward declarations
       */
      
      /*
       *  Functions
       */
      
      int main(
      int argc,
      char *argv[])
      {
          infomsg("forty-two in digits is %d", 42);
          finfomsg(stderr, "this messages goes to stderr");
          return(0);
      }
      
  3. If you have GNU Make installed then:
    1. Compile the test program as follows:
      lagane$ make test
      lagane$
    2. Note that, while following this article, you may ignore the instructions regarding how to compile and simply run:
      lagane$ make
      lagane$
  4. If you do not have GNU Make installed then:
    1. Compile the test program as follows:
      lagane$ gcc -o test funcs.c test.c
      lagane$
  5. Run the test program as follows:
    lagane$ ./test
    0.000000: main: forty-two in digits is 42
    0.000183: main: this messages goes to stderr
    lagane$

     

  6. Verify you get similar output. Note that output messages are prefixed with a relative timestamp and the name of the function which displayed them.

Old-school signal processing in C: handling <CTRL-C>

In order to demonstrate a program being interrupted, we need that program to be doing something that can be interrupted. That something should:

  1. preferably be a single system or standard library call that we are familiar with
  2. take some time
  3. be interruptible
  4. not automatically restart on interrupt (so fgets() is not suitable)
  5. not delegate handling <CTRL-C> to another process (so system() is not suitable)

sleep() or fork()+wait() are good options.

  1. Copy and paste this source code into sleep.c:
    #include <errno.h>              /* for errno() */
    #include <signal.h>             /* for signal(), SIG* */
    #include <stdio.h>              /* for sprintf(), stderr, ... */
    #include <stdlib.h>             /* for exit() */
    #include <string.h>             /* for strerror() */
    #include <unistd.h>             /* for sleep() */
    #include "funcs.h"              /* for infomsg(), errormsg() */
    
    /*
     *  Macros
     */
    
    #define A_LONG_TIME             3600
    
    /*
     *  Type and struct definitions
     */
    
    /*
     *  Global variables
     */
    
    /*
     *  Forward declarations
     */
    
    void sigint_handler(int);
    void sigterm_handler(int);
    
    /*
     *  Functions
     */
    
    int main(
    int argc,
    char *argv[])
    {
        int rc;
    
        /*
         *  Initialise.
         */
    
        infomsg("setting up signal handlers ...");
        signal(SIGINT, sigint_handler);
        signal(SIGTERM, sigterm_handler);
    
        /*
         *  Start sleep.
         */
    
        infomsg("before calling sleep()");
        rc = sleep(A_LONG_TIME);
        infomsg("after calling sleep()");
        if (rc != 0)
            infomsg("sleep() returned early due to: %s", strerror(errno));
    
        /*
         *  Clean up and exit.
         */
    
        infomsg("cleaning up and exiting ...");
        signal(SIGINT, SIG_DFL);
        signal(SIGTERM, SIG_DFL);
        return(0);
    }
    
    void sigint_handler(
    int sig)
    {
        printf("\n");       /* prevent next message snuggling ^C */
        infomsg("received SIGINT");
    }
    
    void sigterm_handler(
    int sig)
    {
        infomsg("received SIGTERM");
    }
    
  2. Compile and run the program as follows:
    lagane$ gcc -o sleep funcs.c sleep.c
    lagane$ ./sleep
    0.000000: main: setting up signal handlers ...
    0.000371: main: before calling sleep()
    ^C
    3.932070: sigint_handler: received SIGINT
    3.932087: main: after calling sleep()
    3.932108: main: sleep() returned early due to: Interrupted system call
    3.932113: main: cleaning up and exiting ...
    lagane$ ./sleep &
    [1] 1663
    lagane$ 
    0.000000: main: setting up signal handlers ...
    0.000274: main: before calling sleep()
    
    lagane$ 
    lagane$ kill %1
    lagane$ 
    7.293202: sigterm_handler: received SIGTERM
    7.293218: main: after calling sleep()
    7.293235: main: sleep() returned early due to: Interrupted system call
    7.293238: main: cleaning up and exiting ...
    
    [1]+  Done                    ./sleep
    lagane$ 
    
    

The points I wanted to illustrate with that example are:

  1. pressing <CTRL-C> sends SIGINT; the kill command sends SIGTERM by default
  2. the program did not exit when the signals were received; the signal handler functions sigint_handler() and sigterm_handler() were called and then main() continued and eventually main() exited
  3. this particular program does so little that it has no need to do any cleaning up but it could have done (either directly inside sigint_handler() or sigterm_handler() or in main() after inspecting a global variable that the signal handlers would set to communicate the need for this task to be performed)
  4. signal handlers are process-specific; consequently this program has no need to reinstate the default handlers before it exits, but we do so because:
    1. it’s good practice to keep things symmetrical: tear down what you put up
    2. it allows this source code to be embedded inside other source code without necessitating offloading the responsibility to clean up
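
The second option mentioned in point 3 (a global variable that the handler sets and that main() inspects) might look like the sketch below. The names cleanup_needed, flagging_sigint_handler() and flag_demo() are invented for this illustration; raise() is used only so that the sketch can send itself SIGINT without anybody pressing <CTRL-C>.

```c
#include <signal.h>             /* for signal(), raise(), sig_atomic_t */

/*
 *  Set by the handler, inspected by the main flow. volatile
 *  sig_atomic_t is a type that the C standard guarantees can be
 *  written safely from inside a signal handler.
 */

volatile sig_atomic_t cleanup_needed = 0;

void flagging_sigint_handler(
int sig)
{
    (void) sig;
    cleanup_needed = 1;             /* record the event; do nothing else */
}

/*
 *  Install the handler, send ourselves SIGINT and report whether the
 *  main flow can see that the handler ran.
 */

int flag_demo(
void)
{
    cleanup_needed = 0;
    signal(SIGINT, flagging_sigint_handler);
    raise(SIGINT);                  /* handler has run by the time this returns */
    signal(SIGINT, SIG_DFL);
    return(cleanup_needed);
}
```

The appeal of this design is that the handler stays trivial and all the real work (including any calls to stdio functions such as printf()) happens in the main flow, where it cannot interrupt itself.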

Old-school signal processing in C: timeouts

Shortly we will look at a program that waits for one of two different events: a timeout expiring or a child process exiting. But let’s look at just timeouts first.

But what operation do we want to time out? The simplest is to call sleep() and to pretend it represents some other “long”-but-interruptible operation.

  1. Copy and paste this source code into timeout.c:
    #include <errno.h>              /* for errno() */
    #include <signal.h>             /* for signal(), SIG* */
    #include <stdio.h>              /* for sprintf(), stderr, ... */
    #include <stdlib.h>             /* for exit() */
    #include <string.h>             /* for strerror() */
    #include <unistd.h>             /* for sleep() */
    #include "funcs.h"              /* for infomsg(), errormsg() */
    
    /*
     *  Macros
     */
    
    #define A_LONG_TIME             3600
    #define TIMEOUT                 5
    
    /*
     *  Type and struct definitions
     */
    
    /*
     *  Global variables
     */
    
    /*
     *  Forward declarations
     */
    
    void sigalrm_handler(int);
    
    /*
     *  Functions
     */
    
    int main(
    int argc,
    char *argv[])
    {
        int rc;
    
        /*
         *  Initialise.
         */
    
        infomsg("setting up signal handlers ...");
        signal(SIGALRM, sigalrm_handler);
    
        /*
         *  Schedule timeout alarm.
         */
    
        infomsg("scheduling timeout alarm ...");
        alarm(TIMEOUT);
    
        /*
         *  Start sleep.
         */
    
        infomsg("before calling sleep()");
        rc = sleep(A_LONG_TIME);
        infomsg("after calling sleep()");
        if (rc != 0)
            infomsg("sleep() returned early due to: %s", strerror(errno));
    
        /*
         *  Clean up and exit.
         */
    
        infomsg("cleaning up and exiting ...");
        signal(SIGALRM, SIG_DFL);
        return(0);
    }
    
    void sigalrm_handler(
    int sig)
    {
        infomsg("received SIGALRM");
    }
    
  2. Compile and run the program as follows:
    lagane$ gcc -o timeout timeout.c funcs.c
    lagane$ ./timeout
    0.000000: main: setting up signal handlers ...
    0.000366: main: scheduling timeout alarm ...
    0.000535: main: before calling sleep()
    5.000690: sigalrm_handler: received SIGALRM
    5.000731: main: after calling sleep()
    5.000757: main: sleep() returned early due to: Interrupted system call
    5.000761: main: cleaning up and exiting ...
    lagane$ 
    

The points I wanted to illustrate with that example are:

  1. alarm() scheduled SIGALRM to be delivered to the process itself 5s later
  2. sleep(3600) slept for only 5s
  3. we were able to determine that sleep() returned early and why it did so
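
Point 3 can be taken one step further: on an early return, sleep() reports how many seconds were left unslept. A minimal sketch, assuming Linux (where sleep() is not implemented with SIGALRM, so combining it with alarm() is safe); unslept_seconds() and quiet_sigalrm_handler() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L /* request POSIX declarations */
#include <signal.h>             /* for sigaction() */
#include <string.h>             /* for memset() */
#include <unistd.h>             /* for alarm(), sleep() */

void quiet_sigalrm_handler(
int sig)
{
    (void) sig;                     /* interrupting sleep() is enough */
}

/*
 *  Hypothetical helper: sleep for sleep_secs but arrange to be
 *  interrupted after alarm_secs; return the number of unslept seconds
 *  that sleep() reports.
 */

unsigned int unslept_seconds(
unsigned int alarm_secs,
unsigned int sleep_secs)
{
    struct sigaction sa, old_sa;
    unsigned int unslept;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = quiet_sigalrm_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, &old_sa);

    alarm(alarm_secs);
    unslept = sleep(sleep_secs);    /* 0 means "slept the full time" */

    alarm(0);                       /* cancel the alarm if still pending */
    sigaction(SIGALRM, &old_sa, NULL);
    return(unslept);
}
```

With unslept_seconds(1, 3) the result should be roughly the difference between the two arguments, i.e. about 2.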

Old-school signal processing in C: rewriting system()

Shortly we will look at a program that waits for one of two different events: a timeout expiring or a child process exiting. But let’s look just at monitoring a child process first.

We will do this in a few steps. First, a version without signal handlers.

Copy and paste this source code into system1.c:

#include <errno.h>              /* for errno() */
#include <linux/limits.h>       /* for ARG_MAX */
#include <signal.h>             /* for signal(), SIG* */
#include <stdio.h>              /* for sprintf(), stderr, ... */
#include <stdlib.h>             /* for exit() */
#include <string.h>             /* for strerror() */
#include <sys/types.h>          /* for wait() */
#include <sys/wait.h>           /* for wait() */
#include <unistd.h>             /* for sleep(), fork() and execlp() */
#include "funcs.h"              /* for infomsg(), errormsg() */

/*
 *  Macros
 */

#define A_LONG_TIME             3600
#define CHILD_RUN_TIME          10
#define WAIT_INSTEAD_OF_SLEEP   TRUE

/*
 *  Type and struct definitions
 */

/*
 *  Global variables
 */

/*
 *  Forward declarations
 */

pid_t start_child_sleep(void);

/*
 *  Functions
 */

int main(
int argc,
char *argv[])
{
    int wstatus;
    pid_t pid;

    /*
     *  Initialise.
     */

    /*
     *  Start child.
     */

    infomsg("parent starting one child ...");
    pid = start_child_sleep();

    /*
     *  Monitor child
     */

    infomsg("parent sees child has pid %d and waits for it ...", pid);
#if WAIT_INSTEAD_OF_SLEEP
    pid = wait(&wstatus);
#else
    sleep(A_LONG_TIME);
#endif

    /*
     *  Clean up and exit.
     */

    infomsg("parent cleaning up and exiting ...");
    return(0);
}

pid_t start_child_sleep(
)
{
    pid_t pid;
    char buf[ARG_MAX];

    if ((pid=fork()) < 0)
        errormsg("fork() failed: %s", strerror(errno));
    else if (pid > 0)
        return(pid);

    /*
     *  Only the child gets here
     */

    sprintf(buf, "echo \"child running ...\";"
                 "sleep %d;"
                 "echo \"child exiting ...\"",
                 CHILD_RUN_TIME);
    execlp("/bin/sh", "sh", "-c", buf, (char *) NULL);
    errormsg("exec() failed: %s", strerror(errno));
}

Compile and run the program as follows:

lagane$ gcc -o system1 system1.c funcs.c
lagane$ ./system1
0.000000: main: parent starting one child ...
0.000418: main: parent sees child has pid 3778 and waits for it ...
child running ...
child exiting ...
10.002626: main: parent cleaning up and exiting ...
lagane$ 

The points I wanted to illustrate with that example are:

  1. wait() blocks until a child process exits (or until wait() itself is interrupted by the arrival of a signal)
  2. but this blocks main() from doing any other tasks concurrently

In system1.c, comment out pid = wait(&wstatus); and uncomment sleep(A_LONG_TIME); simply by changing the definition of WAIT_INSTEAD_OF_SLEEP from this:

#define WAIT_INSTEAD_OF_SLEEP TRUE

to this:

#define WAIT_INSTEAD_OF_SLEEP FALSE

and then recompile and run the program as follows:

lagane$ gcc -o system1 system1.c funcs.c
lagane$ ./system1 &
[2] 4714
lagane$
0.000000: main: parent starting one child ...
0.000464: main: parent sees child has pid 4715 and waits for it ...
child running ...
child exiting ...

lagane$ 
lagane$ ps -lp 4715
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
0 Z  1000  4715  4714  0  80   0 -     0 -      pts/7    00:00:00 sh 
lagane$ kill 4714
lagane$ 
[2]-  Terminated              ./system1
lagane$ 

The points I wanted to illustrate with that example are:

  1. ps shows the child process’s state is Z (zombie)
  2. a zombie process is a process that has exited but for which some cleanup remains to be done (e.g. collecting its exit status and removing its entry from the process table)
  3. not calling wait() leaves a zombie process; therefore we must call wait()
  4. calling wait() to clear up a zombie process’s leftovers is called reaping
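
Reaping also delivers the child process's exit status. The sketch below shows how that status might be decoded; reap_and_decode() is a made-up helper, and waitpid() is used instead of wait() only so that we reap one specific child process.

```c
#define _POSIX_C_SOURCE 200809L /* request POSIX declarations */
#include <sys/types.h>          /* for pid_t */
#include <sys/wait.h>           /* for waitpid(), WIFEXITED(), ... */
#include <unistd.h>             /* for fork(), _exit() */

/*
 *  Hypothetical helper: fork a child that exits immediately with the
 *  given code, reap it and return the exit code that the parent
 *  observes (or -1 on failure).
 */

int reap_and_decode(
int code)
{
    int wstatus;
    pid_t pid;

    if ((pid=fork()) < 0)
        return(-1);
    else if (pid == 0)
        _exit(code);                /* child: exit without stdio flushing */

    /*
     *  Only the parent gets here
     */

    if (waitpid(pid, &wstatus, 0) != pid)
        return(-1);
    if (WIFEXITED(wstatus))         /* did the child call exit()/_exit()? */
        return(WEXITSTATUS(wstatus));
    return(-1);                     /* e.g. it was killed by a signal */
}
```

For example, reap_and_decode(7) returns 7, and no zombie is left behind because waitpid() has already reaped the child process.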

However, there is no need to call wait() as soon as we launch the child process; instead we can delay calling wait() until we know that a child process has already exited and is reapable. So what should main() do in the meantime? Shortly we will look at a main() doing something more complicated but, for now, let’s just make it loop until the handler has set a global variable to indicate that it has called wait().

Copy and paste this source code into system2.c:

#include <errno.h>              /* for errno() */
#include <linux/limits.h>       /* for ARG_MAX */
#include <signal.h>             /* for signal(), SIG* */
#include <stdio.h>              /* for sprintf(), stderr, ... */
#include <stdlib.h>             /* for exit() */
#include <string.h>             /* for strerror() */
#include <sys/types.h>          /* for wait() */
#include <sys/wait.h>           /* for wait() */
#include <unistd.h>             /* for sleep(), fork() and execlp() */
#include "funcs.h"              /* for infomsg(), errormsg() */

/*
 *  Macros
 */

#define CHILD_RUN_TIME          10

/*
 *  Type and struct definitions
 */

/*
 *  Global variables
 */

int child_reaped = FALSE;

/*
 *  Forward declarations
 */

pid_t start_child_sleep(void);
void sigchld_handler(int);

/*
 *  Functions
 */

int main(
int argc,
char *argv[])
{
    pid_t pid;

    /*
     *  Initialise.
     */

    infomsg("parent setting up signal handlers ...");
    signal(SIGCHLD, sigchld_handler);

    /*
     *  Start child.
     */

    infomsg("parent starting one child ...");
    pid = start_child_sleep();

    /*
     *  Main monitoring loop
     */

    infomsg("parent sees child has pid %d and loops checking flag ...", pid);
    while (TRUE) {

        /*
         *  Exit if no running children.
         */

        if (child_reaped) {
            infomsg("parent sees child_reaped flag and so stops looping ...");
            break;
        }

        /*
         *  Let a little time pass before we reinspect the situation.
         */

        sleep(1);
    }

    /*
     *  Clean up and exit.
     */

    infomsg("parent cleaning up and exiting ...");
    signal(SIGCHLD, SIG_DFL);
    return(0);
}

pid_t start_child_sleep(
)
{
    pid_t pid;
    char buf[ARG_MAX];

    if ((pid=fork()) < 0)
        errormsg("fork() failed: %s", strerror(errno));
    else if (pid > 0)
        return(pid);

    /*
     *  Only the child gets here
     */

    sprintf(buf, "echo \"child running ...\";"
                 "sleep %d;"
                 "echo \"child exiting ...\"",
                 CHILD_RUN_TIME);
    execlp("/bin/sh", "sh", "-c", buf, (char *) NULL);
    errormsg("exec() failed: %s", strerror(errno));
}

void sigchld_handler(
int sig)
{
    int wstatus;
    pid_t pid;

    infomsg("parent received SIGCHLD; reaping and setting child_reaped flag ...");
    pid = wait(&wstatus);
    child_reaped = TRUE;
}

Compile and run the program as follows:

lagane$ gcc -o system2 system2.c funcs.c
lagane$ ./system2
0.000000: main: parent setting up signal handlers ...
0.000319: main: parent starting one child ...
0.000576: main: parent sees child has pid 6034 and loops checking flag ...
child running ...
child exiting ...
10.003294: sigchld_handler: parent received SIGCHLD; reaping and setting child_reaped flag ...
10.003330: main: parent sees child_reaped flag and so stops looping ...
10.003335: main: parent cleaning up and exiting ...
lagane$ 

The points I wanted to illustrate with that example are:

  1. signal(SIGCHLD, sigchld_handler) means “call sigchld_handler() whenever the OS informs us that a child process has just exited”
  2. because we call wait() only when we know that the child process has just exited and is reapable, wait() returns immediately
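
One caveat that the example above does not exercise: standard signals are not queued, so if several child processes exit at almost the same moment, the handler may be called only once. A common way to cope (sketched below under the assumption that the handler is free to reap every child of the process) is to loop calling waitpid() with WNOHANG until nothing is left to reap. The names reaped_count, reaping_sigchld_handler() and reap_all_demo() are invented for this illustration.

```c
#define _POSIX_C_SOURCE 200809L /* request POSIX declarations */
#include <errno.h>              /* for errno */
#include <signal.h>             /* for sigaction(), sig_atomic_t */
#include <string.h>             /* for memset() */
#include <sys/types.h>          /* for pid_t */
#include <sys/wait.h>           /* for waitpid(), WNOHANG */
#include <time.h>               /* for nanosleep() */
#include <unistd.h>             /* for fork(), _exit() */

volatile sig_atomic_t reaped_count = 0;

void reaping_sigchld_handler(
int sig)
{
    int saved_errno = errno;        /* waitpid() may change errno */

    (void) sig;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_count++;             /* one SIGCHLD may stand for several exits */
    errno = saved_errno;
}

/*
 *  Start nchildren short-lived children and return how many of them
 *  the handler reaped (giving up after roughly five seconds).
 */

int reap_all_demo(
int nchildren)
{
    struct sigaction sa, old_sa;
    struct timespec tick = { 0, 10000000 };     /* 10ms */
    int i, tries;

    reaped_count = 0;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = reaping_sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGCHLD, &sa, &old_sa);

    for (i = 0; i < nchildren; i++)
        if (fork() == 0)
            _exit(0);               /* child exits straight away */

    for (tries = 0; reaped_count < nchildren && tries < 500; tries++)
        nanosleep(&tick, NULL);     /* may return early if interrupted */

    sigaction(SIGCHLD, &old_sa, NULL);
    return(reaped_count);
}
```

WNOHANG makes waitpid() return 0 instead of blocking when child processes exist but none has exited yet, which is what lets the loop terminate.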

Old-school signal processing in C: system() with timeouts

Now we combine both system2.c and timeout.c to monitor a child process and to kill it if it runs for longer than a specified timeout.

Copy and paste this source code into system-with-timeout.c:

#include <errno.h>              /* for errno() */
#include <linux/limits.h>       /* for ARG_MAX */
#include <signal.h>             /* for signal(), SIG* */
#include <stdio.h>              /* for sprintf(), stderr, ... */
#include <stdlib.h>             /* for exit() */
#include <string.h>             /* for strerror() */
#include <sys/types.h>          /* for wait() */
#include <sys/wait.h>           /* for wait() */
#include <unistd.h>             /* for sleep(), fork() and execlp() */
#include "funcs.h"              /* for infomsg(), errormsg() */

/*
 *  Macros
 */

#define TIMEOUT                 15
#define A_LONG_TIME             3600
#define CHILD_RUN_TIME          10

/*
 *  Type and struct definitions
 */

/*
 *  Global variables
 */

int child_reaped = FALSE;
int timed_out = FALSE;

/*
 *  Forward declarations
 */

pid_t start_child_sleep(void);
void sigchld_handler(int);
void sigalrm_handler(int);

/*
 *  Functions
 */

int main(
int argc,
char *argv[])
{
    pid_t pid;

    /*
     *  Initialise.
     */

    infomsg("parent setting up signal handlers ...");
    signal(SIGCHLD, sigchld_handler);
    signal(SIGALRM, sigalrm_handler);

    /*
     *  Start child.
     */

    infomsg("parent starting one child ...");
    pid = start_child_sleep();

    /*
     *  Schedule timeout alarm.
     */

    infomsg("parent scheduling timeout alarm ...");
    alarm(TIMEOUT);

    /*
     *  Main monitoring loop
     */

    infomsg("parent entering monitoring loop ...");
    while (TRUE) {

        /*
         *  Exit if no running children.
         */

        if (child_reaped) {
            if (!timed_out)
                alarm(0);
            infomsg("parent sees child_reaped flag and so stops looping ...");
            break;
        }

        /*
         *  Kill child if it has reached its timeout time.
         */

        if (timed_out) {
            infomsg("parent sees timed_out flag and so kills child ...");
            kill(pid, SIGTERM);
        }

        /*
         *  Sleep until next signal arrives
         */

        infomsg("parent sleeping until signal arrives ...");
        sleep(A_LONG_TIME);

        /*
         *  If SIGCHLD arrived before SIGALRM then the alarm is still
         *  pending. Cancel it.
         */

        infomsg("parent cancelling alarm ...");
        alarm(0);
    }

    /*
     *  Clean up and exit.
     */

    infomsg("parent cleaning up and exiting ...");
    signal(SIGCHLD, SIG_DFL);
    signal(SIGALRM, SIG_DFL);
    return(0);
}

pid_t start_child_sleep(
)
{
    pid_t pid;
    char buf[ARG_MAX];

    if ((pid=fork()) < 0)
        errormsg("fork() failed: %s", strerror(errno));
    else if (pid > 0)
        return(pid);

    /*
     *  Only the child gets here
     */

    sprintf(buf, "echo \"child running ...\";"
                 "sleep %d;"
                 "echo \"child exiting ...\"",
                 CHILD_RUN_TIME);
    execlp("/bin/sh", "sh", "-c", buf, (char *) NULL);
    errormsg("exec() failed: %s", strerror(errno));
}

void sigchld_handler(
int sig)
{
    int wstatus;
    pid_t pid;

    infomsg("parent received SIGCHLD; reaping and setting child_reaped flag ...");
    pid = wait(&wstatus);
    child_reaped = TRUE;
}

void sigalrm_handler(
int sig)
{
    infomsg("parent received SIGALRM; setting timed_out flag ...");
    timed_out = TRUE;
}

Note that the child process will run for 10s; that’s this bit:

#define CHILD_RUN_TIME  10
...
sprintf(buf, "echo \"child running ...\";"
             "sleep %d;"
             "echo \"child exiting ...\"",
             CHILD_RUN_TIME);
execlp("/bin/sh", "sh", "-c", buf, (char *) NULL);

and the timeout is 15s; that’s this bit:

#define TIMEOUT         15
...
alarm(TIMEOUT);

Compile and run the program as follows:

lagane$ gcc -o system-with-timeout system-with-timeout.c funcs.c
lagane$ ./system-with-timeout
0.000000: main: parent setting up signal handlers ...
0.000284: main: parent starting one child ...
0.000484: main: parent scheduling timeout alarm ...
0.000620: main: parent entering monitoring loop ...
0.000726: main: parent sleeping until signal arrives ...
child running ...
child exiting ...
10.003069: sigchld_handler: parent received SIGCHLD; reaping and setting child_reaped flag ...
10.003111: main: parent cancelling alarm ...
10.003119: main: parent sees child_reaped flag and so stops looping ...
10.003124: main: parent cleaning up and exiting ...
lagane$ 

The points I wanted to illustrate with that example are:

  1. the child process exits normally
  2. the SIGCHLD caused by the child process exiting arrived before the SIGALRM caused by the timeout would have done
  3. when the child process exits then the alarm, which is still scheduled, is no longer needed and we cancel it by calling alarm(0)
  4. the monitoring loop has to do something; we make it call sleep(A_LONG_TIME) but what we really mean is “do nothing until a signal (SIGCHLD or SIGALRM) arrives”; remember: sleep() is interrupted if a signal arrives
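
Regarding point 4: the system call whose whole purpose is “do nothing until a signal arrives” is pause(). A minimal sketch, with the hypothetical names alarm_fired, noting_sigalrm_handler() and pause_demo():

```c
#define _POSIX_C_SOURCE 200809L /* request POSIX declarations */
#include <errno.h>              /* for errno, EINTR */
#include <signal.h>             /* for sigaction(), sig_atomic_t */
#include <string.h>             /* for memset() */
#include <unistd.h>             /* for alarm(), pause() */

volatile sig_atomic_t alarm_fired = 0;

void noting_sigalrm_handler(
int sig)
{
    (void) sig;
    alarm_fired = 1;
}

/*
 *  Schedule an alarm (secs must be greater than zero) and then do
 *  nothing until a signal arrives; return non-zero if pause() was
 *  interrupted and our SIGALRM handler ran.
 */

int pause_demo(
unsigned int secs)
{
    struct sigaction sa, old_sa;
    int rc, saved_errno;

    alarm_fired = 0;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = noting_sigalrm_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, &old_sa);

    alarm(secs);
    rc = pause();                   /* returns only after a handler has run */
    saved_errno = errno;

    alarm(0);
    sigaction(SIGALRM, &old_sa, NULL);
    return(rc == -1 && saved_errno == EINTR && alarm_fired);
}
```

pause_demo(1) returns after about one second. Note that swapping sleep(A_LONG_TIME) for pause() would not by itself close the window between checking the flags and starting to wait: a signal that arrives in that window is handled before the wait begins, and so does not interrupt it.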

Change the child run time to 20s by setting:

#define CHILD_RUN_TIME 20

and recompile and run the program as follows:

lagane$ gcc -o system-with-timeout system-with-timeout.c funcs.c
lagane$ ./system-with-timeout
0.000000: main: parent setting up signal handlers ...
0.000325: main: parent starting one child ...
0.000614: main: parent scheduling timeout alarm ...
0.000820: main: parent entering monitoring loop ...
0.000987: main: parent sleeping until signal arrives ...
child running ...
15.001000: sigalrm_handler: parent received SIGALRM; setting timed_out flag ...
15.001051: main: parent cancelling alarm ...
15.001058: main: parent sees timed_out flag and so kills child ...
15.001073: main: parent sleeping until signal arrives ...
15.001476: sigchld_handler: parent received SIGCHLD; reaping and setting child_reaped flag ...
15.001503: main: parent cancelling alarm ...
15.001509: main: parent sees child_reaped flag and so stops looping ...
15.001513: main: parent cleaning up and exiting ...
lagane$ 

The points I wanted to illustrate with that example are:

  1. this time the SIGALRM caused by the timeout arrived before the SIGCHLD caused by the child process exiting
  2. so the program kills the child process
  3. just because we kill the child process does not relieve us of the responsibility to wait() for the child process
  4. furthermore, that applies regardless of whether we “ask” the child process to commit suicide with kill(pid, SIGTERM) or we really kill it with kill(pid, SIGKILL).
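
Points 3 and 4 can be checked in isolation. This is a sketch only: kill_and_reap() is a hypothetical helper, not part of system-with-timeout.c, and it assumes that the signal being sent is neither ignored nor blocked when the function is called (the child inherits both settings from the parent).

```c
#include <assert.h>     /* for assert() */
#include <signal.h>     /* for kill(), SIGTERM, SIGKILL */
#include <sys/types.h>  /* for pid_t */
#include <sys/wait.h>   /* for waitpid(), WIFSIGNALED(), WTERMSIG() */
#include <unistd.h>     /* for fork(), pause() */

/*
 *  Start a child that does nothing, send it sig and then reap it.
 *  Returns the number of the signal that terminated the child,
 *  0 if it ended some other way, or -1 on error.
 */
int kill_and_reap(int sig)
{
    pid_t pid;
    int wstatus;

    assert(sig > 0);
    if ((pid = fork()) < 0)
        return -1;
    if (pid == 0)
        for (;;)
            pause();            /* child: wait to be killed */

    if (kill(pid, sig) < 0)
        return -1;

    /* without this waitpid() the dead child would remain a zombie */
    if (waitpid(pid, &wstatus, 0) != pid)
        return -1;
    return WIFSIGNALED(wstatus) ? WTERMSIG(wstatus) : 0;
}
```

Whether the child is sent SIGTERM or SIGKILL, the parent still has to call waitpid(); the only difference it sees is the signal number reported by WTERMSIG().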

Old-school signal processing in C: system() with timeouts and multi-child support

The previous example used two global variables to record the state of one child process (has it timed out? has it been reaped?). If we want to launch more child processes in parallel then two global variables are not going to be enough. Instead we define a structure and then allocate an array of that structure to store information about some arbitrary number of child processes:

#define MAX_CHILDREN    1000
...
struct child {
    pid_t pid;
    time_t start;
};
...
struct child children[MAX_CHILDREN];

Note that instead of recording whether a process has timed out, we could just record its start time. This would mean:

  • we would determine if a process has timed out by comparing the current time with the process’s start time
  • this would allow us to start different child processes at different times and we would still be able to determine which had timed out and which had not timed out
  • when we determined that a child process had timed out then we could send it SIGTERM and set its start time to zero; this way we would know not to repeatedly signal it

A complication is that we can’t schedule one alarm per running child process because there is only one alarm clock per process. So we need to work out the interval after which the next-to-time-out child process will time out and set the alarm for that interval.

Another complication is that alarm() takes an integer argument so if we want it to schedule the SIGALRM signal to arrive in 0.5s then we have a problem. The solution presented here is that we work entirely with integer times (except when displaying timestamps for debugging).
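
The "next-to-time-out" calculation can be sketched as a standalone function working in whole seconds. The name seconds_until_next_timeout() and its parameters are invented for this illustration, and it makes one assumption that the listing below does not: an entry that is already overdue is reported as 1s rather than as zero or a negative number.

```c
#include <assert.h>     /* for assert() */
#include <time.h>       /* for time_t */

/*
 *  Return the number of whole seconds until the earliest timeout among
 *  the n start times in starts[], or 0 if no entry is awaiting a timeout.
 *  A start time of 0 means "slot unused or child already signalled".
 */
unsigned int seconds_until_next_timeout(const time_t *starts, int n,
                                        time_t now, int timeout)
{
    long best = 0, remaining;
    int i;

    assert(timeout > 0);
    for (i = 0; i < n; i++) {
        if (starts[i] == 0)
            continue;
        remaining = (long) (starts[i] + timeout - now);
        if (remaining < 1)
            remaining = 1;      /* assumption: overdue means "as soon as possible" */
        if (best == 0 || remaining < best)
            best = remaining;
    }
    return (unsigned int) best;
}
```

A non-zero result can be passed straight to alarm(); a result of 0 means there is nothing to schedule.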

Copy and paste this source code into system-with-timeout-and-multichild-support1.c:

#include <errno.h>              /* for errno */
#include <linux/limits.h>       /* for ARG_MAX */
#include <signal.h>             /* for signal(), SIG* */
#include <stdio.h>              /* for sprintf(), stderr, ... */
#include <stdlib.h>             /* for exit() */
#include <string.h>             /* for strerror() */
#include <sys/types.h>          /* for wait() */
#include <sys/wait.h>           /* for wait() */
#include <time.h>               /* for time() */
#include <unistd.h>             /* for sleep(), fork() and execlp() */
#include "funcs.h"              /* for infomsg(), errormsg() */

/*
 *  Macros
 */

#define MAX_CHILDREN            1000
#define CHILDREN                5
#define TIMEOUT                 30
#define A_LONG_TIME             3600
#define CHILD_RUN_TIME(n)       n
#define WAITPID_LOOP            FALSE

/*
 *  Type and struct definitions
 */

struct child {
    pid_t  pid;
    time_t start;
};

/*
 *  Global variables
 */

struct child children[MAX_CHILDREN];

/*
 *  Forward declarations
 */

pid_t start_child_sleep(int);
void sigchld_handler(int);
void sigalrm_handler(int);

/*
 *  Functions
 */

int main(
int argc,
char *argv[])
{
    int i;
    int running_children_count, next_timeout, killed_something;
    time_t now;

    /*
     *  Initialise.
     */

    infomsg("parent initialising children status table ...");
    for (i=0; i<MAX_CHILDREN; i++)
        children[i].pid = 0;

    infomsg("parent setting up signal handlers ...");
    signal(SIGCHLD, sigchld_handler);
    signal(SIGALRM, sigalrm_handler);

    /*
     *  Start children.
     */

    infomsg("parent starting %d children ...", CHILDREN);
    for (i=0; i<CHILDREN; i++)
        start_child_sleep(CHILD_RUN_TIME(i));

    /*
     *  Main monitoring loop
     */

    infomsg("parent entering monitoring loop ...");
    while (TRUE) {
        now = time(NULL);

        /*
         *  Exit if no running children.
         */

        infomsg("parent checking for running children ...");
        running_children_count = 0;
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid != 0)
                running_children_count++;
        infomsg("parent sees %d children still running", running_children_count);
        if (running_children_count == 0)
            break;

        /*
         *  Kill any children that have reached their timeout time and not
         *  been killed already.
         */

        killed_something = FALSE;
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid != 0 && children[i].start != 0 &&
                    now >= children[i].start+TIMEOUT) {
                kill(children[i].pid, SIGTERM);
                children[i].start = 0;
                killed_something = TRUE;
            }

        /*
         *  Slight optimisation: if something did reach its timeout and got
         *  killed then skip to reassessing if this program can exit.
         */

        if (killed_something)
            continue;

        /*
         *  Schedule timeout alarm.
         */

        infomsg("parent scheduling timeout alarm ...");
        next_timeout = 0;
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid != 0 && children[i].start != 0)
                if (next_timeout == 0)
                    next_timeout = children[i].start+TIMEOUT - now;
                else if (children[i].start+TIMEOUT - now < next_timeout)
                    next_timeout = children[i].start+TIMEOUT - now;
        if (next_timeout >= 1) {
            infomsg("parent scheduling alarm for %ds ...", next_timeout);
            alarm(next_timeout);
        }

        /*
         *  Sleep until next signal arrives
         *  (unless timeout is so close that we didn't schedule an alarm
         *  so sleeping might not be interrupted).
         */

        infomsg("parent sleeping until signal arrives ...");
        if (next_timeout >= 1)
            sleep(A_LONG_TIME);

        /*
         *  If SIGCHLD arrived before SIGALRM then the alarm is still
         *  pending. Cancel it.
         */

        infomsg("parent cancelling alarm ...");
        alarm(0);
    }

    /*
     *  Clean up and exit.
     */

    infomsg("parent cleaning up and exiting ...");
    signal(SIGCHLD, SIG_DFL);
    signal(SIGALRM, SIG_DFL);
    return(0);
}

pid_t start_child_sleep(
int period)
{
    pid_t pid;
    char buf[ARG_MAX];
    int i;

    /*
     *  Find an empty slot to store info about the process we're
     *  about to launch.
     */

    for (i=0; i<MAX_CHILDREN; i++)
        if (children[i].pid == 0)
            break;
    if (i == MAX_CHILDREN)
        errormsg("start_child_sleep: unable to find a free slot");

    /*
     *  Launch a child process and note its pid and start time
     *  in the empty slot.
     */

    sprintf(buf, "echo \"child $$ started ...\";"
                 "sleep %d;"
                 "echo \"child $$ exiting ...\"",
                 period);
    if ((pid=fork()) < 0)
        errormsg("fork() failed: %s", strerror(errno));
    else if (pid > 0) {
        children[i].pid = pid;
        children[i].start = time(NULL);
        return(pid);
    }

    /*
     *  Only the child gets here
     */

    execlp("/bin/sh", "sh", "-c", buf, (char *) NULL);
    errormsg("exec() failed: %s", strerror(errno));
}

void sigchld_handler(
int sig)
{
    int wstatus;
    pid_t pid;
    int i;

#if WAITPID_LOOP
    while ((pid=waitpid(-1, &wstatus, WNOHANG)) > 0) {
        infomsg("parent received SIGCHLD; reaping and clearing child data ...");
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid == pid) {
                children[i].pid = 0;
                children[i].start = 0;
                break;
            }
    }
#else
    infomsg("parent received SIGCHLD; reaping and clearing child data ...");
    pid = wait(&wstatus);
    for (i=0; i<MAX_CHILDREN; i++)
       if (children[i].pid == pid) {
           children[i].pid = 0;
           children[i].start = 0;
           break;
       }
#endif
}

void sigalrm_handler(
int sig)
{
   infomsg("parent received SIGALRM");
}

Compile and run the program as follows:

lagane$ gcc -o system-with-timeout-and-multichild-support1 \
        system-with-timeout-and-multichild-support1.c funcs.o
lagane$ ./system-with-timeout-and-multichild-support1
0.000000: main: parent initialising children status table ...
0.000377: main: parent setting up signal handlers ...
0.000543: main: parent starting 5 children ...
0.001276: main: parent entering monitoring loop ...
0.001543: main: parent checking for running children ...
0.001717: main: parent sees 5 children still running
0.001884: main: parent scheduling timeout alarm ...
0.002092: main: parent scheduling alarm for 30s ...
0.002285: main: parent sleeping until signal arrives ...
child 8123 started ...
child 8122 started ...
child 8120 started ...
child 8121 started ...
child 8119 started ...
child 8119 exiting ...
0.010015: sigchld_handler: parent received SIGCHLD; reaping and clearing child data ...
0.010048: main: parent cancelling alarm ...
0.010055: main: parent checking for running children ...
0.010063: main: parent sees 4 children still running
0.010071: main: parent scheduling timeout alarm ...
0.010079: main: parent scheduling alarm for 30s ...
0.010084: main: parent sleeping until signal arrives ...
child 8120 exiting ...
1.009713: sigchld_handler: parent received SIGCHLD; reaping and clearing child data ...
1.009738: main: parent cancelling alarm ...
1.009745: main: parent checking for running children ...
1.009756: main: parent sees 3 children still running
1.009766: main: parent scheduling timeout alarm ...
1.009775: main: parent scheduling alarm for 29s ...
1.009781: main: parent sleeping until signal arrives ...
child 8121 exiting ...
2.009266: sigchld_handler: parent received SIGCHLD; reaping and clearing child data ...
2.009291: main: parent cancelling alarm ...
2.009331: main: parent checking for running children ...
2.009343: main: parent sees 2 children still running
2.009353: main: parent scheduling timeout alarm ...
2.009362: main: parent scheduling alarm for 28s ...
2.009369: main: parent sleeping until signal arrives ...
child 8122 exiting ...
3.008095: sigchld_handler: parent received SIGCHLD; reaping and clearing child data ...
3.008121: main: parent cancelling alarm ...
3.008128: main: parent checking for running children ...
3.008138: main: parent sees 1 children still running
3.008147: main: parent scheduling timeout alarm ...
3.008157: main: parent scheduling alarm for 27s ...
3.008163: main: parent sleeping until signal arrives ...
child 8123 exiting ...
4.008600: sigchld_handler: parent received SIGCHLD; reaping and clearing child data ...
4.008626: main: parent cancelling alarm ...
4.008633: main: parent checking for running children ...
4.008644: main: parent sees 0 children still running
4.008648: main: parent cleaning up and exiting ...
lagane$ 

The points I wanted to illustrate with that example are:

  1. note that the child process’s run time is a number of seconds equal to the child process’s index; this is implemented with the code:
    #define CHILD_RUN_TIME(n) n
  2. so each of the 5 child processes (indexed 0, 1, 2, 3, 4) runs for a different amount of time (0s, 1s, 2s, 3s, 4s)
  3. so each of the SIGCHLD signals arrives at a different time (~0s, ~1s, ~2s, ~3s, ~4s)
  4. with a ~1s interval between the signals there is ample time to process each signal before the next signal arrives

Old-school signal processing in C: but now it starts to go wrong

In system-with-timeout-and-multichild-support1.c, change this line:

#define CHILD_RUN_TIME(n) n

to this:

#define CHILD_RUN_TIME(n) 1

i.e. all 5 child processes should exit simultaneously after 1s.

Recompile and run the program as follows:

lagane$ gcc -o system-with-timeout-and-multichild-support1 \
        system-with-timeout-and-multichild-support1.c funcs.o
lagane$ ./system-with-timeout-and-multichild-support1
...

If we are lucky then the program exits ~1s later. If we are unlucky then it will do something like this:

lagane$ ./system-with-timeout-and-multichild-support1
0.000000: main: parent initialising children status table ...
0.000330: main: parent setting up signal handlers ...
0.000500: main: parent starting 5 children ...
0.001201: main: parent entering monitoring loop ...
0.001403: main: parent checking for running children ...
0.001595: main: parent sees 5 children still running
0.001759: main: parent scheduling timeout alarm ...
0.001951: main: parent scheduling alarm for 30s ...
0.002126: main: parent sleeping until signal arrives ...
child 9015 started ...
child 9014 started ...
child 9013 started ...
child 9012 started ...
child 9011 started ...
child 9013 exiting ...
1.010212: sigchld_handler: parent received SIGCHLD; reaping and clearing child data ...
1.010250: main: parent cancelling alarm ...
1.010257: main: parent checking for running children ...
1.010267: main: parent sees 4 children still running
1.010277: main: parent scheduling timeout alarm ...
1.010286: main: parent scheduling alarm for 29s ...
1.010292: main: parent sleeping until signal arrives ...
child 9014 exiting ...
child 9012 exiting ...
child 9011 exiting ...
child 9015 exiting ...
1.011580: sigchld_handler: parent received SIGCHLD; reaping and clearing child data ...
1.011597: main: parent cancelling alarm ...
1.011603: main: parent checking for running children ...
1.011612: main: parent sees 3 children still running
1.011622: main: parent scheduling timeout alarm ...
1.011631: main: parent scheduling alarm for 29s ...
1.011637: main: parent sleeping until signal arrives ...
<several seconds go by with no output>
^C
lagane$

The points I wanted to illustrate with that example are:

  1. all the child processes exited ~1s after the program started (we know this because there are five “child ... exiting” messages)
  2. but the program thinks that some children are still running (we know this because of the “parent sees 3 children still running” message)
  3. so something has gone wrong!

Old-school signal processing in C: but we can work around that

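The problem is that the child processes exit so close to each other that the OS interrupts the program only once: standard signals such as SIGCHLD are not queued, so several SIGCHLDs that are pending at the same time collapse into one. The OS therefore expects a single call to the signal handler to deal with all of the child processes that have exited.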

The workaround is pretty simple: a single call to the signal handler should reap all reapable child processes. wait() is not sophisticated enough to support doing this, but waitpid() is.

In system-with-timeout-and-multichild-support1.c, replace sigchld_handler()’s logic with a loop by changing this:

#define WAITPID_LOOP FALSE

to this:

#define WAITPID_LOOP TRUE

Recompile and run the program as above. This time it should always exit after ~1s.

Old-school signal processing in C: but increase the concurrency and it goes wrong again

So now we’re going to:

  1. increase the number of children from 5 to 500
  2. keep them all exiting after 1s
  3. add a SIGUSR1-triggered dump of the child processes’ statuses so that we can inspect the statuses if things look like they’ve gone wrong
  4. increase the timeout to 5 minutes in order to give us a little more time to inspect things (but note the child processes’ execution time is much shorter than this timeout so this change is not functionally significant)

Copy and paste this source code into system-with-timeout-and-multichild-support2.c:

#include <errno.h>              /* for errno */
#include <linux/limits.h>       /* for ARG_MAX */
#include <signal.h>             /* for signal(), SIG* */
#include <stdio.h>              /* for sprintf(), stderr, ... */
#include <stdlib.h>             /* for exit() */
#include <string.h>             /* for strerror() */
#include <sys/types.h>          /* for wait() */
#include <sys/wait.h>           /* for wait() */
#include <time.h>               /* for time() */
#include <unistd.h>             /* for sleep(), fork() and execlp() */
#include "funcs.h"              /* for infomsg(), errormsg() */

/*
 *  Macros
 */

#define MAX_CHILDREN            1000
#define CHILDREN                500
#define TIMEOUT                 300
#define A_LONG_TIME             3600
#define CHILD_RUN_TIME(n)       1

/*
 *  Type and struct definitions
 */

struct child {
    pid_t  pid;
    time_t start;
};

/*
 *  Global variables
 */

struct child children[MAX_CHILDREN];

/*
 *  Forward declarations
 */

pid_t start_child_sleep(int);
void sigchld_handler(int);
void sigalrm_handler(int);
void sigusr1_handler(int);

/*
 *  Functions
 */

int main(
int argc,
char *argv[])
{
    int i;
    int running_children_count, next_timeout, killed_something;
    time_t now;

    /*
     *  Initialise.
     */

    infomsg("parent initialising children status table ...");
    for (i=0; i<MAX_CHILDREN; i++)
        children[i].pid = 0;

    infomsg("parent setting up signal handlers ...");
    signal(SIGCHLD, sigchld_handler);
    signal(SIGALRM, sigalrm_handler);
    signal(SIGUSR1, sigusr1_handler);

    /*
     *  Start children.
     */

    infomsg("parent starting %d children ...", CHILDREN);
    for (i=0; i<CHILDREN; i++)
        start_child_sleep(CHILD_RUN_TIME(i));

    /*
     *  Main monitoring loop
     */

    infomsg("parent entering monitoring loop ...");
    while (TRUE) {
        now = time(NULL);

        /*
         *  Exit if no running children.
         */

        infomsg("parent checking for running children ...");
        running_children_count = 0;
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid != 0)
                running_children_count++;
        infomsg("parent sees %d children still running", running_children_count);
        if (running_children_count == 0)
            break;

        /*
         *  Kill any children that have reached their timeout time and not
         *  been killed already.
         */

        killed_something = FALSE;
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid != 0 && children[i].start != 0 &&
                    now >= children[i].start+TIMEOUT) {
                kill(children[i].pid, SIGTERM);
                children[i].start = 0;
                killed_something = TRUE;
            }

        /*
         *  Slight optimisation: if something did reach its timeout and got
         *  killed then skip to reassessing if this program can exit.
         */

        if (killed_something)
            continue;

        /*
         *  Schedule timeout alarm of next-to-timeout child.
         */

        infomsg("parent scheduling timeout alarm ...");
        next_timeout = 0;
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid != 0 && children[i].start != 0)
                if (next_timeout == 0)
                    next_timeout = children[i].start+TIMEOUT - now;
                else if (children[i].start+TIMEOUT - now < next_timeout)
                    next_timeout = children[i].start+TIMEOUT - now;
        if (next_timeout >= 1) {
            infomsg("parent scheduling alarm for %ds ...", next_timeout);
            alarm(next_timeout);
        }

        /*
         *  Sleep until next signal arrives
         *  (unless timeout is so close that we didn't schedule an alarm
         *  so sleeping might not be interrupted).
         */

        infomsg("parent sleeping until signal arrives ...");
        if (next_timeout >= 1)
            sleep(A_LONG_TIME);

        /*
         *  If SIGCHLD arrived before SIGALRM then the alarm is still
         *  pending. Cancel it.
         */

        infomsg("parent cancelling alarm ...");
        alarm(0);
    }

    /*
     *  Clean up and exit.
     */

    infomsg("parent cleaning up and exiting ...");
    signal(SIGCHLD, SIG_DFL);
    signal(SIGALRM, SIG_DFL);
    signal(SIGUSR1, SIG_DFL);
    return(0);
}

pid_t start_child_sleep(
int period)
{
    pid_t pid;
    char buf[ARG_MAX];
    int i;

    /*
     *  Find an empty slot to store info about the process we're
     *  about to launch.
     */

    for (i=0; i<MAX_CHILDREN; i++)
        if (children[i].pid == 0)
            break;
    if (i == MAX_CHILDREN)
        errormsg("start_child_sleep: unable to find a free slot");

    /*
     *  Launch a child process and note its pid and start time
     *  in the empty slot.
     */

    sprintf(buf, "echo \"child $$ started ...\";"
                 "sleep %d;"
                 "echo \"child $$ exiting ...\"",
                 period);
    if ((pid=fork()) < 0)
        errormsg("fork() failed: %s", strerror(errno));
    else if (pid > 0) {
        children[i].pid = pid;
        children[i].start = time(NULL);
        return(pid);
    }

    /*
     *  Only the child gets here
     */

    execlp("/bin/sh", "sh", "-c", buf, (char *) NULL);
    errormsg("exec() failed: %s", strerror(errno));
}

void sigchld_handler(
int sig)
{
    int wstatus;
    pid_t pid;
    int i;

    while ((pid=waitpid(-1, &wstatus, WNOHANG)) > 0) {
        infomsg("parent received SIGCHLD; reaping and clearing child data ...");
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid == pid) {
                children[i].pid = 0;
                children[i].start = 0;
                break;
            }
    }
}

void sigalrm_handler(
int sig)
{
   infomsg("parent received SIGALRM");
}

void sigusr1_handler(
int sig)
{
   int i;

   finfomsg(stderr, "parent received SIGUSR1");
   for (i=0; i<MAX_CHILDREN; i++)
       if (children[i].pid != 0)
           finfomsg(stderr, "slot:%04d; pid:%05d, start=%ld", 
                    i, children[i].pid, children[i].start);
}

Compile and run the program as follows:

lagane$ gcc -o system-with-timeout-and-multichild-support2 \
        system-with-timeout-and-multichild-support2.c funcs.o
lagane$ ./system-with-timeout-and-multichild-support2 >/dev/null
...

If we are lucky, which we usually are, then the program exits ~1s later. If we are unlucky then it does not. We will now use a wrapper script to run it repeatedly until we are unlucky.

Copy and paste this source code into hanger.sh:

#!/bin/bash

[ $# = 1 -a -x "$1" ] || { echo "Usage: ${0##*/} <script>" >&2; exit 1; }
SCRIPT=$1

PATH=$PATH:.
I=0
while :; do
    ((I++))
    $SCRIPT >/dev/null &
    PID=$!
    sleep 3
    if [ -d /proc/$PID ]; then
        echo "$SCRIPT (pid $PID) hung on ${I}th attempt; children dump follows ..."
        kill -USR1 $PID
        break
    fi
done

Install and run the script as follows:

lagane$ cat hanger.sh >hanger
lagane$ chmod a+x hanger
lagane$ hanger system-with-timeout-and-multichild-support2

Soon I got this:

...
system-with-timeout-and-multichild-support2 (pid 24829) hung on 4th attempt; children dump follows ...
2.998520: sigusr1_handler: parent received SIGUSR1
2.998706: sigusr1_handler: slot:481; pid:07244, start=1632241800
2.998710: sigusr1_handler: slot:482; pid:07245, start=1632241800
2.998714: sigusr1_handler: slot:483; pid:07246, start=1632241800
2.998719: sigusr1_handler: slot:484; pid:07247, start=1632241800
2.998723: sigusr1_handler: slot:485; pid:07248, start=1632241800
2.998727: sigusr1_handler: slot:486; pid:07249, start=1632241800
2.998731: sigusr1_handler: slot:487; pid:07250, start=1632241800
2.998736: sigusr1_handler: slot:488; pid:07251, start=1632241800
2.998740: sigusr1_handler: slot:489; pid:07252, start=1632241800
2.998744: sigusr1_handler: slot:490; pid:07253, start=1632241800
2.998749: sigusr1_handler: slot:491; pid:07254, start=1632241800
2.998753: sigusr1_handler: slot:492; pid:07255, start=1632241800
2.998757: sigusr1_handler: slot:493; pid:07256, start=1632241800
2.998761: sigusr1_handler: slot:494; pid:07257, start=1632241800
2.998766: sigusr1_handler: slot:495; pid:07258, start=1632241800
2.998770: sigusr1_handler: slot:496; pid:07259, start=1632241800
2.998775: sigusr1_handler: slot:497; pid:07260, start=1632241800
2.998779: sigusr1_handler: slot:498; pid:07261, start=1632241800
2.998784: sigusr1_handler: slot:499; pid:07262, start=1632241800
lagane$

On another occasion I got this:

...
system-with-timeout-and-multichild-support2 (pid 21121) hung on 24th attempt; children dump follows ...
2.998824: sigusr1_handler: parent received SIGUSR1 
lagane$

The points I wanted to illustrate with these examples are:

  1. the reality is that ~1s after starting the program all child processes have exited
  2. in the first output above children[] does not reflect this
  3. in the second output above children[] does reflect this but the program still failed to exit at the right time
  4. in both outputs sending SIGUSR1 causes the program to exit!
  5. both outputs are caused by a race condition bug

The race condition needs a bit of explanation. Let’s imagine that the execution of the program has proceeded to the point where only one child process – let’s call it child#499 – is still running and it is about to exit, and main() is calling sleep(A_LONG_TIME). Then:

  1. child#499 exits
  2. the OS sends SIGCHLD to system-with-timeout-and-multichild-support2 to inform it that child#499 has exited
  3. main()’s call to sleep(A_LONG_TIME) is interrupted by the arrival of SIGCHLD
  4. asynchronously, sigchld_handler() is called to handle the signal
  5. meanwhile, main() continues!
  6. main() calls alarm(0) to clear the pending alarm and jumps to the top of the loop
  7. sigchld_handler() calls waitpid() to reap the just-exited child process and to determine its PID – let’s call it PID#499
  8. main() starts searching through children[], to see if any child processes are still marked as running
  9. sigchld_handler() starts searching through children[], looking for the entry pertaining to PID#499 (it doesn’t know that it’s in children[499].pid yet)
  10. main() finds one entry regarding a running pid in children[499].pid
  11. sigchld_handler() finds PID#499 in children[499].pid
  12. sigchld_handler() sets children[499].pid = 0 to indicate that child#499 has exited
  13. main() calls sleep(A_LONG_TIME) even though all children have exited and the children[] array indicates that!

It should be clear that the problem is due to main() and sigchld_handler() accessing children[] concurrently.
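
The outcome of that sequence (main() going to sleep although the shared state says there is nothing left to wait for) can be reproduced deterministically. The sketch below is an illustration under stated assumptions, not a trace of the real program: SIGUSR1 and the work_done flag stand in for SIGCHLD and children[], lost_wakeup_demo() and on_sigusr1() are invented names, and raise() is used to force the handler to run in the gap between main()’s check and its call to sleep().

```c
#include <assert.h>     /* for assert() */
#include <signal.h>     /* for signal(), raise(), SIGUSR1, sig_atomic_t */
#include <unistd.h>     /* for sleep() */

static volatile sig_atomic_t work_done = 0;

/* hypothetical handler: stands in for "children[] is now empty" */
static void on_sigusr1(int sig)
{
    (void) sig;
    work_done = 1;
}

/*
 *  Check-then-sleep, with the signal forced into the gap between the
 *  check and the sleep. Returns 1 if the sleep ran to completion even
 *  though the handler had already set the flag, otherwise 0.
 */
int lost_wakeup_demo(unsigned int secs)
{
    unsigned int unslept = 0;
    int lost;

    assert(secs > 0);
    work_done = 0;
    signal(SIGUSR1, on_sigusr1);

    if (!work_done) {               /* check: work still outstanding */
        raise(SIGUSR1);             /* signal arrives here; handler runs now */
        unslept = sleep(secs);      /* nothing is left to interrupt this */
    }

    lost = (work_done && unslept == 0);
    signal(SIGUSR1, SIG_DFL);
    return lost;
}
```

With secs set to A_LONG_TIME the function would simply sit there for an hour, just as the real program does until something like SIGUSR1 interrupts its sleep().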

Old-school signal processing in C: but, again, we can work around that

Regarding race conditions, Wikipedia says:

Critical race conditions often happen when the processes or threads depend on some shared state. Operations upon shared states are done in critical sections that must be mutually exclusive. Failure to obey this rule can corrupt the shared state.

We could implement mutual exclusion using a semaphore or other atomic locking mechanism:

  • sigchld_handler() takes the semaphore or waits to do so if it is not immediately available, then it updates children[] and then it releases the semaphore
  • main() either: (a) takes the semaphore or waits to do so if it is not immediately available, then it updates children[] and then it releases the semaphore or (b) if the semaphore is not immediately available it does other tasks instead but then does not go back to sleep, looping round until the semaphore is immediately available

But this starts to get ugly: global variables are required so that the signal handler knows what semaphore to take.

Old-school signal processing in C: code that doesn’t like being interrupted

Besides, there are also other problems with doing things the old way.

Copy and paste this source code into tcp-server1.c:

#include <arpa/inet.h>          /* for inet_ntoa() */
#include <errno.h>              /* for errno */
#include <netinet/in.h>
#include <signal.h>             /* for signal(), SIG* */
#include <stdio.h>              /* for sprintf(), stderr, ... */
#include <stdlib.h>             /* for exit() */
#include <string.h>             /* for strerror() */
#include <sys/types.h>          /* for wait() */
#include <sys/wait.h>           /* for wait() */
#include <time.h>               /* for time() */
#include <unistd.h>             /* for sleep(), fork() and execlp() */
#include <sys/socket.h>
#include "funcs.h"              /* for infomsg(), errormsg() */

/*
 *  Macros
 */

#define MAX_CHILDREN            1000
#define MYPORT                  2345
#define MAXPENDING              10
#define CHILD_RUN_TIME          20

/*
 *  Type and struct definitions
 */

struct child {
    pid_t  pid;
    time_t start;
};

/*
 *  Global variables
 */

struct child children[MAX_CHILDREN];

/*
 *  Forward declarations
 */

pid_t start_child_server(int);
void sigchld_handler(int);

/*
 *  Functions
 */

int main(
int argc,
char *argv[])
{
    int i;
    int listening_sfd, toclient_sfd;
    struct sockaddr_in listening_spars;
    struct sockaddr_in toclient_spars;
    socklen_t actual_toclient_spars_size;
    int sockoptval;

    /*
     *  Initialise.
     */

    infomsg("parent setting up listening socket ...");
    /* create addressless IPv4 TCP socket */
    if ((listening_sfd = socket(AF_INET, SOCK_STREAM, 0)) == -1)
        errormsg("socket() failed");
    /* ensure bind() doesn't find old stale socket */
    sockoptval = 1;
    setsockopt(listening_sfd, SOL_SOCKET, SO_REUSEADDR,
               &sockoptval, sizeof(sockoptval));
    /* define other socket parameters */
    memset(&listening_spars, 0, sizeof(struct sockaddr_in));
    listening_spars.sin_family = AF_INET;         /* IPv4 */
    listening_spars.sin_port = htons(MYPORT);     /* port */
    listening_spars.sin_addr.s_addr = INADDR_ANY; /* listen on all NICs */
    /* apply socket parameters */
    if (bind(listening_sfd, (struct sockaddr *) &listening_spars,
             sizeof(struct sockaddr)) == -1)
        errormsg("bind() failed");
    /* mark socket for listening */
    if (listen(listening_sfd, MAXPENDING) == -1)
        errormsg("listen() failed");

    infomsg("parent initialising children status table ...");
    for (i=0; i<MAX_CHILDREN; i++)
        children[i].pid = 0;

    infomsg("parent setting up signal handlers ...");
    signal(SIGCHLD, sigchld_handler);

    /*
     *  Main monitoring loop
     */

    infomsg("parent entering monitoring loop ...");
    while (TRUE) {
        /* wait for incoming connection, move it to a new socket */
        infomsg("parent awaiting incoming connection ...");
        actual_toclient_spars_size = sizeof(struct sockaddr_in);
        if ((toclient_sfd = accept(listening_sfd, 
                                   (struct sockaddr *) &toclient_spars,
                                   &actual_toclient_spars_size)) == -1) {
            infomsg("accept() failed");
            continue;
        }

        /*
         *  Start child server to talk to client.
         */

        infomsg("parent starting one child ...");
        start_child_server(toclient_sfd);
        /* child server uses toclient_sfd, not parent server */
        close(toclient_sfd);
    }

    /*
     *  Clean up and exit.
     */

    infomsg("parent cleaning up and exiting ...");
    signal(SIGCHLD, SIG_DFL);
    return(0);
}

pid_t start_child_server(
int toclient_sfd)
{
    pid_t pid;
    int i;
    const char *msg = "this is a message from the server to the client\n";

    /*
     *  Find an empty slot to store info about the process we're
     *  about to launch.
     */

    for (i=0; i<MAX_CHILDREN; i++)
        if (children[i].pid == 0)
            break;
    if (i == MAX_CHILDREN)
        errormsg("start_child_server: unable to find a free slot");

    /*
     *  Launch a child server and note its pid and start time
     *  in the empty slot.
     */

    if ((pid=fork()) < 0)
        errormsg("fork() failed: %s", strerror(errno));
    else if (pid > 0) {
        children[i].pid = pid;
        children[i].start = time(NULL);
        return(pid);
    }

    /*
     *  Only the child gets here
     */

    infomsg("child is pid %d", (int) getpid());
    infomsg("child sending message to client ...");
    if (send(toclient_sfd, msg, strlen(msg), 0) == -1)
        errormsg("send");
    infomsg("child sleeping a bit ...");
    sleep(CHILD_RUN_TIME);
    infomsg("child exiting ...");
    close(toclient_sfd);
    exit(EXIT_SUCCESS);
}

void sigchld_handler(
int sig)
{
    int wstatus;
    pid_t pid;
    int i;

    while ((pid=waitpid(-1, &wstatus, WNOHANG)) > 0) {
        infomsg("parent received SIGCHLD; reaping and clearing child data ...");
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid == pid) {
                children[i].pid = 0;
                children[i].start = 0;
                break;
            }
    }
}

Compile and run the program as follows:

lagane$ gcc -o tcp-server1 tcp-server1.c funcs.o
lagane$ ./tcp-server1 &
[1] 31254
lagane$
0.000000: main: parent setting up listening socket ...
0.000410: main: parent initialising children status table ...
0.000591: main: parent setting up signal handlers ...
0.000753: main: parent entering monitoring loop ...
0.000909: main: parent awaiting incoming connection ...

lagane$ 
lagane$ nc localhost 2345 < /dev/null
30.412644: main: parent starting one child ...
30.412805: main: parent awaiting incoming connection ...
30.413026: start_child_server: child is pid 31262
30.413167: start_child_server: child sending message to client ...
this is a message from the server to the client
30.413399: start_child_server: child sleeping a bit ...
50.413532: start_child_server: child exiting ...
50.413946: sigchld_handler: parent received SIGCHLD; reaping and clearing child data ...
lagane$ 

The points I wanted to illustrate with that example are:

  1. accept() automatically restarts if a signal arrives; we know this because, after the SIGCHLD arrived, we did not see the message “accept() failed”

Here we change from a server that forks a child process on an incoming TCP connection to a server that forks child processes every 10s. Copy and paste this source code into interval-server.c:

#include <errno.h>              /* for errno */
#include <linux/limits.h>       /* for ARG_MAX */
#include <signal.h>             /* for signal(), SIG* */
#include <stdio.h>              /* for sprintf(), stderr, ... */
#include <stdlib.h>             /* for exit() */
#include <string.h>             /* for strerror() */
#include <sys/types.h>          /* for wait() */
#include <sys/wait.h>           /* for wait() */
#include <time.h>               /* for time() */
#include <unistd.h>             /* for sleep(), fork() and execlp() */
#include "funcs.h"              /* for infomsg(), errormsg() */

/*
 *  Macros
 */

#define MAX_CHILDREN            1000
#define CHILDREN                5
#define INTER_CHILD_INTERVAL    10
#define CHILD_RUN_TIME(n)       ((((n)+getpid())*7)%19) /* stupid pseudo RNG */

/*
 *  Type and struct definitions
 */

struct child {
    pid_t  pid;
    time_t start;
};

/*
 *  Global variables
 */

struct child children[MAX_CHILDREN];

/*
 *  Forward declarations
 */

pid_t start_child_sleep(int);
void sigchld_handler(int);

/*
 *  Functions
 */

int main(
int argc,
char *argv[])
{
    int i, j;
    int running_children_count;
    time_t now;

    /*
     *  Initialise.
     */

    infomsg("parent initialising children status table ...");
    for (i=0; i<MAX_CHILDREN; i++)
        children[i].pid = 0;

    infomsg("parent setting up signal handlers ...");
    signal(SIGCHLD, sigchld_handler);

    /*
     *  Main monitoring loop
     */

    infomsg("parent entering monitoring loop ...");
    j = 0;
    while (TRUE) {
        now = time(NULL);

        /*
         *  Exit if no running child processes *and*
         *  we've already started desired number
         */

        if (j < CHILDREN)
            infomsg("parent sees some children not started yet");
        else {
            infomsg("parent checking for running children ...");
            running_children_count = 0;
            for (i=0; i<MAX_CHILDREN; i++)
                if (children[i].pid != 0)
                    running_children_count++;
            infomsg("parent sees %d children still running", running_children_count);
            if (running_children_count == 0 && j == CHILDREN)
                break;
        }

        /*
         *  Every INTER_CHILD_INTERVAL seconds start another child process.
         */

        if (j<CHILDREN) {
            infomsg("parent starting one child ...");
            start_child_sleep(CHILD_RUN_TIME(j));
            j++;
            infomsg("parent sleeping %ds ...", INTER_CHILD_INTERVAL);
            sleep(INTER_CHILD_INTERVAL);
        }
    }

    /*
     *  Clean up and exit.
     */

    infomsg("parent cleaning up and exiting ...");
    signal(SIGCHLD, SIG_DFL);
    return(0);
}

pid_t start_child_sleep(
int period)
{
    pid_t pid;
    int i;
    char buf[ARG_MAX];

    /*
     *  Find an empty slot to store info about the process we're
     *  about launch.
     */

    for (i=0; i<MAX_CHILDREN; i++)
        if (children[i].pid == 0)
            break;
    if (i == MAX_CHILDREN)
        errormsg("start_child_sleep: unable to find a free slot");

    /*
     *  Launch a child process and note its pid and start time
     *  in the empty slot.
     */

    if ((pid=fork()) < 0)
        errormsg("fork() failed: %s", strerror(errno));
    else if (pid > 0) {
        children[i].pid = pid;
        children[i].start = time(NULL);
        return(pid);
    }

    /*
     *  Only the child gets here
     */

    sprintf(buf, "echo \"child $$ started ...\";"
                 "sleep %d;"
                 "echo \"child $$ exiting ...\"",
                 period);
    infomsg("child running for %ds ...", period);
    execlp("/bin/sh", "sh", "-c", buf, (char *) NULL);
    errormsg("exec() failed: %s", strerror(errno));
}

void sigchld_handler(
int sig)
{
    int wstatus;
    pid_t pid;
    int i;

    while ((pid=waitpid(-1, &wstatus, WNOHANG)) > 0) {
        infomsg("parent received SIGCHLD; reaping and clearing child data ...");
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid == pid) {
                children[i].pid = 0;
                children[i].start = 0;
                break;
            }
    }
}

Compile and run the program as follows:

lagane$ gcc -o interval-server interval-server.c funcs.o
lagane$ ./interval-server 
0.000000: main: parent initialising children status table ...
0.000353: main: parent setting up signal handlers ...
0.000516: main: parent entering monitoring loop ...
0.000675: main: parent sees some children not started yet
0.000871: main: parent starting one child ...
0.001149: main: parent sleeping 10s ...
0.001432: start_child_sleep: child running for 15s ...
child 1477 started ...
10.001590: main: parent sees some children not started yet
10.001641: main: parent starting one child ...
10.001800: main: parent sleeping 10s ...
10.002173: start_child_sleep: child running for 3s ...
child 1480 started ...
child 1480 exiting ...
13.005047: sigchld_handler: parent received SIGCHLD; reaping and clearing child data ...
13.005072: main: parent sees some children not started yet
13.005078: main: parent starting one child ...
13.005189: main: parent sleeping 10s ...
13.005585: start_child_sleep: child running for 10s ...
child 1482 started ...
child 1477 exiting ...
15.006176: sigchld_handler: parent received SIGCHLD; reaping and clearing child data ...
15.006203: main: parent sees some children not started yet
15.006208: main: parent starting one child ...
15.006318: main: parent sleeping 10s ...
15.006743: start_child_sleep: child running for 17s ...
child 1484 started ...
^C

The points I wanted to illustrate with that example are:

  1. this is a contrived and unrealistic example, but I wanted to make interval-server.c simple and as close as possible to tcp-server1.c
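  2. it does not take long before the exiting of some child processes disrupts the regular forking of other child processes: the parent is meant to sleep 10s between forks but, at 13s and again at 15s, it starts another child as soon as a SIGCHLD arrives
  3. this is because sleep() does not automatically restart if a signal arrives; it returns early instead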

Old-school signal processing in C: but, yet again, we can work around that

We could work around that problem with something like:

start_sleep_time = time(NULL);
while (TRUE) {
    /* how long is left to sleep? */
    desired_sleep_time = start_sleep_time + INTER_CHILD_INTERVAL - time(NULL);
    /* if the full interval has already passed then no need to sleep more */
    if (desired_sleep_time <= 0)
        break;
    /* sleep() returns 0 if it slept the full amount; otherwise a signal
       handler interrupted it and it returns the number of seconds left */
    if (sleep(desired_sleep_time) == 0)
        break;
}

Obviously, forking a child process in response to an event more complicated than the completion of a time interval, for example select() reporting activity on one of several file handles, would be a less contrived and more realistic example. It would also make the code much more complicated, and the interruption might then become harder to work around.
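To give a feel for what that work-around might look like with select(), here is a minimal sketch for the simplest possible case: a single file descriptor and a timeout in whole seconds. The function name wait_readable() is made up for illustration and it is not used by any of the programs in this article. The idea is the same as for sleep(): remember the deadline and, if the call fails with EINTR, work out how much time is left and call it again.

```c
#include <errno.h>              /* for errno, EINTR */
#include <sys/select.h>         /* for select(), fd_set, FD_* */
#include <time.h>               /* for time() */

/*
 *  Hypothetical helper: wait until 'fd' is readable or 'timeout_secs'
 *  seconds have passed, retrying if a signal handler interrupts select().
 *  Returns 1 if readable, 0 on timeout and -1 on any other error.
 */
int wait_readable(int fd, int timeout_secs)
{
    time_t deadline = time(NULL) + timeout_secs;
    struct timeval tv;
    fd_set rfds;
    long left;
    int rc;

    while (1) {
        /* how long is left to wait? */
        left = (long) (deadline - time(NULL));
        if (left < 0)
            left = 0;
        /* select() may modify both of these, so set them up every time */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = left;
        tv.tv_usec = 0;

        rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc > 0)
            return 1;
        if (rc == 0)
            return 0;
        if (errno != EINTR)
            return -1;
        /* interrupted by a signal handler: go round again */
    }
}
```

With several descriptors, per-descriptor state and a handler that modifies data the loop depends on, this gets noticeably more fiddly, which is the point made above.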

Reliable signal processing in C: system-with-timeout-and-multichild-support2.c ported to the sigset ecosystem

As Wikipedia says:

The sigaction() function provides an interface for reliable signals in replacement of the unreliable and deprecated signal() function.

Here is system-with-timeout-and-multichild-support2.c updated to use sigaction() instead of signal() and sigsuspend() instead of sleep(A_LONG_TIME). Copy and paste this source code into system-with-timeout-and-multichild-support3.c:

#include <errno.h>              /* for errno */
#include <linux/limits.h>       /* for ARG_MAX */
#include <signal.h>             /* for sigaction(), sigprocmask(), SIG* */
#include <stdio.h>              /* for sprintf(), stderr, ... */
#include <stdlib.h>             /* for exit() */
#include <string.h>             /* for strerror() */
#include <sys/types.h>          /* for wait() */
#include <sys/wait.h>           /* for wait() */
#include <time.h>               /* for time() */
#include <unistd.h>             /* for sleep(), fork() and execlp() */
#include "funcs.h"              /* for infomsg(), errormsg() */

/*
 *  Macros
 */

#define MAX_CHILDREN            1000
#define CHILDREN                500
#define TIMEOUT                 300
#define A_LONG_TIME             3600
#define CHILD_RUN_TIME(n)       1
#define HANDLE_MULTIPLE_PENDING_SIGCHLDS_IN_ONE_GO TRUE

/*
 *  Type and struct definitions
 */

struct child {
    pid_t  pid;
    time_t start;
};

/*
 *  Global variables
 */

struct child children[MAX_CHILDREN];

/*
 *  Forward declarations
 */

pid_t start_child_sleep(int);
void sigchld_handler(int);
void sigalrm_handler(int);
void sigusr1_handler(int);

/*
 *  Functions
 */

int main(
int argc,
char *argv[])
{
    int i;
    int running_children_count, next_timeout, killed_something;
    time_t now;
    sigset_t sigset, old_sigset, suspend_sigset;
    struct sigaction act, old_chld_act, old_alrm_act, old_usr1_act;

    /*
     *  Initialise.
     */

    infomsg("parent initialising children status table ...");
    for (i=0; i<MAX_CHILDREN; i++)
        children[i].pid = 0;

    infomsg("parent setting up signal handlers ...");
    /* cause delivery of SIGCHLD, SIGALRM and SIGUSR1 to be delayed until we call sigsuspend() */
    sigemptyset(&sigset);
    sigaddset(&sigset, SIGCHLD);
    sigaddset(&sigset, SIGALRM);
    sigaddset(&sigset, SIGUSR1);
    sigprocmask(SIG_BLOCK, &sigset, &old_sigset);
    /* later we need list of signals blocked before sigprocmask() call,
       but *excluding* SIGCHLD, etc. */
    memcpy(&suspend_sigset, &old_sigset, sizeof(sigset_t));
    sigdelset(&suspend_sigset, SIGCHLD);
    sigdelset(&suspend_sigset, SIGALRM);
    sigdelset(&suspend_sigset, SIGUSR1);
    /* on SIGCHLD call sigchld_handler() and delay delivery of other signals while in it */
    sigemptyset(&sigset);
    sigaddset(&sigset, SIGALRM);
    sigaddset(&sigset, SIGUSR1);
    memset(&act, 0, sizeof(struct sigaction));
    act.sa_mask = sigset; 
    act.sa_flags = 0;  
    act.sa_handler = sigchld_handler;
    sigaction(SIGCHLD, &act, &old_chld_act);
    /* on SIGALRM call sigalrm_handler() and delay delivery of other signals while in it */
    sigemptyset(&sigset);
    sigaddset(&sigset, SIGCHLD);
    sigaddset(&sigset, SIGUSR1);
    memset(&act, 0, sizeof(struct sigaction));
    act.sa_mask = sigset;
    act.sa_flags = 0; 
    act.sa_handler = sigalrm_handler;
    sigaction(SIGALRM, &act, &old_alrm_act);
    /* on SIGUSR1 call sigusr1_handler() and delay delivery of other signals while in it */
    sigemptyset(&sigset);
    sigaddset(&sigset, SIGCHLD);
    sigaddset(&sigset, SIGALRM);
    memset(&act, 0, sizeof(struct sigaction));
    act.sa_mask = sigset;
    act.sa_flags = SA_RESTART;  
    act.sa_handler = sigusr1_handler;
    sigaction(SIGUSR1, &act, &old_usr1_act);

    /*
     *  Start children.
     */

    infomsg("parent starting %d children ...", CHILDREN);
    for (i=0; i<CHILDREN; i++)
        start_child_sleep(CHILD_RUN_TIME(i));

    /*
     *  Main monitoring loop
     */

    infomsg("parent entering monitoring loop ...");
    while (TRUE) {
        now = time(NULL);

        /*
         *  Exit if no running children.
         */

        infomsg("parent checking for running children ...");
        running_children_count = 0;
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid != 0)
                running_children_count++;
        infomsg("parent sees %d children still running", running_children_count);
        if (running_children_count == 0)
            break;

        /*
         *  Kill any children that have reached their timeout time and not
         *  been killed already.
         */

        killed_something = FALSE;
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid != 0 && children[i].start != 0 &&
                    now >= children[i].start+TIMEOUT) {
                kill(children[i].pid, SIGTERM);
                children[i].start = 0;
                killed_something = TRUE;
            }

        /*
         *  Slight optimisation: if something did reach its timeout and got
         *  killed then skip to reassessing if this program can exit.
         */

        if (killed_something)
            continue;

        /*
         *  Schedule timeout alarm of next-to-timeout child.
         */

        infomsg("parent scheduling timeout alarm ...");
        next_timeout = 0;
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid != 0 && children[i].start != 0) {
                if (next_timeout == 0)
                    next_timeout = children[i].start+TIMEOUT - now;
                else if (children[i].start+TIMEOUT - now < next_timeout)
                    next_timeout = children[i].start+TIMEOUT - now;
            }
        if (next_timeout >= 1) {
            infomsg("parent scheduling alarm for %ds ...", next_timeout);
            alarm(next_timeout);
        }

        /*
         *  If there are dispatched-but-not-yet-delivered signals then
         *  handle them. If there are none then wait for one.
         */

        infomsg("parent handling any pending signals ...");
        if (next_timeout >= 1)
            sigsuspend(&suspend_sigset);

        /*
         *  If SIGCHLD arrived before SIGALRM then the alarm is still
         *  pending. Cancel it.
         */

        infomsg("parent cancelling alarm ...");
        alarm(0);
    }

    /*
     *  Clean up and exit.
     */

    infomsg("parent cleaning up and exiting ...");
    sigaction(SIGUSR1, &old_usr1_act, NULL);
    sigaction(SIGALRM, &old_alrm_act, NULL);
    sigaction(SIGCHLD, &old_chld_act, NULL);
    sigprocmask(SIG_SETMASK, &old_sigset, NULL);
    return(0);
}

pid_t start_child_sleep(
int period)
{
    pid_t pid;
    char buf[ARG_MAX];
    int i;

    /*
     *  Find an empty slot to store info about the process we're
     *  about to launch.
     */

    for (i=0; i<MAX_CHILDREN; i++)
        if (children[i].pid == 0)
            break;
    if (i == MAX_CHILDREN)
        errormsg("start_child_sleep: unable to find a free slot");

    /*
     *  Launch a child process and note its pid and start time
     *  in the empty slot.
     */

    sprintf(buf, "echo \"child $$ started ...\";"
                 "sleep %d;"
                 "echo \"child $$ exiting ...\"",
                 period);
    if ((pid=fork()) < 0)
        errormsg("fork() failed: %s", strerror(errno));
    else if (pid > 0) {
        children[i].pid = pid;
        children[i].start = time(NULL);
        return(pid);
    }

    /*
     *  Only the child gets here
     */

    execlp("/bin/sh", "sh", "-c", buf, (char *) NULL);
    errormsg("exec() failed: %s", strerror(errno));
}

void sigchld_handler(
int sig)
{
    int wstatus;
    pid_t pid;
    int i;

    while ((pid=waitpid(-1, &wstatus, WNOHANG)) > 0) {
        infomsg("parent received SIGCHLD; reaping and clearing child data ...");
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid == pid) {
                children[i].pid = 0;
                children[i].start = 0;
                break;
            }
#if ! HANDLE_MULTIPLE_PENDING_SIGCHLDS_IN_ONE_GO
        break;
#endif
    }
}

void sigalrm_handler(
int sig)
{
   infomsg("parent received SIGALRM");
}

void sigusr1_handler(
int sig)
{
   int i;

   finfomsg(stderr, "parent received SIGUSR1");
   for (i=0; i<MAX_CHILDREN; i++)
       if (children[i].pid != 0)
           finfomsg(stderr, "slot:%04d; pid:%05d, start=%ld",
                    i, (int) children[i].pid, (long) children[i].start);
}

Compile and run the program as follows:

lagane$ gcc -o system-with-timeout-and-multichild-support3 \
        system-with-timeout-and-multichild-support3.c funcs.o
lagane$
lagane$ hanger system-with-timeout-and-multichild-support3
<after-an-hour-still-no-output>
^C
lagane$

The points I wanted to illustrate with that example are:

  1. system-with-timeout-and-multichild-support3 does not lose track of its child processes; it does not hang (after an hour I got bored and pressed ^C)
  2. the signal handler still needs to loop to handle multiple pending signals (try setting #define HANDLE_MULTIPLE_PENDING_SIGCHLDS_IN_ONE_GO FALSE to see this)
  3. in several places in the source code, we need to specify multiple signals in a single call in order to record:
    • which signals to block (see the two sigprocmask() calls)
    • which signals were blocked prior to the sigprocmask() call (see the first sigprocmask() call)
    • which signals to temporarily unblock (see the sigsuspend() call)
    • which signals to block during the execution of the signal handler (see sa_mask in the act variable, which is passed to sigaction())

    signal sets allow us to do this (see the manual page for sigsetops())

  4. documentation frequently refers to calling sigprocmask() to block signals during critical code but we can flip this around:
    • near the start of a program call sigprocmask() to block signals
    • near the end of a program call sigprocmask() to unblock signals
    • in between and at a time that we decide we’re ready to handle incoming signals call sigsuspend() to handle recently-dispatched-but-currently-blocked signals
  5. symmetry: tear down only what you put up; don’t assume that a particular signal was not already blocked prior to the first call to sigprocmask() by unblocking it in the second call to sigprocmask()
  6. using sigaction() and sigprocmask() is complicated; even O’Reilly got it wrong in the first edition of their Perl Cookbook (that call to sigprocmask(SIG_UNBLOCK, $old_sigset) should have been either sigprocmask(SIG_UNBLOCK, $sigset) or better still sigprocmask(SIG_SETMASK, $old_sigset))

The system-with-timeout-and-multichild-support3.c program can be improved a bit:

  • if the signal handler checks the signal number that is passed to it then we need to define only one signal handler function and, although that really just means moving code around, it does mean that establishing the handler will be something we do once rather than once per signal type
  • if we comment out all calls to printf() and echo in the monitoring loop then it will run faster and we will stand an even better chance of getting it to hang

However, before we do that …

Reliable signal processing in C: some comments about SA_RESTART and sigprocmask()

The tcp-server1.c program above used signal() to establish the signal handler. The signal() man page states:

the signal() wrapper function …  calls sigaction(2) using flags that supply BSD semantics

The BSD semantics are equivalent to calling sigaction(2) with the following flags:

sa.sa_flags = SA_RESTART;

and the sigaction() man page explains:

SA_RESTART [provides] behavior compatible with BSD signal semantics by making certain system calls restartable across signals

The accept() man page states:

EINTR [indicates] the system call was interrupted by a signal that was caught before a valid connection arrived

all of which suggests that accept() might be a system call that is affected by SA_RESTART (either when explicitly specified in a call to sigaction() or implicitly specified in a call to signal()).

In tcp-server1, and as noted above, the exiting of the child server process and the consequent dispatch of a SIGCHLD signal did not cause the accept() call to return prematurely, so either accept() restarted automatically or accept() ignores or blocks the signal.

In order to determine which of these happened, we can clone tcp-server1.c to tcp-server2.c and replace calls to signal() with calls to sigaction() but leave sa_flags set to 0.

Copy and paste this source code into tcp-server2.c:

#include <arpa/inet.h>          /* for inet_ntoa() */
#include <errno.h>              /* for errno */
#include <netinet/in.h>
#include <signal.h>             /* for sigaction(), sigprocmask(), SIG* */
#include <stdio.h>              /* for sprintf(), stderr, ... */
#include <stdlib.h>             /* for exit() */
#include <string.h>             /* for strerror() */
#include <sys/types.h>          /* for wait() */
#include <sys/wait.h>           /* for wait() */
#include <time.h>               /* for time() */
#include <unistd.h>             /* for sleep(), fork() and execlp() */
#include <sys/socket.h>
#include "funcs.h"              /* for infomsg(), errormsg() */

/*
 *  Macros
 */

#define MAX_CHILDREN            1000
#define MYPORT                  2345
#define MAXPENDING              10
#define CHILD_RUN_TIME          20
#define USE_SA_RESTART          FALSE
#define BLOCK_SIGCHLD           FALSE

/*
 *  Type and struct definitions
 */

struct child {
    pid_t  pid;
    time_t start;
};

/*
 *  Global variables
 */

struct child children[MAX_CHILDREN];

/*
 *  Forward declarations
 */

pid_t start_child_server(int);
void handler(int);

/*
 *  Functions
 */

int main(
int argc,
char *argv[])
{
    int i;
    int listening_sfd, toclient_sfd;
    struct sockaddr_in listening_spars;
    struct sockaddr_in toclient_spars;
    socklen_t actual_toclient_spars_size;
    int sockoptval;
    sigset_t sigset, old_sigset, suspend_sigset;
    struct sigaction act, old_chld_act, old_alrm_act, old_usr1_act;

    /*
     *  Initialise.
     */

    infomsg("parent setting up listening socket ...");
    /* create addressless IPv4 TCP socket */
    if ((listening_sfd = socket(AF_INET, SOCK_STREAM, 0)) == -1)
        errormsg("socket() failed");
    /* ensure bind() doesn't find old stale socket */
    sockoptval = 1;
    setsockopt(listening_sfd, SOL_SOCKET, SO_REUSEADDR,
               &sockoptval, sizeof(sockoptval));
    /* define other socket parameters */
    memset(&listening_spars, 0, sizeof(struct sockaddr_in));
    listening_spars.sin_family = AF_INET;         /* IPv4 */
    listening_spars.sin_port = htons(MYPORT);     /* port */
    listening_spars.sin_addr.s_addr = INADDR_ANY; /* listen on all NICs */
    /* apply socket parameters */
    if (bind(listening_sfd, (struct sockaddr *) &listening_spars,
             sizeof(struct sockaddr)) == -1)
        errormsg("bind() failed");
    /* mark socket for listening */
    if (listen(listening_sfd, MAXPENDING) == -1)
        errormsg("listen() failed");

    infomsg("parent initialising children status table ...");
    for (i=0; i<MAX_CHILDREN; i++)
        children[i].pid = 0;

    infomsg("parent setting up signal handlers ...");
#if BLOCK_SIGCHLD
    sigemptyset(&sigset);
    sigaddset(&sigset, SIGCHLD);
    /* cause delivery of SIGCHLD, etc to be delayed until we call sigsuspend() */
    sigprocmask(SIG_BLOCK, &sigset, &old_sigset);
    /* later we need list of signals blocked before that sigprocmask() call,
       but *excluding* SIGCHLD, etc. */
    memcpy(&suspend_sigset, &old_sigset, sizeof(sigset_t));
    sigdelset(&suspend_sigset, SIGCHLD);
#endif
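    /* establish signal handler; sigset is not initialised unless BLOCK_SIGCHLD
       is set, so start from an empty mask (SIGCHLD is blocked while in its
       own handler anyway) */
    memset(&act, 0, sizeof(struct sigaction));
    sigemptyset(&act.sa_mask);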
#if USE_SA_RESTART
    act.sa_flags = SA_RESTART;
#else
    act.sa_flags = 0;
#endif
    act.sa_handler = handler;
    sigaction(SIGCHLD, &act, &old_chld_act);

    /*
     *  Main monitoring loop
     */

    infomsg("parent entering monitoring loop ...");
    while (TRUE) {
        /* wait for incoming connection, move it to a new socket */
        infomsg("parent awaiting incoming connection ...");
        actual_toclient_spars_size = sizeof(struct sockaddr_in);
        if ((toclient_sfd = accept(listening_sfd, 
                                   (struct sockaddr *) &toclient_spars,
                                   &actual_toclient_spars_size)) == -1) {
            infomsg("accept() failed");
            continue;
        }

        /*
         *  Start child server to talk to client.
         */

        infomsg("parent starting one child ...");
        start_child_server(toclient_sfd);
        /* child server uses toclient_sfd, not parent server */
        close(toclient_sfd);
    }

    /*
     *  Clean up and exit.
     */

    infomsg("parent cleaning up and exiting ...");
    sigaction(SIGCHLD, &old_chld_act, NULL);
#if BLOCK_SIGCHLD
    sigprocmask(SIG_SETMASK, &old_sigset, NULL);
#endif
    return(0);
}

pid_t start_child_server(
int toclient_sfd)
{
    pid_t pid;
    int i;
    const char *msg = "this is a message from the server to the client\n";

    /*
     *  Find an empty slot to store info about the process we're
     *  about to launch.
     */

    for (i=0; i<MAX_CHILDREN; i++)
        if (children[i].pid == 0)
            break;
    if (i == MAX_CHILDREN)
        errormsg("start_child_server: unable to find a free slot");

    /*
     *  Launch a child server and note its pid and start time
     *  in the empty slot.
     */

    if ((pid=fork()) < 0)
        errormsg("fork() failed: %s", strerror(errno));
    else if (pid > 0) {
        children[i].pid = pid;
        children[i].start = time(NULL);
        return(pid);
    }

    /*
     *  Only the child gets here
     */

    infomsg("child is pid %d", (int) getpid());
    infomsg("child sending message to client ...");
    if (send(toclient_sfd, msg, strlen(msg), 0) == -1)
        errormsg("send");
    infomsg("child sleeping a bit ...");
    sleep(CHILD_RUN_TIME);
    infomsg("child exiting ...");
    close(toclient_sfd);
    exit(EXIT_SUCCESS);
}

void handler(
int sig)
{
    int wstatus;
    pid_t pid;
    int i;

    switch (sig) {
        case SIGCHLD: while ((pid=waitpid(-1, &wstatus, WNOHANG)) > 0) {
                          infomsg("parent received SIGCHLD; reaping and clearing child data ...");
                          for (i=0; i<MAX_CHILDREN; i++)
                              if (children[i].pid == pid) {
                                  children[i].pid = 0;
                                  children[i].start = 0;
                                  break;
                              }
                      }
                      break;
        default:      errormsg("parent received unexpected signal %d", sig);
                      break;
    }
}

Note the #define macros at the top of the source:

#define USE_SA_RESTART FALSE
#define BLOCK_SIGCHLD FALSE

Compile and run the program as follows:

lagane$ gcc -o tcp-server2 tcp-server2.c funcs.o
lagane$ ./tcp-server2 &
[1] 18925
lagane$
0.000000: main: parent setting up listening socket ...
0.000325: main: parent initialising children status table ...
0.000489: main: parent setting up signal handlers ...
0.000600: main: parent entering monitoring loop ...
0.000704: main: parent awaiting incoming connection ...
lagane$ 
lagane$ 
lagane$ nc localhost 2345 < /dev/null
6.496349: main: parent starting one child ...
6.496527: main: parent awaiting incoming connection ...
6.496778: start_child_server: child is pid 18927
6.496988: start_child_server: child sending message to client ...
this is a message from the server to the client
6.497326: start_child_server: child sleeping a bit ...
26.497686: start_child_server: child exiting ...
26.498167: handler: parent received SIGCHLD; reaping and clearing child data ...
26.498188: main: accept() failed
26.498193: main: parent awaiting incoming connection ...
lagane$ 

The points I wanted to illustrate with that example are:

  1. we now see accept() failing
  2. recall from the history lesson that signal() is equivalent to calling sigaction() with SA_RESTART enabled, so tcp-server1’s call to accept() had automatic restart enabled whereas tcp-server2’s call to accept() has it disabled; this confirms that accept() is a system call influenced by SA_RESTART

Now change the #define macros at the top of tcp-server2.c source file to:

#define USE_SA_RESTART TRUE
#define BLOCK_SIGCHLD FALSE

Recompile and run the program as follows:

lagane$ gcc -o tcp-server2 tcp-server2.c funcs.o
lagane$ ./tcp-server2 &
[2] 20810
lagane$
0.000000: main: parent setting up listening socket ...
0.000407: main: parent initialising children status table ...
0.000632: main: parent setting up signal handlers ...
0.000820: main: parent entering monitoring loop ...
0.000982: main: parent awaiting incoming connection ...
lagane$ 
lagane$ 
lagane$ nc localhost 2345 < /dev/null
5.470687: main: parent starting one child ...
5.470838: main: parent awaiting incoming connection ...
5.471082: start_child_server: child is pid 20812
5.471288: start_child_server: child sending message to client ...
this is a message from the server to the client
5.471636: start_child_server: child sleeping a bit ...
25.472001: start_child_server: child exiting ...
25.472451: handler: parent received SIGCHLD; reaping and clearing child data ...
lagane$ 

The points I wanted to illustrate with that example are:

  1. accept() is automatically restarted again: the “accept() failed” message has gone

Now disable SA_RESTART but enable the blocking of SIGCHLD in line with the changes that were made between system-with-timeout-and-multichild-support2.c and system-with-timeout-and-multichild-support3.c by setting this in tcp-server2.c:

#define USE_SA_RESTART FALSE
#define BLOCK_SIGCHLD TRUE

Recompile and run the program as follows:

lagane$ gcc -o tcp-server2 tcp-server2.c funcs.o
lagane$ ./tcp-server2 &
[2] 21242
lagane$
0.000000: main: parent setting up listening socket ...
0.000394: main: parent initialising children status table ...
0.000573: main: parent setting up signal handlers ...
0.000738: main: parent entering monitoring loop ...
0.000896: main: parent awaiting incoming connection ...
lagane$ 
lagane$ 
lagane$ nc localhost 2345 < /dev/null
8.333692: main: parent starting one child ...
8.333893: main: parent awaiting incoming connection ...
8.334146: start_child_server: child is pid 21249
8.334351: start_child_server: child sending message to client ...
this is a message from the server to the client
8.334701: start_child_server: child sleeping a bit ...
28.335050: start_child_server: child exiting ...
lagane$ 

The points I wanted to illustrate with that example are:

  1. accept() was not interrupted when the child server eventually exited, because SIGCHLD was blocked
  2. while SIGCHLD remains blocked the handler never runs, so exited child servers are never reaped

As a consequence of that second point, if we run the nc client command a few more times, we accumulate zombie processes:

lagane$ ps fax
...
 1881 pts/7    S      0:00      \_ ./tcp-server2
 2005 pts/7    Z      0:00      |   \_ [tcp-server2] <defunct>
 2013 pts/7    Z      0:00      |   \_ [tcp-server2] <defunct>
 2027 pts/7    Z      0:00      |   \_ [tcp-server2] <defunct>
 2041 pts/7    Z      0:00      |   \_ [tcp-server2] <defunct>
 2049 pts/7    Z      0:00      |   \_ [tcp-server2] <defunct>
 2051 pts/7    Z      0:00      |   \_ [tcp-server2] <defunct>
 2059 pts/7    Z      0:00      |   \_ [tcp-server2] <defunct>
 2061 pts/7    Z      0:00      |   \_ [tcp-server2] <defunct>
 2069 pts/7    Z      0:00      |   \_ [tcp-server2] <defunct>
 2071 pts/7    Z      0:00      |   \_ [tcp-server2] <defunct>
 2076 pts/7    Z      0:00      |   \_ [tcp-server2] <defunct>
...
lagane$
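
The behaviour seen in this last run can be reproduced in miniature. The following is a minimal sketch in Python rather than C, assuming a Unix system and Python 3.5 or later; signal.pthread_sigmask() and signal.sigpending() are Python's wrappers around the corresponding sigset calls, and the names handler, handled and blocked_sigchld_demo are made up for this illustration. It blocks SIGCHLD, lets a child exit, and reports whether the signal is pending and whether the handler ran before and after the signal was unblocked:

```python
import os
import signal
import time

handled = []                      # signal numbers seen by the handler

def handler(sig, frame):
    handled.append(sig)

def blocked_sigchld_demo():
    handled.clear()
    signal.signal(signal.SIGCHLD, handler)
    # Block SIGCHLD: from now on the kernel only holds it as "pending".
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})

    pid = os.fork()
    if pid == 0:
        os._exit(0)               # child: exit at once

    # Parent: wait (at most 5s) for the kernel to mark SIGCHLD as pending.
    deadline = time.monotonic() + 5
    while (signal.SIGCHLD not in signal.sigpending()
           and time.monotonic() < deadline):
        time.sleep(0.01)
    pending = signal.SIGCHLD in signal.sigpending()
    ran_while_blocked = len(handled) > 0   # child is still a zombie here

    # Unblock: the pending SIGCHLD is now delivered and the handler runs.
    signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    deadline = time.monotonic() + 5
    while not handled and time.monotonic() < deadline:
        time.sleep(0.01)
    ran_after_unblock = len(handled) > 0

    os.waitpid(pid, 0)            # reap the zombie
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    return pending, ran_while_blocked, ran_after_unblock

if __name__ == '__main__':
    print(blocked_sigchld_demo())
```

In tcp-server2 the signal is never unblocked, which is why the handler never gets the chance to reap anything.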

Reliable signal processing in C: system-with-timeout-and-multichild-support4

So here is system-with-timeout-and-multichild-support3.c reworked to:

  • use only one signal handler
  • explicitly enable automatic restart of system calls that support it
  • make the monitoring loop as fast as possible by removing the displaying of messages

Copy and paste this source code into system-with-timeout-and-multichild-support4.c:

#include <errno.h>              /* for errno */
#include <linux/limits.h>       /* for ARG_MAX */
#include <signal.h>             /* for sigaction(), sigprocmask(), SIG* */
#include <stdio.h>              /* for sprintf(), stderr, ... */
#include <stdlib.h>             /* for exit() */
#include <string.h>             /* for strerror(), memcpy(), memset() */
#include <sys/types.h>          /* for pid_t */
#include <sys/wait.h>           /* for waitpid() */
#include <time.h>               /* for time() */
#include <unistd.h>             /* for alarm(), fork() and execlp() */
#include "funcs.h"              /* for infomsg(), errormsg() */

/*
 *  Macros
 */

#define MAX_CHILDREN            10000
#define CHILDREN                5000
#define TIMEOUT                 300
#define A_LONG_TIME             3600
#define CHILD_RUN_TIME(n)       1

/*
 *  Type and struct definitions
 */

struct child {
    pid_t  pid;
    time_t start;
};

/*
 *  Global variables
 */

struct child children[MAX_CHILDREN];

/*
 *  Forward declarations
 */

pid_t start_child_sleep(int);
void handler(int);

/*
 *  Functions
 */

int main(
int argc,
char *argv[])
{
    int i;
    int running_children_count, next_timeout, killed_something;
    time_t now;
    sigset_t sigset, old_sigset, suspend_sigset;
    struct sigaction act, old_chld_act, old_alrm_act, old_usr1_act;

    /*
     *  Initialise.
     */

    infomsg("parent initialising children status table ...");
    for (i=0; i<MAX_CHILDREN; i++)
        children[i].pid = 0;

    infomsg("parent setting up signal handlers ...");
    /*
     *  Define signal set for two purposes: 
     *      (1) for sigprocmask() call, 
     *      (2) for sigaction() we need list of *additional* signals to block
     *          while executing the handler (we specify all three signals, which
     *          is slightly more than *just* the additional signals, but it does
     *          no harm)
     */
    sigemptyset(&sigset);
    sigaddset(&sigset, SIGCHLD);
    sigaddset(&sigset, SIGALRM);
    sigaddset(&sigset, SIGUSR1);
    /* cause delivery of SIGCHLD, etc to be delayed until we call sigsuspend() */
    sigprocmask(SIG_BLOCK, &sigset, &old_sigset);
    /* later we need list of signals blocked before that sigprocmask() call,
       but *excluding* SIGCHLD, etc. */
    memcpy(&suspend_sigset, &old_sigset, sizeof(sigset_t));
    sigdelset(&suspend_sigset, SIGCHLD);
    sigdelset(&suspend_sigset, SIGALRM);
    sigdelset(&suspend_sigset, SIGUSR1);
    /* establish signals handlers */
    memset(&act, 0, sizeof(struct sigaction));
    act.sa_mask = sigset;
    act.sa_flags = SA_RESTART;  
    act.sa_handler = handler;
    sigaction(SIGCHLD, &act, &old_chld_act);
    sigaction(SIGALRM, &act, &old_alrm_act);
    sigaction(SIGUSR1, &act, &old_usr1_act);

    /*
     *  Start children.
     */

    infomsg("parent starting %d children ...", CHILDREN);
    for (i=0; i<CHILDREN; i++)
        start_child_sleep(CHILD_RUN_TIME(i));

    /*
     *  Main monitoring loop
     */

    infomsg("parent entering monitoring loop ...");
    while (TRUE) {
        now = time(NULL);

        /*
         *  Exit if no running children.
         */

        /* infomsg("parent checking for running children ..."); */
        running_children_count = 0;
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid != 0)
                running_children_count++;
        /* infomsg("parent sees %d children still running", running_children_count); */
        if (running_children_count == 0)
            break;

        /*
         *  Kill any children that have reached their timeout time and not
         *  been killed already.
         */

        killed_something = FALSE;
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid != 0 && children[i].start != 0 &&
                    now >= children[i].start+TIMEOUT) {
                kill(children[i].pid, SIGTERM);
                children[i].start = 0;
                killed_something = TRUE;
            }

        /*
         *  Slight optimisation: if something did reach its timeout and got
         *  killed then skip to reassessing if this program can exit.
         */

        if (killed_something)
            continue;

        /*
         *  Schedule timeout alarm of next-to-timeout child.
         */

        /* infomsg("parent scheduling timeout alarm ..."); */
        next_timeout = 0;
        for (i=0; i<MAX_CHILDREN; i++)
            if (children[i].pid != 0 && children[i].start != 0) {
                if (next_timeout == 0)
                    next_timeout = children[i].start+TIMEOUT - now;
                else if (children[i].start+TIMEOUT - now < next_timeout)
                    next_timeout = children[i].start+TIMEOUT - now;
            }
        if (next_timeout >= 1) {
            /* infomsg("parent scheduling alarm for %ds ...", next_timeout); */
            alarm(next_timeout);
        }

        /*
         *  If there are dispatched-but-not-yet-delivered signals then
         *  handle them. If there are none then wait for one.
         */

        /* infomsg("parent handling any pending signals ..."); */
        if (next_timeout >= 1)
            sigsuspend(&suspend_sigset);

        /*
         *  If SIGCHLD arrived before SIGALRM then the alarm is still
         *  pending. Cancel it.
         */

        /* infomsg("parent cancelling alarm ..."); */
        alarm(0);
    }

    /*
     *  Clean up and exit.
     */

    infomsg("parent cleaning up and exiting ...");
    sigaction(SIGUSR1, &old_usr1_act, NULL);
    sigaction(SIGALRM, &old_alrm_act, NULL);
    sigaction(SIGCHLD, &old_chld_act, NULL);
    sigprocmask(SIG_SETMASK, &old_sigset, NULL);
    return(0);
}

pid_t start_child_sleep(
int period)
{
    pid_t pid;
    char buf[ARG_MAX];
    int i;

    /*
     *  Find an empty slot to store info about the process we're
     *  about launch.
     */

    for (i=0; i<MAX_CHILDREN; i++)
        if (children[i].pid == 0)
            break;
    if (i == MAX_CHILDREN)
        errormsg("start_child_sleep: unable to find a free slot");

    /*
     *  Launch a child process and note its pid and start time
     *  in the empty slot.
     */

    sprintf(buf, "sleep %d", period);
    if ((pid=fork()) < 0)
        errormsg("fork() failed: %s", strerror(errno));
    else if (pid > 0) {
        children[i].pid = pid;
        children[i].start = time(NULL);
        return(pid);
    }

    /*
     *  Only the child gets here
     */

    execlp("/bin/sh", "sh", "-c", buf, (char *) NULL);
    errormsg("exec() failed: %s", strerror(errno));
}

void handler(
int sig)
{
    int wstatus;
    pid_t pid;
    int i;

    switch (sig) {
        case SIGCHLD: while ((pid=waitpid(-1, &wstatus, WNOHANG)) > 0) {
                          /* infomsg("parent received SIGCHLD; reaping and clearing child data ..."); */
                          for (i=0; i<MAX_CHILDREN; i++)
                              if (children[i].pid == pid) {
                                  children[i].pid = 0;
                                  children[i].start = 0;
                                  break;
                              }
                      }
                      break;
        case SIGALRM: /* infomsg("parent received SIGALRM"); */
                      break;
        case SIGUSR1: finfomsg(stderr, "parent received SIGUSR1");
                      for (i=0; i<MAX_CHILDREN; i++)
                          if (children[i].pid != 0)
                              finfomsg(stderr, "slot:%04d; pid:%05d, start=%ld", 
                                       i, (int) children[i].pid, (long) children[i].start);
                      break;
        default:      errormsg("parent received unexpected signal %d", sig);
                      break;
    }
}

That was the final C program in this article. Next we will look at other programming languages.
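
Before moving on, the pattern at the heart of system-with-timeout-and-multichild-support4 (keep the signals blocked and accept them at only one well-defined point in the loop) can be restated compactly. The following is a minimal sketch in Python, not a translation of the C program and not necessarily how the later Python sections approach the problem. Python has no sigsuspend(), so signal.sigtimedwait() is swapped in as a stand-in: it consumes a pending signal directly instead of running a handler. It assumes Linux and Python 3.5 or later (sigtimedwait() is not available on every Unix), and the names WATCHED, reap_exited_children and run_children are made up for the illustration:

```python
import os
import signal

WATCHED = {signal.SIGCHLD, signal.SIGALRM}

def reap_exited_children():
    """Reap every child that has already exited and return their pids."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:     # no children left at all
            break
        if pid == 0:                  # children remain but none has exited
            break
        reaped.append(pid)
    return reaped

def run_children(count, timeout=10):
    """Start count short-lived children; return the pids not reaped in time."""
    # A do-nothing handler makes sure SIGCHLD is not simply discarded.
    signal.signal(signal.SIGCHLD, lambda sig, frame: None)
    # Block the signals for the whole loop, as support4 does with sigprocmask().
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, WATCHED)
    running = set()
    try:
        for _ in range(count):
            pid = os.fork()
            if pid == 0:
                os._exit(0)           # child: exit at once
            running.add(pid)
        while running:
            # The only place where a blocked signal is accepted; a SIGCHLD
            # that became pending earlier makes this return immediately.
            info = signal.sigtimedwait(WATCHED, timeout)
            if info is None:          # timed out
                break
            if info.si_signo == signal.SIGCHLD:
                running.difference_update(reap_exited_children())
    finally:
        signal.signal(signal.SIGCHLD, signal.SIG_DFL)
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return running

if __name__ == '__main__':
    print('children still running:', run_children(20))
```

Because the signals are blocked everywhere except inside the wait call, a child that exits while the loop is busy elsewhere cannot slip past unnoticed: its SIGCHLD simply stays pending until the next wait.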

Reliable signal processing in Perl

Copy and paste this source code into system-with-timeout-and-multichild-support-perl.pl:

#!/usr/bin/perl

#
#  Modules
#

use POSIX;
use Time::HiRes;
use feature 'state';   #  let doubletime() have a static variable

#
#  Global variables
#

my($MAX_CHILDREN) = 10000;
my($CHILDREN)     = 5000;
my($TIMEOUT)      = 300;
my($A_LONG_TIME)  = 3600;
sub CHILD_RUN_TIME { my($n) = @_; return 1; }
my(@children)     = ();

#
#  Functions
#

sub main
{
    my($i);
    my($running_children_count, $next_timeout, $killed_something);
    my($now);
    my($sigset, $old_sigset, $suspend_sigset);
    my($act, $old_chld_act, $old_alrm_act, $old_usr1_act);

    # 
    #  Initialise.
    # 

    &infomsg("parent initialising children status table ...");
    for ($i=0; $i<$MAX_CHILDREN; $i++) {
        %{$children[$i]} = ();
        $children[$i]{'pid'} = 0;
    }

    &infomsg("parent setting up signal handlers ...");
    # 
    #  Define signal set for two purposes: 
    #      (1) for sigprocmask() call, 
    #      (2) for sigaction() we need list of *additional* signals to block
    #          while executing the handler (we specify all three signals, which
    #          is slightly more than *just* the additional signals, but it does
    #          no harm)
    # 
    $sigset = POSIX::SigSet->new;
    $sigset->addset(POSIX::SIGCHLD);
    $sigset->addset(POSIX::SIGALRM);
    $sigset->addset(POSIX::SIGUSR1);
    $old_sigset = POSIX::SigSet->new;
    #  cause delivery of SIGCHLD, etc to be delayed until we call sigsuspend()
    POSIX::sigprocmask(SIG_BLOCK, $sigset, $old_sigset);
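    #  later we need list of signals blocked before that sigprocmask() call, 
    #  but *excluding* SIGCHLD, etc. Assigning $old_sigset would only copy the
    #  reference, so fetch the current mask into a separate object instead
    #  (blocking the empty set changes nothing).
    $suspend_sigset = POSIX::SigSet->new;
    POSIX::sigprocmask(SIG_BLOCK, POSIX::SigSet->new, $suspend_sigset);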
    $suspend_sigset->delset(POSIX::SIGCHLD);
    $suspend_sigset->delset(POSIX::SIGALRM);
    $suspend_sigset->delset(POSIX::SIGUSR1);
    #  establish signals handlers
    $act = POSIX::SigAction->new;
    $act->mask($sigset);
    $act->flags(&POSIX::SA_RESTART);
    $act->handler(\&handler);
    $old_chld_act = POSIX::SigAction->new;
    $old_alrm_act = POSIX::SigAction->new;
    $old_usr1_act = POSIX::SigAction->new;
    POSIX::sigaction(SIGCHLD, $act, $old_chld_act);
    POSIX::sigaction(SIGALRM, $act, $old_alrm_act);
    POSIX::sigaction(SIGUSR1, $act, $old_usr1_act);

    #  
    #  Start children.
    #  

    &infomsg("parent starting %d children ...", $CHILDREN);
    for ($i=0; $i<$CHILDREN; $i++) {
        &start_child_sleep(&CHILD_RUN_TIME($i));
    }

    # 
    #  Main monitoring loop
    # 

    &infomsg("parent entering monitoring loop ...");
    while (1) {
        $now = time;
    
        # 
        #  Exit if no running children.
        # 

        &infomsg("parent checking for running children ...");
        $running_children_count = 0;
        for ($i=0; $i<$MAX_CHILDREN; $i++) {
            if ($children[$i]{'pid'} != 0) {
                $running_children_count++;
            }
        }
        &infomsg("parent sees %d children still running", $running_children_count);
        if ($running_children_count == 0) {
            last;
        }

        # 
        #  Kill any children that have reached their timeout time and not
        #  been killed already.
        # 

        $killed_something = 0;
        for ($i=0; $i<$MAX_CHILDREN; $i++) {
            if ($children[$i]{'pid'} != 0 && $children[$i]{'start'} != 0 &&
                    $now >= $children[$i]{'start'}+$TIMEOUT) {
                kill(SIGTERM, $children[$i]{'pid'});
                $children[$i]{'start'} = 0;
                $killed_something = 1;
            }
        }

        # 
        #  Slight optimisation: if something did reach its timeout and got
        #  killed then skip to reassessing if this program can exit.
        # 

        if ($killed_something) {
            next;
        }

        # 
        #  Schedule timeout alarm of next-to-timeout child.
        # 

        &infomsg("parent scheduling timeout alarm ...");
        $next_timeout = 0;
        for ($i=0; $i<$MAX_CHILDREN; $i++) {
            if ($children[$i]{'pid'} != 0 && $children[$i]{'start'} != 0) {
                if ($next_timeout == 0) {
                    $next_timeout = $children[$i]{'start'}+$TIMEOUT - $now;
                } elsif ($children[$i]{'start'}+$TIMEOUT - $now < $next_timeout) {
                    $next_timeout = $children[$i]{'start'}+$TIMEOUT - $now;
                }
            }
        }
        if ($next_timeout >= 1) {
            &infomsg("parent scheduling alarm for %ds ...", $next_timeout);
            POSIX::alarm($next_timeout);
        }

        # 
        #  If there are dispatched-but-not-yet-delivered signals then
        #  handle them. If there are none then wait for one.
        # 

        &infomsg("parent handling any pending signals ...");
        if ($next_timeout >= 1) {
            POSIX::sigsuspend($suspend_sigset);
        }

        # 
        #  If SIGCHLD arrived before SIGALRM then the alarm is still
        #  pending. Cancel it.
        # 

        &infomsg("parent cancelling alarm ...");
        POSIX::alarm(0);
    }

    #
    #  Clean up and exit.
    #

    &infomsg("parent cleaning up and exiting ...");
    #  C allows third argument to be NULL, meaning don't bother telling me
    #  what the old sig action was. But Perl insists on proper third argument.
    #  By chance we have a sig action variable that we no longer need - act -
    #  so we let sigaction() write in there. sigprocmask() on the other hand
    #  seems to accept undef as third argument.
    POSIX::sigaction(SIGUSR1, $old_usr1_act, $act);
    POSIX::sigaction(SIGALRM, $old_alrm_act, $act);
    POSIX::sigaction(SIGCHLD, $old_chld_act, $act);
    POSIX::sigprocmask(SIG_SETMASK, $old_sigset, undef);
    return 0;
}

sub start_child_sleep
{
    my($period) = @_;
    my($pid);
    my($buf);
    my($i);

    # 
    #  Find an empty slot to store info about the process we're
    #  about launch.
    # 

    for ($i=0; $i<$MAX_CHILDREN; $i++) {
        if ($children[$i]{'pid'} == 0) {
            last;
        }
    }
    if ($i == $MAX_CHILDREN) {
        &errormsg("start_child_sleep: unable to find a free slot");
    }

    # 
    #  Launch a child process and note its pid and start time
    #  in the empty slot.
    # 

    $buf = "sleep $period";
    #  fork returns undef (not a negative number) on failure
    if (!defined($pid = fork)) {
        &errormsg("fork() failed: %s", $!);
    } elsif ($pid > 0) {
        $children[$i]{'pid'} = $pid;
        $children[$i]{'start'} = time;
        return($pid);
    }

    # 
    #  Only the child gets here
    # 

    exec '/bin/sh', '-c', $buf;
    &errormsg("exec() failed: $!");
}

sub handler
{
    my($sig) = @_;
    my($pid);
    my($i);

    if ($sig eq 'CHLD') {
        while (($pid=waitpid(-1, WNOHANG)) > 0) {
            for ($i=0; $i<$MAX_CHILDREN; $i++) {
                if ($children[$i]{'pid'} == $pid) {
                    $children[$i]{'pid'} = 0;
                    $children[$i]{'start'} = 0;
                    last;
                }
            }
        }
    } elsif ($sig eq 'ALRM') {
        &infomsg("parent received SIGALRM");
    } elsif ($sig eq 'USR1') {
        &finfomsg(\*STDERR, "parent received SIGUSR1");
        for ($i=0; $i<$MAX_CHILDREN; $i++) {
            if ($children[$i]{'pid'} != 0) {
                &finfomsg(\*STDERR, "slot:%04d; pid:%05d, start=%ld", 
                          $i, $children[$i]{'pid'}, $children[$i]{'start'});
            }
        }
    } else {
        &errormsg("parent received unexpected signal %s", $sig);
    }
}

sub doubletime
{
    state $start = undef;  #  static
    my($now);

    $now = Time::HiRes::time();

    if (not defined($start)) {
        $start = $now;
    }

    return($now - $start);
}

sub infomsg
{
    my($fmt, @args) = @_;

    #  substr() strips 'main::' module prefix.
    &real_fmessage(substr((caller(1))[3],6), \*STDOUT, $fmt, @args);
}

sub errormsg
{
    my($fmt, @args) = @_;

    #  substr() strips 'main::' module prefix.
    &real_fmessage(substr((caller(1))[3],6), \*STDOUT, $fmt, @args);
    exit 1;
}

sub finfomsg
{
    my($fp, $fmt, @args) = @_;

    &real_fmessage(substr((caller(1))[3],6), $fp, $fmt, @args);
}

sub ferrormsg
{
    my($fp, $fmt, @args) = @_;

    &real_fmessage(substr((caller(1))[3],6), $fp, $fmt, @args);
    exit 1;
}

sub real_fmessage
{
    my($func, $fp, $fmt, @args) = @_;

    printf $fp "%.06lf: %s: ", &doubletime(), $func;
    printf $fp $fmt, @args;
    printf $fp "\n";
}

&main();

Make the program executable and run it as follows:

lagane$ cat system-with-timeout-and-multichild-support-perl.pl > \
        system-with-timeout-and-multichild-support-perl
lagane$ chmod +x system-with-timeout-and-multichild-support-perl
lagane$ ./system-with-timeout-and-multichild-support-perl
...

The points I wanted to illustrate with that example are:

  1. Perl exposes the sigset ecosystem to the programmer
  2. which means that the C and Perl programs are implemented identically and look very similar

Old-school signal processing in Python: things that don’t work

Python’s signal module does not expose sigaction() or sigsuspend(), so this first attempt has to make do with signal.signal() and signal.pause().

Copy and paste this source code into system-with-timeout-and-multichild-support-python.py:

#!/usr/bin/python3

#
#  Modules
#

import inspect
import signal
import time
import os
import sys

#
#  Macros
#

MAX_CHILDREN   = 10000
CHILDREN       = 1000
TIMEOUT        = 300
A_LONG_TIME    = 3600
CHILD_RUN_TIME = lambda n: 1

#
#  Global variables
#

children = []

#
#  Functions
#

def main():

    # 
    #  Initialise.
    # 

    infomsg('parent initialising children status table ...')
    for i in range(0,MAX_CHILDREN):
        children.append({'pid':0})

    infomsg('parent setting up signal handlers ...')
    signal.signal(signal.SIGCHLD, handler)
    signal.signal(signal.SIGALRM, handler)
    signal.signal(signal.SIGUSR1, handler)

    #  
    #  Start children.
    #  

    infomsg('parent starting %d children ...', CHILDREN)
    for i in range(0,CHILDREN):
        start_child_sleep(CHILD_RUN_TIME(i))

    # 
    #  Main monitoring loop
    # 

    infomsg('parent entering monitoring loop ...')
    while True:
        now = int(time.time())

        # 
        #  Exit if no running children.
        # 

        running_children_count = 0
        for i in range(0,MAX_CHILDREN):
            if  children[i]['pid'] != 0:
                running_children_count += 1
        if running_children_count == 0:
            break

        # 
        #  Kill any children that have reached their timeout time and not
        #  been killed already.
        # 

        killed_something = False
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0 and children[i]['start'] != 0 and \
                    now >= children[i]['start']+TIMEOUT:
                os.kill(children[i]['pid'], signal.SIGTERM)
                children[i]['start'] = 0
                killed_something = True

        # 
        #  Slight optimisation: if something did reach its timeout and got
        #  killed then skip to reassessing if this program can exit.
        # 

        if killed_something:
            continue

        # 
        #  Schedule timeout alarm of next-to-timeout child.
        # 

        next_timeout = 0
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0 and children[i]['start'] != 0:
                if next_timeout == 0:
                    next_timeout = children[i]['start']+TIMEOUT - now
                elif children[i]['start']+TIMEOUT - now < next_timeout:
                    next_timeout = children[i]['start']+TIMEOUT - now
        if next_timeout >= 1:
            signal.alarm(next_timeout)

        # 
        #  If there are dispatched-but-not-yet-delivered signals then
        #  handle them. If there are none then wait for one.
        # 

        if next_timeout >= 1:
            signal.pause()

        # 
        #  If SIGCHLD arrived before SIGALRM then the alarm is still
        #  pending. Cancel it.
        # 

        signal.alarm(0)

    #
    #  Clean up and exit.
    #

    infomsg('parent cleaning up and exiting ...')
    signal.signal(signal.SIGUSR1, signal.SIG_DFL)
    signal.signal(signal.SIGALRM, signal.SIG_DFL)
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    return 0

def start_child_sleep(period):
    global children

    # 
    #  Find an empty slot to store info about the process we're
    #  about launch.
    # 

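    for i in range(0,MAX_CHILDREN):
        if children[i]['pid'] == 0:
            break
    else:
        #  The loop finished without finding a free slot (in Python i never
        #  reaches MAX_CHILDREN, so we cannot test for that as the C does).
        errormsg('start_child_sleep: unable to find a free slot')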

    # 
    #  Launch a child process and note its pid and start time
    #  in the empty slot.
    # 

    buf = 'sleep %d' % (period)
    try:
        pid = os.fork()
    except OSError as e:
        errormsg('fork() failed: %s', e.strerror)
    if pid > 0:
        children[i]['pid'] = pid
        children[i]['start'] = int(time.time())
        return pid

    # 
    #  Only the child gets here
    # 

    try:
        os.execlp('/bin/sh', 'sh', '-c', buf)
    except OSError as e:
        #  execlp() reports failure by raising, not by returning
        errormsg('exec() failed: %s', e.strerror)

def handler(sig, frame):
    global children

    if sig == signal.SIGCHLD:
        while True:
            #  waitpid() raises ChildProcessError if there are no children
            #  left, and returns pid 0 if none of them has exited yet
            try:
                pid = os.waitpid(-1, os.WNOHANG)[0]
            except ChildProcessError:
                break
            if pid <= 0:
                break
            for i in range(0,MAX_CHILDREN):
                if children[i]['pid'] == pid:
                    children[i]['pid'] = 0
                    children[i]['start'] = 0
                    break
    elif sig == signal.SIGALRM:
        infomsg('parent received SIGALRM')
    elif sig == signal.SIGUSR1:
        finfomsg(sys.stderr, 'parent received SIGUSR1')
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0:
                finfomsg(sys.stderr, 'slot:%04d; pid:%05d, start=%ld', 
                         i, children[i]['pid'], children[i]['start'])
    else:
        errormsg('parent received unexpected signal %s', sig)

def doubletime():
    now = time.time()

    if not hasattr(doubletime, "start"):
         doubletime.start = now

    return now-doubletime.start

def infomsg(fmt, *args):
    real_fmessage(inspect.stack()[1].function, sys.stdout, fmt, *args)

def errormsg(fmt, *args):

    real_fmessage(inspect.stack()[1].function, sys.stdout, fmt, *args)
    sys.exit(1)

def finfomsg(fp, fmt, *args):

    real_fmessage(inspect.stack()[1].function, fp, fmt, *args)

def ferrormsg(fp, fmt, *args):

    real_fmessage(inspect.stack()[1].function, fp, fmt, *args)
    sys.exit(1)

def real_fmessage(func, fp, fmt, *args):

    fp.write('%.06lf: %s: ' % (doubletime(), func))
    fp.write(fmt % args)
    fp.write('\n')

main()

Make the program executable and run it as follows:

lagane$ cat system-with-timeout-and-multichild-support-python.py > \
        system-with-timeout-and-multichild-support-python
lagane$ chmod +x system-with-timeout-and-multichild-support-python
lagane$ ./system-with-timeout-and-multichild-support-python
...

The points I wanted to illustrate with that example are:

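  1. it hangs because main() scans children[] and finds running child processes while, effectively simultaneously, the signal handler is marking those children as having been reaped; if the last SIGCHLD is handled after the scan but before the call to signal.pause(), then pause() waits for a signal that has already been dealt with (this is exactly the same problem that system-with-timeout-and-multichild-support2 had)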
  2. yes, I know that a comment in the code describes some global variables as “Macros” – remember I’m trying to help your diff tool by keeping source codes as similar as possible!
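
To see why a handler that runs at the wrong moment leaves main() waiting, here is a minimal sketch. It is an illustration with made-up names (lost_wakeup_demo, seen, safety_timer), not part of the programs in this article, and it assumes that the stall happens in the window between main()'s scan of children[] and its call to signal.pause(). A signal is handled completely before signal.pause() is called; pause() then knows nothing about it and only returns when a later signal, here a safety timer, arrives:

```python
import os
import signal
import time

seen = []                         # signals handled, in order of arrival

def handler(sig, frame):
    seen.append(sig)

def lost_wakeup_demo(safety_timer=0.3):
    seen.clear()
    signal.signal(signal.SIGUSR1, handler)
    signal.signal(signal.SIGALRM, handler)
    try:
        # The "event" arrives and is fully handled *before* we start waiting
        # (SIGUSR1 stands in for the last SIGCHLD).
        os.kill(os.getpid(), signal.SIGUSR1)
        while signal.SIGUSR1 not in seen:
            time.sleep(0.001)     # give Python the chance to run the handler

        # Safety net so that the demo terminates (the alarm() in main()).
        signal.setitimer(signal.ITIMER_REAL, safety_timer)
        start = time.monotonic()
        signal.pause()            # not woken by the SIGUSR1 handled earlier
        waited = time.monotonic() - start
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGUSR1, signal.SIG_DFL)
        signal.signal(signal.SIGALRM, signal.SIG_DFL)
    return list(seen), waited

if __name__ == '__main__':
    order, waited = lost_wakeup_demo()
    print('handled:', [signal.Signals(s).name for s in order])
    print('pause() blocked for %.2fs' % waited)
```

In the real program the safety timer is the alarm scheduled for the next child timeout, so the stall can last for up to TIMEOUT seconds.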

We can attempt to address this problem by getting the signal handler to request the main loop to modify children[] via a reliable messaging channel, rather than modifying children[] itself. First, we try this using Python’s queue module.

Copy and paste this source code into system-with-timeout-and-multichild-support-python2.py:

#!/usr/bin/python3

#
#  Modules
#

import inspect
import signal
import time
import os
import sys
import queue

#
#  Macros
#

MAX_CHILDREN   = 10000
CHILDREN       = 1000
TIMEOUT        = 300
A_LONG_TIME    = 3600
CHILD_RUN_TIME = lambda n: 1

#
#  Global variables
#

children = []

#
#  Functions
#

def main():
    global q

    # 
    #  Initialise.
    # 

    infomsg('parent initialising children status table ...')
    for i in range(0,MAX_CHILDREN):
        children.append({'pid':0})

    infomsg('parent setting up signal handlers ...')
    q = queue.Queue()
    signal.signal(signal.SIGCHLD, handler)
    signal.signal(signal.SIGALRM, handler)
    signal.signal(signal.SIGUSR1, handler)

    #  
    #  Start children.
    #  

    infomsg('parent starting %d children ...', CHILDREN)
    for i in range(0,CHILDREN):
        start_child_sleep(CHILD_RUN_TIME(i))

    # 
    #  Main monitoring loop
    # 

    infomsg('parent entering monitoring loop ...')
    while True:
        now = int(time.time())

        #
        #  Deal with incoming messages.
        #

        while True:
            try:
                pid = q.get(False)
            except queue.Empty:
                break
            for i in range(0,MAX_CHILDREN):
                if children[i]['pid'] == pid:
                    children[i]['pid'] = 0
                    children[i]['start'] = 0
                    break
     
    
        # 
        #  Exit if no running children.
        # 

        running_children_count = 0
        for i in range(0,MAX_CHILDREN):
            if  children[i]['pid'] != 0:
                running_children_count += 1
        if running_children_count == 0:
            break

        # 
        #  Kill any children that have reached their timeout time and not
        #  been killed already.
        # 

        killed_something = False
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0 and children[i]['start'] != 0 and \
                    now >= children[i]['start']+TIMEOUT:
                os.kill(children[i]['pid'], signal.SIGTERM)
                children[i]['start'] = 0
                killed_something = True

        # 
        #  Slight optimisation: if something did reach its timeout and got
        #  killed then skip to reassessing if this program can exit.
        # 

        if killed_something:
            continue

        # 
        #  Schedule timeout alarm of next-to-timeout child.
        # 

        next_timeout = 0
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0 and children[i]['start'] != 0:
                if next_timeout == 0:
                    next_timeout = children[i]['start']+TIMEOUT - now
                elif children[i]['start']+TIMEOUT - now < next_timeout:
                    next_timeout = children[i]['start']+TIMEOUT - now
        if next_timeout >= 1:
            signal.alarm(next_timeout)

        # 
        #  If there are dispatched-but-not-yet-delivered signals then
        #  handle them. If there are none then wait for one.
        # 

        if next_timeout >= 1:
            signal.pause()

        # 
        #  If SIGCHLD arrived before SIGALRM then the alarm is still
        #  pending. Cancel it.
        # 

        signal.alarm(0)

    #
    #  Clean up and exit.
    #

    infomsg('parent cleaning up and exiting ...')
    signal.signal(signal.SIGUSR1, signal.SIG_DFL)
    signal.signal(signal.SIGALRM, signal.SIG_DFL)
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    return 0

def start_child_sleep(period):
    global children

    # 
    #  Find an empty slot to store info about the process we're
    #  about launch.
    # 

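    for i in range(0,MAX_CHILDREN):
        if children[i]['pid'] == 0:
            break
    else:
        #  The loop finished without finding a free slot (in Python i never
        #  reaches MAX_CHILDREN, so we cannot test for that as the C does).
        errormsg('start_child_sleep: unable to find a free slot')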

    # 
    #  Launch a child process and note its pid and start time
    #  in the empty slot.
    # 

    buf = 'sleep %d' % (period)
    try:
        pid = os.fork()
    except OSError as e:
        errormsg('fork() failed: %s', e.strerror)
    if pid > 0:
        children[i]['pid'] = pid
        children[i]['start'] = int(time.time())
        return pid

    # 
    #  Only the child gets here
    # 

    try:
        os.execlp('/bin/sh', 'sh', '-c', buf)
    except OSError as e:
        #  execlp() reports failure by raising, not by returning
        errormsg('exec() failed: %s', e.strerror)

def handler(sig, frame):
    global q

    if sig == signal.SIGCHLD:
        while True:
            #  waitpid() raises ChildProcessError if there are no children
            #  left, and returns pid 0 if none of them has exited yet
            try:
                pid = os.waitpid(-1, os.WNOHANG)[0]
            except ChildProcessError:
                break
            if pid <= 0:
                break
            q.put(pid)
    elif sig == signal.SIGALRM:
        infomsg('parent received SIGALRM')
    elif sig == signal.SIGUSR1:
        finfomsg(sys.stderr, 'parent received SIGUSR1')
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0:
                finfomsg(sys.stderr, 'slot:%04d; pid:%05d, start=%ld', 
                         i, children[i]['pid'], children[i]['start'])
    else:
        errormsg('parent received unexpected signal %s', sig)

def doubletime():
    now = time.time()

    if not hasattr(doubletime, "start"):
         doubletime.start = now

    return now-doubletime.start

def infomsg(fmt, *args):
    real_fmessage(inspect.stack()[1].function, sys.stdout, fmt, *args)

def errormsg(fmt, *args):

    real_fmessage(inspect.stack()[1].function, sys.stdout, fmt, *args)
    sys.exit(1)

def finfomsg(fp, fmt, *args):

    real_fmessage(inspect.stack()[1].function, fp, fmt, *args)

def ferrormsg(fp, fmt, *args):

    real_fmessage(inspect.stack()[1].function, fp, fmt, *args)
    sys.exit(1)

def real_fmessage(func, fp, fmt, *args):

    fp.write('%.06lf: %s: ' % (doubletime(), func))
    fp.write(fmt % args)
    fp.write('\n')

main()

Make the script executable and run it as follows:

lagane$ cat system-with-timeout-and-multichild-support-python2.py > \
        system-with-timeout-and-multichild-support-python2
lagane$ chmod +x system-with-timeout-and-multichild-support-python2
lagane$ ./system-with-timeout-and-multichild-support-python2
...

The points I wanted to illustrate with that example are:

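  1. The queue module does not provide a way to implement reliable signal handling: queuing pids moves data safely out of the handler, but the handler itself is still invoked asynchronously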

We can try using System V IPC queues instead.

Copy and paste this source code into system-with-timeout-and-multichild-support-python3.py:

#!/usr/bin/python3

#
#  Modules
#

import inspect
import signal
import time
import os
import sys
import sysv_ipc

#
#  Macros
#

MAX_CHILDREN   = 10000
CHILDREN       = 1000
TIMEOUT        = 300
A_LONG_TIME    = 3600
CHILD_RUN_TIME = lambda n: 1

#
#  Global variables
#

children = []

#
#  Functions
#

def main():
    global q

    # 
    #  Initialise.
    # 

    infomsg('parent initialising children status table ...')
    for i in range(0,MAX_CHILDREN):
        children.append({'pid':0})

    infomsg('parent setting up signal handlers ...')
    q = sysv_ipc.MessageQueue(None, sysv_ipc.IPC_CREAT | sysv_ipc.IPC_EXCL)
    signal.signal(signal.SIGCHLD, handler)
    signal.signal(signal.SIGALRM, handler)
    signal.signal(signal.SIGUSR1, handler)

    #  
    #  Start children.
    #  

    infomsg('parent starting %d children ...', CHILDREN)
    for i in range(0,CHILDREN):
        start_child_sleep(CHILD_RUN_TIME(i))

    # 
    #  Main monitoring loop
    # 

    infomsg('parent entering monitoring loop ...')
    while True:
        now = int(time.time())

        #
        #  Deal with incoming messages.
        #

        while True:
            try:
                s, _ = q.receive(False)
                pid = int(s.decode())
            except sysv_ipc.BusyError:
                break
            for i in range(0,MAX_CHILDREN):
                if children[i]['pid'] == pid:
                    children[i]['pid'] = 0
                    children[i]['start'] = 0
                    break
     
    
        # 
        #  Exit if no running children.
        # 

        running_children_count = 0
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0:
                running_children_count += 1
        if running_children_count == 0:
            break

        # 
        #  Kill any children that have reached their timeout time and not
        #  been killed already.
        # 

        killed_something = False
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0 and children[i]['start'] != 0 and \
                    now >= children[i]['start']+TIMEOUT:
                os.kill(children[i]['pid'], signal.SIGTERM)
                children[i]['start'] = 0
                killed_something = True

        # 
        #  Slight optimisation: if something did reach its timeout and got
        #  killed then skip to reassessing if this program can exit.
        # 

        if killed_something:
            continue

        # 
        #  Schedule timeout alarm of next-to-timeout child.
        # 

        next_timeout = 0
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0 and children[i]['start'] != 0:
                if next_timeout == 0:
                    next_timeout = children[i]['start']+TIMEOUT - now
                elif children[i]['start']+TIMEOUT - now < next_timeout:
                    next_timeout = children[i]['start']+TIMEOUT - now
        if next_timeout >= 1:
            signal.alarm(next_timeout)

        # 
        #  If there are dispatched-but-not-yet-delivered signals then
        #  handle them. If there are none then wait for one.
        # 

        if next_timeout >= 1:
             signal.pause()

        # 
        #  If SIGCHLD arrived before SIGALRM then the alarm is still
        #  pending. Cancel it.
        # 

        signal.alarm(0)

    #
    #  Clean up and exit.
    #

    infomsg('parent cleaning up and exiting ...')
    signal.signal(signal.SIGUSR1, signal.SIG_DFL)
    signal.signal(signal.SIGALRM, signal.SIG_DFL)
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    #  System V message queues outlive the process unless removed
    q.remove()
    return 0

def start_child_sleep(period):
    global children

    # 
    #  Find an empty slot to store info about the process we're
    #  about to launch.
    # 

    for i in range(0,MAX_CHILDREN):
        if children[i]['pid'] == 0:
            break
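    else:
        #  for/else: reached only if the loop never hit 'break'; after a
        #  full Python loop i is MAX_CHILDREN-1, never MAX_CHILDREN
        errormsg('start_child_sleep: unable to find a free slot')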

    # 
    #  Launch a child process and note its pid and start time
    #  in the empty slot.
    # 

    buf = 'sleep %d' % (period)
    try:
        pid = os.fork()
    except OSError:
        errormsg('fork failed')
    if pid > 0:
        children[i]['pid'] = pid
        children[i]['start'] = int(time.time())
        return pid

    # 
    #  Only the child gets here
    # 

    try:
        os.execlp('/bin/sh', 'sh', '-c', buf)
    except OSError:
        #  execlp() raises rather than returns on failure
        errormsg('exec() failed')

def handler(sig, frame):
    global q

    if sig == signal.SIGCHLD:
        while True:
            #  waitpid() raises ChildProcessError when no children remain
            #  and returns pid 0 when children remain but none has exited
            try:
                pid = os.waitpid(-1, os.WNOHANG)[0]
            except ChildProcessError:
                break
            if pid <= 0:
                break
            q.send(str(pid))
    elif sig == signal.SIGALRM:
        infomsg('parent received SIGALRM')
    elif sig == signal.SIGUSR1:
        finfomsg(sys.stderr, 'parent received SIGUSR1')
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0:
                finfomsg(sys.stderr, 'slot:%04d; pid:%05d, start=%ld', 
                         i, children[i]['pid'], children[i]['start'])
    else:
        errormsg('parent received unexpected signal %s', sig)

def doubletime():
    now = time.time()

    if not hasattr(doubletime, "start"):
         doubletime.start = now

    return now-doubletime.start

def infomsg(fmt, *args):
    real_fmessage(inspect.stack()[1].function, sys.stdout, fmt, *args)

def errormsg(fmt, *args):

    real_fmessage(inspect.stack()[1].function, sys.stdout, fmt, *args)
    sys.exit(1)

def finfomsg(fp, fmt, *args):

    real_fmessage(inspect.stack()[1].function, fp, fmt, *args)

def ferrormsg(fp, fmt, *args):

    real_fmessage(inspect.stack()[1].function, fp, fmt, *args)
    sys.exit(1)

def real_fmessage(func, fp, fmt, *args):

    fp.write('%.06lf: %s: ' % (doubletime(), func))
    fp.write(fmt % args)
    fp.write('\n')

main()

Install the sysv_ipc module, make the script executable and run it as follows:

lagane$ pip3 install sysv-ipc
lagane$ cat system-with-timeout-and-multichild-support-python3.py > \
        system-with-timeout-and-multichild-support-python3
lagane$ chmod +x system-with-timeout-and-multichild-support-python3
lagane$ ./system-with-timeout-and-multichild-support-python3
...

The points I wanted to illustrate with that example are:

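  1. The sysv_ipc module does not provide a way to implement reliable signal handling either: a SIGCHLD delivered after the main loop has drained the message queue but before it reaches signal.pause() leaves the loop asleep until the next signal arrives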
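
The window left open by signal.pause() in the program above can be made visible in isolation. The following is a minimal sketch, not part of the programs above: it assumes a process that sends SIGUSR1 to itself as a stand-in for a child exiting at an awkward moment, and the names received and lost_wakeup_demo() are made up for the example. The signal is delivered before signal.pause() is called, so pause() has nothing left to wake it and sleeps until the safety-net alarm fires.

```python
import os
import signal
import time

received = []  # signal numbers seen by the handler


def handler(sig, frame):
    received.append(sig)


def lost_wakeup_demo(safety_net_secs=1):
    """Return how long pause() slept although a signal had already arrived."""
    signal.signal(signal.SIGUSR1, handler)
    signal.signal(signal.SIGALRM, handler)

    # Stand-in for "a child exited just after the main loop finished its
    # checks": the signal is delivered here, before pause() is reached.
    os.kill(os.getpid(), signal.SIGUSR1)

    start = time.monotonic()
    signal.alarm(safety_net_secs)  # without this, pause() would never return
    signal.pause()                 # SIGUSR1 is already gone; waits for SIGALRM
    elapsed = time.monotonic() - start

    signal.signal(signal.SIGUSR1, signal.SIG_DFL)
    signal.signal(signal.SIGALRM, signal.SIG_DFL)
    return elapsed


if __name__ == '__main__':
    print('pause() slept for %.1f seconds' % lost_wakeup_demo())
```

In this sketch the sleep lasts for the whole alarm period. In the program above the equivalent delay is however long the next child timeout is, which can be up to TIMEOUT seconds.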

Reliable signal processing in Python

pysigset provides wrappers around the OS’s sigset ecosystem and it works! It’s badly documented but hopefully that will be fixed.

The pysigset module may be available for your OS/distribution. If it is, install it using your package manager; otherwise install it by running:

lagane$ pip3 install pysigset
lagane$

Copy and paste this source code into system-with-timeout-and-multichild-support-python4.py:

#!/usr/bin/python3

#
#  Modules
#

import inspect
import signal
import time
import os
import sys
import pysigset

#
#  Macros
#

MAX_CHILDREN   = 10000
CHILDREN       = 1000
TIMEOUT        = 300
A_LONG_TIME    = 3600
CHILD_RUN_TIME = lambda n: 1

#
#  Global variables
#

children = []

#
#  Functions
#

def main():

    # 
    #  Initialise.
    # 

    infomsg('parent initialising children status table ...')
    for i in range(0,MAX_CHILDREN):
        children.append({'pid':0})

    infomsg('parent setting up signal handlers ...')
    # 
    #  Define the set of signals we want to handle. The set is used only
    #  for the sigprocmask() call below; the handlers are installed with
    #  signal.signal(), so there is no sigaction() call that needs a list
    #  of additional signals to block while a handler executes.
    # 

    sigset = pysigset.SIGSET()
    pysigset.sigaddset(sigset, signal.SIGCHLD)
    pysigset.sigaddset(sigset, signal.SIGALRM)
    pysigset.sigaddset(sigset, signal.SIGUSR1)
    #  cause delivery of SIGCHLD, etc to be delayed until we call sigsuspend()
    old_sigset = pysigset.SIGSET()
    pysigset.sigprocmask(signal.SIG_BLOCK, sigset, old_sigset)
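    #  later we need list of signals blocked before that sigprocmask() call,
    #  but *excluding* SIGCHLD, etc. Plain assignment would only alias
    #  old_sigset, so fetch the mask into a separate SIGSET instead
    #  (blocking the same signals a second time changes nothing)
    suspend_sigset = pysigset.SIGSET()
    pysigset.sigprocmask(signal.SIG_BLOCK, sigset, suspend_sigset)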
    pysigset.sigdelset(suspend_sigset, signal.SIGCHLD)
    pysigset.sigdelset(suspend_sigset, signal.SIGALRM)
    pysigset.sigdelset(suspend_sigset, signal.SIGUSR1)
    #  establish signal handlers
    signal.signal(signal.SIGCHLD, handler)
    signal.signal(signal.SIGALRM, handler)
    signal.signal(signal.SIGUSR1, handler)

    #  
    #  Start children.
    #  

    infomsg('parent starting %d children ...', CHILDREN)
    for i in range(0,CHILDREN):
        start_child_sleep(CHILD_RUN_TIME(i))

    # 
    #  Main monitoring loop
    # 

    infomsg('parent entering monitoring loop ...')
    while True:
        now = int(time.time())

        # 
        #  Exit if no running children.
        # 

        running_children_count = 0
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0:
                running_children_count += 1
        if running_children_count == 0:
            break

        # 
        #  Kill any children that have reached their timeout time and not
        #  been killed already.
        # 

        killed_something = False
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0 and children[i]['start'] != 0 and \
                    now >= children[i]['start']+TIMEOUT:
                os.kill(children[i]['pid'], signal.SIGTERM)
                children[i]['start'] = 0
                killed_something = True

        # 
        #  Slight optimisation: if something did reach its timeout and got
        #  killed then skip to reassessing if this program can exit.
        # 

        if killed_something:
            continue

        # 
        #  Schedule timeout alarm of next-to-timeout child.
        # 

        next_timeout = 0
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0 and children[i]['start'] != 0:
                if next_timeout == 0:
                    next_timeout = children[i]['start']+TIMEOUT - now
                elif children[i]['start']+TIMEOUT - now < next_timeout:
                    next_timeout = children[i]['start']+TIMEOUT - now
        if next_timeout >= 1:
            signal.alarm(next_timeout)

        # 
        #  If there are dispatched-but-not-yet-delivered signals then
        #  handle them. If there are none then wait for one.
        # 

        if next_timeout >= 1:
             pysigset.sigsuspend(suspend_sigset)

        # 
        #  If SIGCHLD arrived before SIGALRM then the alarm is still
        #  pending. Cancel it.
        # 

        signal.alarm(0)

    #
    #  Clean up and exit.
    #

    infomsg('parent cleaning up and exiting ...')
    signal.signal(signal.SIGUSR1, signal.SIG_DFL)
    signal.signal(signal.SIGALRM, signal.SIG_DFL)
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    pysigset.sigprocmask(signal.SIG_SETMASK, old_sigset, None)
    return 0

def start_child_sleep(period):
    global children

    # 
    #  Find an empty slot to store info about the process we're
    #  about to launch.
    # 

    for i in range(0,MAX_CHILDREN):
        if children[i]['pid'] == 0:
            break
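    else:
        #  for/else: reached only if the loop never hit 'break'; after a
        #  full Python loop i is MAX_CHILDREN-1, never MAX_CHILDREN
        errormsg('start_child_sleep: unable to find a free slot')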

    # 
    #  Launch a child process and note its pid and start time
    #  in the empty slot.
    # 

    buf = 'sleep %d' % (period)
    try:
        pid = os.fork()
    except OSError:
        errormsg('fork failed')
    if pid > 0:
        children[i]['pid'] = pid
        children[i]['start'] = int(time.time())
        return pid

    # 
    #  Only the child gets here
    # 

    try:
        os.execlp('/bin/sh', 'sh', '-c', buf)
    except OSError:
        #  execlp() raises rather than returns on failure
        errormsg('exec() failed')

def handler(sig, frame):
    global children

    if sig == signal.SIGCHLD:
        while True:
            #  waitpid() raises ChildProcessError when no children remain
            #  and returns pid 0 when children remain but none has exited
            try:
                pid = os.waitpid(-1, os.WNOHANG)[0]
            except ChildProcessError:
                break
            if pid <= 0:
                break
            for i in range(0,MAX_CHILDREN):
                if children[i]['pid'] == pid:
                    children[i]['pid'] = 0
                    children[i]['start'] = 0
                    break
    elif sig == signal.SIGALRM:
        infomsg('parent received SIGALRM')
    elif sig == signal.SIGUSR1:
        finfomsg(sys.stderr, 'parent received SIGUSR1')
        for i in range(0,MAX_CHILDREN):
            if children[i]['pid'] != 0:
                finfomsg(sys.stderr, 'slot:%04d; pid:%05d, start=%ld', 
                         i, children[i]['pid'], children[i]['start'])
    else:
        errormsg('parent received unexpected signal %s', sig)

def doubletime():
    now = time.time()

    if not hasattr(doubletime, "start"):
         doubletime.start = now

    return now-doubletime.start

def infomsg(fmt, *args):
    real_fmessage(inspect.stack()[1].function, sys.stdout, fmt, *args)

def errormsg(fmt, *args):

    real_fmessage(inspect.stack()[1].function, sys.stdout, fmt, *args)
    sys.exit(1)

def finfomsg(fp, fmt, *args):

    real_fmessage(inspect.stack()[1].function, fp, fmt, *args)

def ferrormsg(fp, fmt, *args):

    real_fmessage(inspect.stack()[1].function, fp, fmt, *args)
    sys.exit(1)

def real_fmessage(func, fp, fmt, *args):

    fp.write('%.06lf: %s: ' % (doubletime(), func))
    fp.write(fmt % args)
    fp.write('\n')

main()

Make the script executable and run it as follows:

lagane$ cat system-with-timeout-and-multichild-support-python4.py > \
        system-with-timeout-and-multichild-support-python4
lagane$ chmod +x system-with-timeout-and-multichild-support-python4
lagane$ ./hanger system-with-timeout-and-multichild-support-python4
<no-output>

The points I wanted to illustrate with that example are:

  1. pysigset provides reliable signal handling in Python

Conclusions

  1. use the sigset ecosystem; it is much more robust and allows finer control (e.g. SA_RESTART flag) than the signal() system call
  2. sigprocmask() and sigsuspend() allow us to define a particular place in the main loop where signals can be handled safely and effectively synchronously
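
As an illustration of point 2 only, here is a minimal sketch that uses nothing but Python's standard signal module. It is an alternative shown under stated assumptions, not the pysigset method used above: the standard module does not expose sigsuspend(), so the sketch swaps in signal.sigtimedwait(), which collects a blocked signal directly instead of running a handler. It assumes a platform where Python provides signal.pthread_sigmask() and signal.sigtimedwait() (Linux does), and the names WATCHED, wait_for_signal() and demo() are made up for the example.

```python
import os
import signal

# The signals the main loop wants to deal with at a place of its choosing.
WATCHED = {signal.SIGCHLD, signal.SIGALRM, signal.SIGUSR1}


def wait_for_signal(timeout_secs):
    """Return the number of the next pending watched signal, or None on timeout."""
    info = signal.sigtimedwait(WATCHED, timeout_secs)
    return None if info is None else info.si_signo


def demo():
    # Block the watched signals: from here on they stay pending until
    # wait_for_signal() asks for them, so none can slip past unnoticed.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, WATCHED)
    try:
        # Stand-in for a signal that arrives while the main loop is busy.
        os.kill(os.getpid(), signal.SIGUSR1)
        first = wait_for_signal(5)   # SIGUSR1 is already pending: no waiting
        second = wait_for_signal(0)  # poll only: nothing else is pending
        return first, second
    finally:
        # Restore the signal mask that was in force before the demo.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == '__main__':
    print(demo())
```

The design point is the same as with sigprocmask() and sigsuspend(): the signal cannot be lost in the gap between checking and waiting, because it stays pending until the loop explicitly collects it.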

    See also