Posted by: Airtower | 2010-06-16

Catch SIGTERM, exit gracefully

I knew that programs can catch a SIGTERM and exit gracefully. What I didn’t know is how to do that, and that it’s actually quite simple. You need just two things:

  1. A function that will cause your program to exit gracefully
  2. The sigaction() function and the struct of the same name, defined in the system header signal.h.

Example

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

volatile sig_atomic_t done = 0;

void term(int signum)
{
	done = 1;
}

int main(int argc, char *argv[])
{
	struct sigaction action;
	memset(&action, 0, sizeof(struct sigaction));
	action.sa_handler = term;
	sigaction(SIGTERM, &action, NULL);

	int loop = 0;
	while (!done)
	{
		int t = sleep(3);
		/* sleep returns the number of seconds left if
		 * interrupted */
		while (t > 0)
		{
			printf("Loop run was interrupted with %d "
			       "sec to go, finishing...\n", t);
			t = sleep(t);
		}
		printf("Finished loop run %d.\n", loop++);
	}

	printf("done.\n");
	return 0;
}

How does it work?

sigaction(SIGTERM, &action, NULL) sets the action to be performed for the SIGTERM signal to the term() function. This means that term() will be called when the program receives a SIGTERM. It will set the global variable done to 1, and that will cause the loop to stop after finishing the current run. That’s it!

Hints

  1. sig_atomic_t is an integer type that can be read and written atomically even in the presence of asynchronous signals, so the assignment to done cannot be interrupted halfway, e.g. if another signal arrives while the handler is running. It makes no promise about access from multiple threads. If you use glibc, it’s probably identical to int.
  2. done is declared volatile to let the compiler know it might change asynchronously. Otherwise an optimizing compiler may assume that since done does not change inside the loop the check can be omitted, creating an endless loop.
  3. Extending the point above, a signal handler function can be called at unpredictable times and must not mess up shared data structures. This restricts what kinds of calls you can safely make inside a handler. The signal(7) manpage has a list of Async-signal-safe functions.
  4. I wrote this example in such a way that term() will be called only for SIGTERM, but in a real program you should check the value of signum.
  5. You should also check the return value of sigaction() to make sure the call was successful.
  6. The third argument to sigaction() could be a struct sigaction* to store the previously set action.
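
Hints 4 to 6 could be combined roughly as follows. This is only a sketch: install_term_handler() is a hypothetical helper name that does not appear in the example above, and the feature test macro at the top is an assumption that only matters if you compile in a strict ISO C mode.

```c
/* Assumption: request POSIX declarations when compiling with e.g. -std=c99 */
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>

volatile sig_atomic_t done = 0;

/* Hint 4: check which signal the handler was called for. */
void term(int signum)
{
	if (signum == SIGTERM)
		done = 1;
}

/* Hypothetical helper: installs term() as the SIGTERM handler.
 * Returns 0 on success and -1 on failure. If old is not NULL, the
 * previously set action is stored there (hint 6). */
int install_term_handler(struct sigaction *old)
{
	struct sigaction action;
	memset(&action, 0, sizeof(action));
	action.sa_handler = term;
	sigemptyset(&action.sa_mask);

	/* Hint 5: check the return value of sigaction(). */
	if (sigaction(SIGTERM, &action, old) == -1)
	{
		perror("sigaction");
		return -1;
	}
	return 0;
}
```

In main() you would call install_term_handler(&old) before entering the loop and bail out if it returns -1. Passing the saved struct back to sigaction(SIGTERM, &old, NULL) later restores whatever action was set before.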

Of course, you can add handlers for other signals the same way. Take a look at the sigaction(2) manpage! struct sigaction also has some more options to configure signal handling.
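
As a sketch of what "the same way" might look like, the following installs one handler for both SIGTERM and SIGINT and records which of them arrived. The choice of SIGINT, the variable got_signal and the helper install_handlers() are assumptions made for illustration. The sa_mask field is one of the additional options of struct sigaction: signals listed there are blocked while the handler runs.

```c
/* Assumption: request POSIX declarations when compiling with e.g. -std=c99 */
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* 0 while no signal has arrived, otherwise the number of the last one */
volatile sig_atomic_t got_signal = 0;

void term(int signum)
{
	got_signal = signum;
}

/* Hypothetical helper: installs term() for SIGTERM and SIGINT.
 * Returns 0 on success, -1 as soon as one sigaction() call fails. */
int install_handlers(void)
{
	const int signals[] = { SIGTERM, SIGINT };
	struct sigaction action;
	size_t i;

	memset(&action, 0, sizeof(action));
	action.sa_handler = term;
	/* Block both signals while the handler runs, so that one cannot
	 * interrupt the handler call for the other. */
	sigemptyset(&action.sa_mask);
	sigaddset(&action.sa_mask, SIGTERM);
	sigaddset(&action.sa_mask, SIGINT);

	for (i = 0; i < sizeof(signals) / sizeof(signals[0]); i++)
	{
		if (sigaction(signals[i], &action, NULL) == -1)
			return -1;
	}
	return 0;
}
```

The main loop would then run while (!got_signal), and afterwards got_signal tells you whether the program was stopped by kill or by Ctrl+C.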

Running the example

I compiled the example into sigterm-example. Starting it and sending a SIGTERM by using

$ kill PID

(replace PID with the actual process ID) results in the following output:

$ ./sigterm-example
Finished loop run 0.
Finished loop run 1.
Finished loop run 2.
Loop run was interrupted with 2 sec to go, finishing...
Finished loop run 3.
done.

Of course, the “Loop run was interrupted […]” message will only be printed if the SIGTERM actually hits during sleep().

To show the difference between SIGTERM and SIGKILL (which cannot be caught), another run with kill -SIGKILL PID:

$ ./sigterm-example
Finished loop run 0.
Finished loop run 1.
Finished loop run 2.
Killed

You can see that SIGTERM lets the program finish its work, while SIGKILL forces it to terminate immediately. Try adding a handler for SIGKILL if you don’t believe me! 😛 If a bug in your SIGTERM handling makes the program refuse to stop, you’ll know what to do. 😉
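
If you want to try that without modifying the example, here is a small sketch. The function name try_catch_sigkill() and the do-nothing handler are made up for illustration. POSIX specifies that sigaction() fails with EINVAL when you attempt to catch a signal that cannot be caught, so no handler is ever installed.

```c
/* Assumption: request POSIX declarations when compiling with e.g. -std=c99 */
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>

/* Handler that would do nothing, if it could ever be installed */
void never_called(int signum)
{
	(void)signum;
}

/* Hypothetical helper: tries to install a handler for SIGKILL.
 * Returns the errno value set by sigaction() if the call fails,
 * or 0 if it unexpectedly succeeds. */
int try_catch_sigkill(void)
{
	struct sigaction action;
	memset(&action, 0, sizeof(action));
	action.sa_handler = never_called;
	sigemptyset(&action.sa_mask);

	if (sigaction(SIGKILL, &action, NULL) == -1)
		return errno;
	return 0;
}
```

Calling try_catch_sigkill() returns EINVAL. This is also a practical reason to check the return value of sigaction().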

This post was updated on 2013-05-23 after I noticed that the signal(2) manpage discourages the use of signal() in favor of sigaction().

Update 2014-10-02: done is now declared as volatile sig_atomic_t, the unsafe printf() call has been removed from the example handler, and a comment about async-signal safety has been added (Thanks, Anders!).

Responses

  1. Thanks for the guide!

    A few remarks:

    Most calls are unsafe in a signal handler, and this includes printf(). In particular, if the signal is caught while you’re inside printf() (small chance each time but it will happen eventually), you’ll end up calling it twice at the same time which might not be a good idea.

    Also, since the “done” variable isn’t volatile, the compiler would be within its rights to assume that the variable cannot change during execution of the loop, and might optimize the repeated check away. Furthermore, according to the standard the behavior is undefined unless your variable is defined as “volatile sig_atomic_t”.

    Personally, I’d also make “done” static, to make sure that external code can’t fake a SIGTERM by modifying the variable. They’d have to go the proper way and actually send the signal, which also causes sleep() to return immediately instead of waiting the three seconds.

    Finally, while you’re correct that a SIGKILL can’t be caught, please note that you’re not setting up a signal handler for it. It’s not surprising that you’re not catching signals that you’re not even attempting to catch.

    • Thank you for the detailed comment! You can find my opinion on the issues you mention below.

      Using printf() in the signal handler: Good point, though it’s not an issue for example code. I’ll add a comment and reference link regarding what is safe to do in signal handlers later.

      Making the loop variable a sig_atomic_t and volatile: Again, good points. In practice, however, int is atomic on all platforms supported by glibc, so unless someone has a really exotic system, the type is more of an academic issue. 🙂

      Making done static: That really depends on the use case. Changing done doesn’t really simulate SIGTERM, it just ends the loop after the current run, and there may well be a legitimate reason to do so. Either way, I think that topic is really out of scope for a signal handling example.

      SIGKILL: I think it is obvious that there’s no handler for SIGKILL in the example. The output listing with SIGKILL is not about proving it can’t be caught, but rather to show the difference between controlled shutdown and immediate kill. Maybe I’ll add a remark that readers can try catching SIGKILL if they don’t believe me. 😀

      I’m going to update the post some time soon. 😉

    • The update is complete now. In the end I decided to remove the printf() call from the handler, because example code should not set a bad example. 😉 Thanks again!

  2. A quick clarification: would the above code be able to exit gracefully if
    1. sudo reboot
    2. sudo poweroff

    is executed?

    • Technically it depends on your init system (e.g. System-V init or systemd). In practice, all init systems I know of will do something like the following:

      1. Send SIGTERM to all running processes. SIGTERM handlers will be called at this point.
      2. Wait a certain time for processes to stop.
      3. SIGKILL any processes that are still running.

      So yes, your SIGTERM handler will be called during an orderly shutdown, but make sure your program does not take too long to terminate afterwards. Depending on what your code does, you might want to set it up as a system service. If you need details or some very specific behaviour, look at the documentation of your init system.

