Thursday, March 23, 2006

Canceling Worker Threads


Here's an old post of mine on thread cancelation techniques. The message is dated but I think the conclusion still holds. In short, the question is: how does one cancel a worker thread? Here are some options from that post (slightly revised):

1) Asynchronous cancelation: The general consensus on comp.programming.threads is that this is bad. The target thread will be left in an indeterminate state and, if it was doing anything even remotely interesting, you'll likely experience resource leaks or crashes.

2) Deferred cancelation: A nice, standard way of going about cancelation, but it suffers from some serious issues. First, an implementation can conform to the POSIX standard and still leave target threads blocked in poll and select (neither is on the list of cancelation points under POSIX 1996). Second, cancelation cleanup under pthreads and cleanup under C++ do not usually work together. On most platform/compiler combinations a canceled thread will execute pthread cancelation handlers but will not execute catch blocks or destructors for local objects.

3) Signals: This solution seems to work well but only under some rather strict guidelines: your library client should not use signals, should not create threads except via your library and should not make use of other libraries that create threads. However, if this suits you fine, then cancelation using signals allows you to wake a thread stuck in a blocking call. The call fails with errno set to EINTR, which your client's code can handle as it sees fit. If your client returns or throws in response to the failure, C++ catch blocks and destructors will be called as expected.

Note: The standard (3.3.1.4 Signal Effects on Other Functions) has this to say about functions that are interrupted by signal handlers: "If the signal-catching function executes a return the behavior of the interrupted function shall be as described individually for that function". I guess this means it may not be portable to assume that the function will fail, or that errno will be set to EINTR when it does (although that's what you get on Linux).
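
A minimal sketch of the signal approach, assuming a Linux-like system where a blocked read fails with EINTR when a handler installed without SA_RESTART returns. All names here (on_wake, Guard, Canceled, blocking_read, cancel_blocked_worker) are hypothetical and not from the original post. The canceling side resends the signal until the worker reports it is done, because a signal that arrives just before the worker enters read would otherwise be lost.

```cpp
#include <cerrno>
#include <csignal>
#include <pthread.h>
#include <unistd.h>
#include <atomic>

// Hypothetical names throughout; this is an illustration, not a library API.
static std::atomic<bool> g_cancel(false);       // set by the canceling thread
static std::atomic<bool> g_done(false);         // set by the worker on exit
static std::atomic<bool> g_cleanup_ran(false);  // set by Guard's destructor

// Empty handler: it exists only so that a blocking call gets interrupted.
extern "C" void on_wake(int) {}

struct Canceled {};                              // thrown to unwind the worker
struct Guard { ~Guard() { g_cleanup_ran = true; } };

// Blocks in read() until data, EOF, an error, or a cancel request.
void blocking_read(int fd) {
    char buf[64];
    for (;;) {
        if (g_cancel) throw Canceled();
        ssize_t n = read(fd, buf, sizeof buf);
        if (n >= 0) return;          // data or EOF
        if (errno != EINTR) return;  // some other failure
        // EINTR: loop around and re-check the cancel flag
    }
}

void* worker(void* arg) {
    int fd = *static_cast<int*>(arg);
    try {
        Guard g;                     // destructor runs during unwinding
        blocking_read(fd);
    } catch (const Canceled&) {
        // normal C++ cleanup has already happened by this point
    }
    g_done = true;
    return nullptr;
}

// Starts a worker that blocks on an empty pipe, then cancels it.
// Returns true if the worker's destructor-based cleanup ran.
bool cancel_blocked_worker() {
    struct sigaction sa = {};
    sa.sa_handler = on_wake;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                 // no SA_RESTART, so read() is not restarted
    if (sigaction(SIGUSR1, &sa, nullptr) != 0) return false;

    int fds[2];
    if (pipe(fds) != 0) return false;
    pthread_t t;
    if (pthread_create(&t, nullptr, worker, &fds[0]) != 0) return false;

    usleep(50000);                   // let the worker block (not needed for correctness)
    g_cancel = true;
    while (!g_done) {                // resend: one signal may land before read()
        pthread_kill(t, SIGUSR1);
        usleep(1000);
    }
    pthread_join(t, nullptr);
    close(fds[0]);
    close(fds[1]);
    return g_cleanup_ran;
}
```

Calling cancel_blocked_worker() from main should return true once the worker has unwound. The sketch leans on exactly the behavior the note above flags as possibly non-portable, so treat it as a Linux illustration rather than a guarantee.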

4) Pipes or local sockets: IMO, the least intrusive solution of the bunch if your code follows the reactor pattern. Your poll or select call includes a descriptor that is only used to wake it when the thread is to be canceled. This approach returns from blocked select and poll calls and works well with C++, but it does nothing for you if your thread is stuck in anything other than select or poll. I've taken this approach for my reactor-based code, noting that clients should not make calls that take "too long".
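
The wake-up descriptor idea can be sketched roughly as follows, assuming a worker that polls one data descriptor plus one pipe reserved for cancelation. The names (Reactor, Cleanup, reactor_loop, cancel_reactor_via_pipe) are made up for illustration and are not from the original post or any library.

```cpp
#include <cerrno>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>
#include <atomic>

// Hypothetical reactor state: one data descriptor plus a wake-up descriptor.
struct Reactor {
    int data_fd;
    int wake_fd;                     // read end of the cancelation pipe
    std::atomic<bool> cleaned_up;
    Reactor(int d, int w) : data_fd(d), wake_fd(w), cleaned_up(false) {}
};

// Stands in for any local object whose destructor must run on exit.
struct Cleanup {
    std::atomic<bool>& flag;
    explicit Cleanup(std::atomic<bool>& f) : flag(f) {}
    ~Cleanup() { flag = true; }
};

void* reactor_loop(void* arg) {
    Reactor* r = static_cast<Reactor*>(arg);
    Cleanup cleanup(r->cleaned_up);  // runs on every return path below
    for (;;) {
        pollfd fds[2];
        fds[0].fd = r->data_fd; fds[0].events = POLLIN; fds[0].revents = 0;
        fds[1].fd = r->wake_fd; fds[1].events = POLLIN; fds[1].revents = 0;
        if (poll(fds, 2, -1) < 0) {
            if (errno == EINTR) continue;
            return nullptr;          // unexpected poll failure
        }
        if (fds[1].revents != 0) return nullptr;  // cancel requested
        if (fds[0].revents != 0) {
            char buf[256];
            if (read(r->data_fd, buf, sizeof buf) <= 0) return nullptr;
            // ... dispatch to a handler that does not take "too long" ...
        }
    }
}

// Starts the reactor thread, then cancels it by writing to the wake pipe.
// Returns true if the thread exited and its destructor-based cleanup ran.
bool cancel_reactor_via_pipe() {
    int data[2], wake[2];
    if (pipe(data) != 0 || pipe(wake) != 0) return false;
    Reactor r(data[0], wake[0]);
    pthread_t t;
    if (pthread_create(&t, nullptr, reactor_loop, &r) != 0) return false;

    char byte = 'x';
    bool wrote = write(wake[1], &byte, 1) == 1;  // wakes the blocked poll()
    pthread_join(t, nullptr);

    close(data[0]); close(data[1]);
    close(wake[0]); close(wake[1]);
    return wrote && r.cleaned_up;
}
```

Because the byte stays in the pipe until it is read, it does not matter whether the write happens before or after the worker enters poll. The thread leaves through an ordinary return, so destructors for local objects run as usual.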

