
Re: Realloc is blocking execution



Hi,

On Fri, 16 Oct 2009, Mats Erik Andersson wrote:

2. The window manager WM responds to a user request by issuing

   execlp( /bin/sh -c "pkill -HUP WM" )


4. The main process WM receives SIGHUP, and enters a signal handler.
  The signal handler uses two calls: free_menuitems(), get_menuitems().

5. The original structure menuitems is successfully returned
  to the memory heap for allocation. This was free_menuitems().

6. Call get_menuitems():

  a.   menuitems = malloc(1024)

  b.   Fill menuitems exactly as was done in step 0.


0. During initialisation of WM, these steps are performed:

     menuitems = malloc(1024) // 64 items of 16 bytes each

     Parse a user resource file, thereby filling a small number
     of items with two pointers, each pointing to strings allocated
     separately by malloc(). In fact,

           struct { char * a, * b;
                    int c, d;
              } menuitems[];


  c.   menuitems = realloc(menuitems, 1024)

       SLEEP-BLOCKING !!!!!!



Thus the main process seems to call at least the following functions in a
signal handler that is running due to the delivery of an asynchronous signal:
- free()
- malloc()
- fopen() and other stdio functions

This is not portable: none of these functions is async-signal-safe. (If I recall correctly, the main topic in this thread is the portability of the WM, as it works elsewhere but not on Debian Lenny.) See [0] and [1]:

    Many of the other functions that are excluded from the list are
    traditionally implemented using either malloc() or free() functions or
    the standard I/O library, both of which traditionally use data
    structures in a non-reentrant manner. Since any combination of different
    functions using a common data structure can cause reentrancy problems,
    this volume of IEEE Std 1003.1-2001 does not define the behavior when
    any unsafe function is called in a signal handler that interrupts an
    unsafe function.

    If the signal occurs other than as the result of calling abort(),
    kill(), or raise(), the behavior is undefined if the signal handler
    calls any function in the standard library other than one of the
    functions listed in the table above or refers to any object with static
    storage duration other than by assigning a value to a static storage
    duration variable of type volatile sig_atomic_t.

(Asynchronous) signals are software interrupts. You want to get out of the interrupt handler as quickly and cleanly as you can. Signal handlers are not run-of-the-mill callbacks: unless the signal is masked, the kernel can invoke them between any pair of adjacent machine-level instructions of program text, by externally manipulating the program's stack and instruction pointer.

My suggestion: since you have an event loop anyway, you surely block in a function that can be interrupted by signals. You should block signals everywhere else in the program. No other part of the program should block waiting for resources, so there is no need to allow signals to be delivered anywhere else.

A good function for this is pselect() [2]. In the signal handler for SIGHUP, you should only set a variable like

    static volatile sig_atomic_t caught_hup;

and after pselect() returns, check/reset this variable. If a HUP signal was caught, do the cleanup / reconfiguration on the normal stack, not on the signal stack.
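
To make this concrete, here is a minimal sketch of what I mean, not a drop-in patch for the WM. The names on_hup(), setup_hup() and wait_event() are made up for illustration, and I assume the event loop waits on a single file descriptor (for example the X connection). SIGHUP stays blocked except while pselect() is waiting, and the handler does nothing but set the flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t caught_hup;

/* Async-signal-safe: the handler only stores to a volatile sig_atomic_t. */
static void on_hup(int signo)
{
    (void)signo;
    caught_hup = 1;
}

/* Hypothetical helper: block SIGHUP for normal execution and install the
 * handler. On success *waitmask holds the mask to pass to pselect(), that
 * is, the previous mask with SIGHUP unblocked. */
static int setup_hup(sigset_t *waitmask)
{
    sigset_t block;
    struct sigaction sa;

    sigemptyset(&block);
    sigaddset(&block, SIGHUP);
    if (sigprocmask(SIG_BLOCK, &block, waitmask) == -1)
        return -1;
    sigdelset(waitmask, SIGHUP);

    sa.sa_handler = on_hup;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGHUP, &sa, NULL);
}

/* Hypothetical helper: one iteration of the event loop's wait.
 * Returns 1 if a HUP was caught (the caller reloads its configuration on
 * the normal stack), 0 if fd became readable or the timeout expired,
 * and -1 on a real error. If both happen at once, 1 is returned and the
 * readable fd is picked up by the next call. */
static int wait_event(int fd, const sigset_t *waitmask,
                      const struct timespec *timeout)
{
    fd_set rfds;
    int n;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    /* SIGHUP can only be delivered inside this call. */
    n = pselect(fd + 1, &rfds, NULL, NULL, timeout, waitmask);
    if (n == -1 && errno != EINTR)
        return -1;

    if (caught_hup) {
        caught_hup = 0;
        return 1;
    }
    return 0;
}
```

In the WM, the loop would call wait_event() and, when it returns 1, run free_menuitems() and get_menuitems() right there in the loop body. At that point malloc(), free() and stdio are perfectly fine, because nothing was interrupted in the middle of a heap operation.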

pselect() is the king of functions in a way. It can synchronously multiplex on more or less *all* kinds of input a UNIX process can consume: file descriptors, timeouts (you order them and choose the first), and signals. pselect() is there to be the core of your event loop.
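
For the timeout part, "order them and choose the first" can be sketched roughly as follows. The helper next_timeout() is hypothetical, and I assume the timers are kept as absolute deadlines on the same clock as "now". The result is what you would hand to pselect() as its timeout argument.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: compute the time remaining until the earliest of n
 * absolute deadlines. Returns out, or NULL when there are no deadlines
 * (a NULL timeout makes pselect() wait indefinitely). A deadline that has
 * already passed yields a zero timeout, so pselect() only polls. */
static struct timespec *next_timeout(const struct timespec *deadlines,
                                     size_t n,
                                     const struct timespec *now,
                                     struct timespec *out)
{
    size_t i, first = 0;

    if (n == 0)
        return NULL;

    /* Find the earliest deadline. */
    for (i = 1; i < n; i++) {
        if (deadlines[i].tv_sec < deadlines[first].tv_sec ||
            (deadlines[i].tv_sec == deadlines[first].tv_sec &&
             deadlines[i].tv_nsec < deadlines[first].tv_nsec))
            first = i;
    }

    /* Remaining time = deadline - now, with a borrow on nanoseconds. */
    out->tv_sec = deadlines[first].tv_sec - now->tv_sec;
    out->tv_nsec = deadlines[first].tv_nsec - now->tv_nsec;
    if (out->tv_nsec < 0) {
        out->tv_sec -= 1;
        out->tv_nsec += 1000000000L;
    }
    if (out->tv_sec < 0) {
        out->tv_sec = 0;
        out->tv_nsec = 0;
    }
    return out;
}
```

After pselect() returns, the loop would re-read the clock and run whichever timers have expired, again on the normal stack.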

$ man select_tut
/COMBINING SIGNAL AND DATA EVENTS

Keywords (in a single threaded process): sigemptyset(), sigaddset(), sigprocmask(), sigaction(), pselect().

Cheers,
lacos


[0] http://www.opengroup.org/onlinepubs/000095399/functions/xsh_chap02_04.html#tag_02_04_03
[1] http://www.opengroup.org/onlinepubs/000095399/functions/sigaction.html#tag_03_680_07
[2] http://www.opengroup.org/onlinepubs/000095399/functions/pselect.html

