Re: "rc" shell maintainer?
On 05 Feb 1998 05:01:30 GMT, smarry@pantransit.smar.reptiles.org wrote:
> Hello, this is Marc Moorcroft. I joined the list shortly after
> running into two problems with rc-1.5b2 under Linux.
>
> One definite bug (reported to Tim Goodwin) will cause rc to spin calling
> wait(2) if you do:
>
> {ls & wait} | cat
>
> There is another problem with signal handling that is a little more
> complicated. When I upgraded to the 2.0.33 kernel, rc began hanging
> occasionally when I interrupted programs, and when I finally got irritated
> enough to check it thoroughly, I found that it failed trip.rc at:
>
> kill -2 $pid
>
> The relevant code is rc_wait() in wait.c:
>
> static pid_t rc_wait(int *stat) {
>     int r;
>     interrupt_happened = FALSE;
>     if (!setjmp(slowbuf.j)) {
>         slow = TRUE;
>         if (!interrupt_happened)
>             r = wait(stat);
>         else
>             r = -1;
>     } else
>         r = -1;
>     slow = FALSE;
>     return r;
> }
>
> It appears that some of the time, Linux will return from the wait(2)
> for the 'kill' process before the signal gets delivered. On Linux
> installations where signal(2) has the System V behaviour (system calls
> are interrupted for signals that are caught via signal(2)) rc longjmps
> out of the signal handler (a rather alarming practice in itself) to the
> top of the enclosing code in rc_wait(). The sequence of events appears
> to be:
>
> The signal is sent,
>
> the process exits,
>
> wait(2) returns successfully, and
>
> before the longjmp gadgetry can be turned off (slow = FALSE),
> the signal handler IMMEDIATELY runs,
>
> longjmps back to the top of the setjmp block,
>
> and the PID that wait(2) returned is lost. rc loops forever calling
> wait(2) with no children, waiting for the lost PID to turn up.
Shouldn't the "interrupt_happened" flag prevent this?
> I've talked to others who have had different problems on other Linux
> installations, where caught signals do not interrupt system calls, as
> in BSD. This appears to be due to a difference of opinion between the
> libc and glibc people about how signals should behave, but I haven't
> investigated it myself.
It sounds like you're running RedHat 5.0 or some other distribution
which uses glibc. I haven't taken that step, partly on the "beware of
version X.0 of anything" principle and partly because I'm waiting until
I get a new machine. This sounds like another good reason to wait; I
have heard lots of complaints about problems with glibc.
The ultimate solution may be to move to the world of sigaction/sigblock
where those calls are available. The signal handling in rc is one of
the hairiest aspects of the code due to portability issues and race
conditions. Byron spent a lot of time (the change logs are full of
it) fixing race conditions and signal handling before he passed on the
torch. (BTW - Anyone heard from him recently?)
Tom