
Re: "rc" shell maintainer?



On 05 Feb 1998 05:01:30 GMT, smarry@pantransit.smar.reptiles.org wrote:
> Hello, this is Marc Moorcroft.  I joined the list shortly after
> running into two problems with rc-1.5b2 under Linux.
> 
> One definite bug (reported to Tim Goodwin) will cause rc to spin calling
> wait(2) if you do:
> 
> {ls & wait} | cat
> 
> There is another problem with signal handling that is a little more
> complicated.  When I upgraded to the 2.0.33 kernel, rc began hanging
> occasionally when I interrupted programs, and when I finally got irritated
> enough to check it thoroughly, I found that it failed trip.rc at:
> 
> kill -2 $pid
> 
> The relevant code is rc_wait() in wait.c:
> 
> static pid_t rc_wait(int *stat) {
> 	int r;
> 	interrupt_happened = FALSE;
> 	if (!setjmp(slowbuf.j)) {
> 		slow = TRUE;
> 		if (!interrupt_happened)
> 			r = wait(stat);
> 		else
> 			r = -1;
> 	} else
> 		r = -1;
> 	slow = FALSE;
> 	return r;
> }
> 
> It appears that some of the time, Linux will return from the wait(2)
> for the 'kill' process before the signal gets delivered.  On Linux
> installations where signal(2) has the System V behaviour (system calls
> are interrupted for signals that are caught via signal(2)) rc longjmps
> out of the signal handler (a rather alarming practice in itself) to the
> top of the enclosing code in rc_wait().  The sequence of events appears
> to be:
> 
> 	The signal is sent,
> 
> 	the process exits,
> 
> 	wait(2) returns successfully, and
> 
> 	before the longjmp gadgetry can be turned off (slow = FALSE),
> 	the signal handler IMMEDIATELY runs,
> 
> 	longjmps back to the top of the setjmp block,
> 
> and the PID that wait(2) returned is lost.  rc loops forever calling
> wait(2) with no children, waiting for the lost PID to turn up.

Shouldn't the "interrupt_happened" flag prevent this?

> I've talked to others who have had different problems on other Linux
> installations, where caught signals do not interrupt system calls, as
> in BSD.  This appears to be due to a difference of opinion between the
> libc and glibc people about how signals should behave, but I haven't
> investigated it myself.

It sounds like you're running RedHat 5.0 or some other distribution
which uses glibc.  I haven't taken that step, partly on the "beware of
version X.0 of anything" principle and partly because I'm waiting until
I get a new machine.  This sounds like another good reason to wait; I've
heard lots of complaints about problems with glibc.

The ultimate solution may be to move to the world of sigaction/sigblock
where those calls are available.  The signal handling in rc is one of
the hairiest aspects of the code due to portability issues and race
conditions.  Byron spent a lot of time (the change logs are full of
it) fixing race conditions and signal handling before he passed on the
torch.  (BTW - Anyone heard from him recently?)

Tom

