
Bug#469058: DF and signal handlers

Nikodemus Siivola wrote:
> On 3/5/08, Aurelien Jarno <aurelien at aurel32.net> wrote:
>> Nikodemus Siivola wrote:
>>> On 3/5/08, Debian Bug Tracking System <owner at bugs.debian.org> wrote:
>>  >
>>  >> tag 469058 + patch
>>  >>  Bug#469058: sbcl doesn't reset direction flag upon exit
>>  >>  There were no tags set.
>>  >>  Tags added: patch
>>  >
>>  > Thanks for the patch, but... while I agree that it is good to change
>>  > SBCL to reset the direction flag every time it is diddled, instead of
>>  > just before calling C, I don't think SBCL is actually at fault here.
>>  >
>>  >  1. SBCL does actually reset DF before any call to foreign (GCC generated) code.
>>  >     See line 236 in src/compiler/x86/c-call.lisp, and line 125 in
>>  >     src/runtime/x86-assem.S.
>>  >
>>  >     (It is possible I'm missing out a call-path here, but even so, read on and
>>  >     see if my fears are unfounded or not.)
>>  >
>>  >  2. If the problem was due to a foreign call, it should be deterministic.
>>  >
>>  >  3. If the problem was due to _returning_ to main(), it should be deterministic.
>> Looks correct.
>>  > What I suspect is actually going on (especially considering your
>>  > statement that compiling signals/ with 4.2 avoided the issue) is that
>>  > a signal handler is entered while DF is set.
>> What I am sure of is that sigemptyset() from glibc is called with the
>> direction flag set, and that should not happen.
> Right.
> I'm about to merge a patch to SBCL based on yours, which moves all DF
> resets to the immediate vicinity of STDs for easier auditing, and removes
> the then-unnecessary CLD instructions from foreign call sequences.
> This will fix the symptoms, and be good for SBCL, but I think the
> underlying problem is still there in signal handling. :/
>>  > If this is the case, then clearing it right after each REP loop where
>>  > SBCL uses it just makes seeing the bug much more unlikely -- but not
>>  > impossible in the presence of async signals.
>> Seems correct, though I have made half a dozen builds here without
>> any problem.
> That is not too surprising: there are normally no async signals
> delivered during the build, but SIGSEGV is a regular occurrence (it is
> used by the GC), so SIGSEGV handlers may have been seeing the DF set.
> What _is_ strange is that this appears to have been random. (At least
> all the reporters seemed to characterize it as semirandom behaviour.)
> Multiple builds from the same source with the same host compiler
> should have essentially identical GC characteristics.

Well, it may depend on the kernel. On one machine, it was hanging
randomly. On another machine, I get an error from the GC at the very
beginning of the build.

>>  > If so, this may also explain some _very_ hard to reproduce faults we
>>  > have seen over the years: using a libc compiled with a pre-4.3 GCC, a
>>  > signal at an inopportune moment in the middle of a REP loop could clear DF!
>>  > Yikes!
>>  >
>>  > I'm not sure what is The Right Thing here, though. Should SBCL (and
>>  > _any_ program that ever sets DF!) save, clear, and restore DF in its
>>  > signal handlers? Should libc/kernel do that? Should signals be blocked
>> I currently have no idea about that.
> I'll see if I can cook up a small test-case using async signals. (One
> that doesn't need SBCL so that it can be passed to upstream libc /
> kernel people if necessary without too much friction.)

A GCC developer says it's the job of the kernel. I doubt glibc can do
anything here, since it is the kernel that calls the signal handler.

  .''`.  Aurelien Jarno             | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32 at debian.org         | aurelien at aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net
