
Re: libc6.1 signal handling bug on alpha



Greetings!

Paul Slootman <paul@debian.org> writes:

> On Tue 24 Feb 2004, Camm Maguire wrote:
> 
> > Greetings!  Just wondering if I should cripple gcl/maxima/acl2/axiom
> > on alpha to work around this bug, or if it's going to get fixed
> > sometime soon.....
> 
> Don't hold your breath... :-(

:-)  Seriously though, this is a pity, as it's a major bug that was
introduced only recently, and only on alpha.  I like alpha.  I think
it should be one of the strongest Debian options.  But there don't
appear to be even as many users as m68k :-(.

> I don't expect that a kernel change would be responsible for this;
> my alpha is still running 2.4.19-rc2 (the latest at the time of last
> boot :-) and I guess you see the same problem there. Perhaps some libc
> change?

Yes, this is a libc6.1 issue only.  The signal is being delivered
(kernel), but sigaction (libc6.1) can't set up the handler to return to
the right place.
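
For anyone who wants to check this outside GCL, here is a minimal
sketch in C of the mechanism described in the quoted report below:
write-protect a page with mprotect, catch the SIGSEGV in a handler
installed through sigaction, unprotect the page in the handler, and
return so that the faulting store is retried.  This is not GCL's code;
the names on_segv and run_write_barrier_demo and the single-page setup
are assumptions made for illustration only.

```c
#define _DEFAULT_SOURCE           /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *page;                /* hypothetical protected page */
static long page_size;
static volatile sig_atomic_t fault_count;

/* Handler: if the fault is inside our page, make the page writable
   again and return.  The faulting store is then restarted.  mprotect
   is not on the POSIX async-signal-safe list, but this style of
   write barrier relies on it working here in practice. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    char *addr = (char *)info->si_addr;

    (void)sig;
    (void)ctx;
    if (addr < page || addr >= page + page_size)
        _exit(2);                 /* a fault we did not provoke */
    fault_count++;
    if (mprotect(page, (size_t)page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(3);
}

/* Returns the number of faults taken (1 if signal return works),
   or -1 if the setup itself failed. */
static int run_write_barrier_demo(void)
{
    struct sigaction sa;

    page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);

    page = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;     /* ask for si_addr in the handler */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;

    fault_count = 0;
    if (mprotect(page, (size_t)page_size, PROT_READ) != 0)
        return -1;

    /* This store faults once; the handler unprotects the page and
       the store is retried after the handler returns. */
    *(volatile char *)page = 'x';

    if (*(volatile char *)page != 'x')
        return -1;
    munmap(page, (size_t)page_size);
    return (int)fault_count;
}
```

Calling run_write_barrier_demo() from main should return 1 on a
system where returning from the handler works.  On a system showing
the bug I would expect it to crash on or after the return from
on_segv instead, though I have not run this sketch on an affected
alpha.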

Take care,

> 
> Paul Slootman
> 
> > =============================================================================
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=221969
> > =============================================================================
> > 
> >     * To: paul@debian.org, debian-alpha@lists.debian.org,camm@m.enhanced.com
> >     * Subject: Apparent signal handling errors recently introduced
> >     * From: Camm Maguire <camm@m.enhanced.com>
> >     * Date: Fri, 06 Feb 2004 12:15:02 -0500
> >     * Message-id: <E1Ap9ZG-0006Y9-00@wisdom.m.enhanced.com>
> >     * Old-return-path: <camm@m.enhanced.com>
> > 
> > Greetings!  GCL and its compiled programs (maxima, acl2, axiom) can
> > make use of a more efficient garbage collection algorithm by
> > mprotecting certain pages read only, trapping segmentation faults, and
> > then reprotecting the pages read write as necessary.  This has been
> > working on alpha for a long time.  acl2 in particular was last built
> > on 20021210 on lully successfully using this mechanism.  I've made
> > some minor modifications to the acl2 package, and now the build breaks
> > on alpha only, apparently.
> > 
> > Debugging on escher shows that the stack is apparently damaged in such
> > a way that the program cannot determine where to return after the signal
> > handler; in particular, the return lands one stack level up from where
> > the segmentation signal was caught.  Is there any recent kernel change
> > that could have caused this?
> > 
> > Here is a gdb session:
> > 
> > Starting program: /home/camm/acl2-2.7/raw_nsaved_acl2 /home/camm/gcl-2.6.1/unixport/ <init_nsaved_acl2.lsp
> > GCL (GNU Common Lisp)  April 1994  131072 pages
> > Building symbol table for /home/camm/acl2-2.7/raw_nsaved_acl2 ..
> > loading /home/camm/gcl-2.6.1/unixport/../lsp/gcl_export.lsp
> > ...
> > [GC for 5242 RELOCATABLE-BLOCKS pages..(T=764).GC finished]
> > [SGC on]
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x0000000120667b8c in LI20 (V83=0x120d74570) at gcl_sloop.c:530
> > 530		bds_bind(VV[34],(V83));
> > (gdb) bt 5
> > #0  0x0000000120667b8c in LI20 (V83=0x120d74570) at gcl_sloop.c:530
> > #1  0x0000000120666ad8 in L7 () at gcl_sloop.c:230
> > #2  0x00000001204eb984 in funcall (fun=0x120b51380) at eval.c:173
> > #3  0x00000001205336ac in IapplyVector (fun=0x120b51380, nargs=2, 
> >     base=0x12084ea10) at nfunlink.c:239
> > #4  0x00000001204ef2f8 in fLfuncall (fun=0x120b51380) at eval.c:1140
> > (More stack frames follow...)
> > (gdb) up
> > #1  0x0000000120666ad8 in L7 () at gcl_sloop.c:230
> > 230		base[3]= (*(LnkLI247))(base[2]);
> > (gdb) p base
> > $35 = (object *) 0x12084ea10
> > (gdb) p &base
> > $36 = (object **) 0x11fffe6f8
> > (gdb) c
> > Continuing.
> > 
> > Breakpoint 1, memprotect_handler (sig=11, code=4831830752, scp=0x11fffe360, 
> >     addr=0x0) at sgbc.c:1404
> > 1404	  int j=page_multiple;
> > (gdb) p *(object **) 0x11fffe6f8
> > $37 = (union lispunion **) 0x12084ea10
> > (gdb) display *(object **) 0x11fffe6f8
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) n
> > 1408	  faddr=GET_FAULT_ADDR(sig,code,scp,addr); 
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) 
> > 1409	  debug_fault = (long) faddr;
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) 
> > 1413	  if (faddr >= core_end || (unsigned long)faddr < DBEGIN) {
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) 
> > 1422	  p = page(faddr);
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) 
> > 1423	  p = ROUND_DOWN_PAGE_NO(p);
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) 
> > 1424	  if (p >= first_protectable_page
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) 
> > 1433	    mprotect(pagetochar(p),page_multiple * PAGESIZE, PROT_READ_WRITE);
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) 
> > 1434	    while (--j >= 0)
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) p j
> > $38 = 1
> > (gdb) n
> > 1435	      sgc_type_map[p+j] = sgc_type_map[p+j] | SGC_TEMP_WRITABLE;
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) 
> > 1434	    while (--j >= 0)
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) 
> > 1457	}
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) bt
> > #0  memprotect_handler (sig=11, code=4831830752, scp=0x11fffe360, addr=0x0)
> >     at sgbc.c:1457
> > #1  0x000002000022c858 in __open_catalog () from /lib/libc.so.6.1
> > #2  0x0000000120666ad8 in L7 () at gcl_sloop.c:230
> > (gdb) c
> > Continuing.
> > 
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x0000000120666ae4 in L7 () at gcl_sloop.c:230
> > 230		base[3]= (*(LnkLI247))(base[2]);
> > 1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
> > (gdb) bt
> > #0  0x0000000120666ae4 in L7 () at gcl_sloop.c:230
> > (gdb) p &base
> > $39 = (object **) 0x12080b74c
> > (gdb) p base[0]
> > Cannot access memory at address 0x20add6c800000000
> > (gdb) 
> 
> 
> 

-- 
Camm Maguire			     			camm@enhanced.com
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah


