
Bug#221969: Newer libc6.1 cannot correctly return from signal handlers



severity 221969 grave
thanks

[ please excuse if this note is duplicated ]

Greetings!  GCL and its compiled programs (maxima, acl2, axiom) can
make use of a more efficient garbage collection algorithm by
mprotecting certain pages read only, trapping segmentation faults, and
then reprotecting the pages read write as necessary.  This has been
working on alpha for a long time.  acl2 in particular was last built
on 20021210 on lully, successfully using this mechanism.  I've made
some minor modifications to the acl2 package, and now the build
breaks, apparently on alpha only.

Debugging on escher shows that the stack is apparently damaged in such
a way that the program cannot determine where to return after the
signal handler.  In particular, the return lands one stack level up
from where the segmentation fault was caught.

Here is a gdb session:

=============================================================================
Starting program: /home/camm/acl2-2.7/raw_nsaved_acl2 /home/camm/gcl-2.6.1/unixport/ <init_nsaved_acl2.lsp
GCL (GNU Common Lisp)  April 1994  131072 pages
Building symbol table for /home/camm/acl2-2.7/raw_nsaved_acl2 ..
loading /home/camm/gcl-2.6.1/unixport/../lsp/gcl_export.lsp
...
[GC for 5242 RELOCATABLE-BLOCKS pages..(T=764).GC finished]
[SGC on]
Program received signal SIGSEGV, Segmentation fault.
0x0000000120667b8c in LI20 (V83=0x120d74570) at gcl_sloop.c:530
530		bds_bind(VV[34],(V83));
(gdb) bt 5
#0  0x0000000120667b8c in LI20 (V83=0x120d74570) at gcl_sloop.c:530
#1  0x0000000120666ad8 in L7 () at gcl_sloop.c:230
#2  0x00000001204eb984 in funcall (fun=0x120b51380) at eval.c:173
#3  0x00000001205336ac in IapplyVector (fun=0x120b51380, nargs=2, 
    base=0x12084ea10) at nfunlink.c:239
#4  0x00000001204ef2f8 in fLfuncall (fun=0x120b51380) at eval.c:1140
(More stack frames follow...)
(gdb) up
#1  0x0000000120666ad8 in L7 () at gcl_sloop.c:230
230		base[3]= (*(LnkLI247))(base[2]);
(gdb) p base
$35 = (object *) 0x12084ea10
(gdb) p &base
$36 = (object **) 0x11fffe6f8
(gdb) c
Continuing.

Breakpoint 1, memprotect_handler (sig=11, code=4831830752, scp=0x11fffe360, 
    addr=0x0) at sgbc.c:1404
1404	  int j=page_multiple;
(gdb) p *(object **) 0x11fffe6f8
$37 = (union lispunion **) 0x12084ea10
(gdb) display *(object **) 0x11fffe6f8
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) n
1408	  faddr=GET_FAULT_ADDR(sig,code,scp,addr); 
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) 
1409	  debug_fault = (long) faddr;
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) 
1413	  if (faddr >= core_end || (unsigned long)faddr < DBEGIN) {
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) 
1422	  p = page(faddr);
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) 
1423	  p = ROUND_DOWN_PAGE_NO(p);
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) 
1424	  if (p >= first_protectable_page
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) 
1433	    mprotect(pagetochar(p),page_multiple * PAGESIZE, PROT_READ_WRITE);
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) 
1434	    while (--j >= 0)
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) p j
$38 = 1
(gdb) n
1435	      sgc_type_map[p+j] = sgc_type_map[p+j] | SGC_TEMP_WRITABLE;
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) 
1434	    while (--j >= 0)
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) 
1457	}
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) bt
#0  memprotect_handler (sig=11, code=4831830752, scp=0x11fffe360, addr=0x0)
    at sgbc.c:1457
#1  0x000002000022c858 in __open_catalog () from /lib/libc.so.6.1
#2  0x0000000120666ad8 in L7 () at gcl_sloop.c:230
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x0000000120666ae4 in L7 () at gcl_sloop.c:230
230		base[3]= (*(LnkLI247))(base[2]);
1: *(union lispunion ***) 4831831800 = (union lispunion **) 0x12084ea10
(gdb) bt
#0  0x0000000120666ae4 in L7 () at gcl_sloop.c:230
(gdb) p &base
$39 = (object **) 0x12080b74c
(gdb) p base[0]
Cannot access memory at address 0x20add6c800000000
(gdb) 
=============================================================================

In fact, one can also simply see this by running the same gcl image
under the stable and unstable environments on escher.  The command
(si::sgc-on t) succeeds in the former but fails in the latter,
reporting most commonly a segmentation fault but also occasionally an
illegal instruction.

One can also see this by downloading the most recent certified build
of acl2 in unstable and testing, and running it in escher's unstable
build environment.  It crashes on startup with an (incorrectly caught)
segmentation fault.

Take care,


-- System Information:
Debian Release: testing/unstable
Architecture: alpha
Kernel: Linux escher 2.4.24escher #1 Tue Jan 13 16:41:32 CET 2004 alpha
Locale: LANG=C, LC_CTYPE=C

Versions of packages libc6.1 depends on:
ii  libdb1-compat                 2.1.3-7    The Berkeley database routines [gl

-- no debconf information
-- 
Camm Maguire			     			camm@enhanced.com
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah


