Bug#278724: [Sarge]: xserver-xfree86 freeze on errors from gv|xmms|lopster

retitle 278724 xserver-xfree86: [nv] system hang when using gv, xmms, or lopster on NV5 [Aladdin TNT2] rev 0x20
severity 278724 important
tag 278724 + moreinfo upstream

On Fri, Oct 29, 2004 at 12:58:09AM +0200, nasr.laili@tin.it wrote:
> While testing a few packages running under X I experienced a total
> freeze of my box: as this happened with xmms, lopster and - today - with
> gv, the doubt is coming out that it might depend on the X Window System,
> although I cannot tell for sure.
> In case you need any additional info I could obtain (which I'm not aware
> of at the moment) please feel free to ask.  May I add, however, that the
> same type of freeze happened with lopster (when I tried to use something
> not available) and with xmms, clicking on 'play' before choosing a file
> to play, so it should not be too hard to reproduce the error.

Thanks for your report.

We appreciate you sending along your XF86Config-4 file, but we need a
little more information.

1) Can you please mail <278724@bugs.debian.org> the contents of your
   /var/log/XFree86.0.log file?  A copy of the log file after the lockup
   has occurred (but before the X server has started again) would be most
   valuable.  If you run a display manager like gdm or xdm, this may be
   most easily obtained by rebooting into single-user mode after the box
   locks, and copying /var/log/XFree86.0.log to a safe location while in
   single-user mode.

2) Can you reproduce the problem with xserver-xfree86-dbg?  I am attaching
   a form letter that describes how to debug the X server.  This might
   enable us to track down the piece of code that is causing the lockup.
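
For item 1, the copy step could be sketched roughly as below.  This is only
an illustration: the helper name save_xlog, the default destination /root,
and the ".lockup" suffix are assumptions of mine, not anything the packages
provide.  Only the log path itself comes from this mail.

```shell
#!/bin/sh
# Hypothetical helper: copy the X server log somewhere it will survive the
# next X server start.  Both arguments are optional.
save_xlog() {
    src="${1:-/var/log/XFree86.0.log}"    # log path named in this mail
    destdir="${2:-/root}"                 # assumed "safe location"
    dest="$destdir/XFree86.0.log.lockup"  # hypothetical file name
    mkdir -p "$destdir" || return 1
    cp -p "$src" "$dest" || return 1      # -p keeps the original timestamp
    echo "$dest"                          # print where the copy went
}
```

Run "save_xlog" with no arguments from the single-user root shell, before
the X server is started again, and attach the file it names to your reply.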

[The following is a form letter.]

Can you reproduce the problem with xserver-xfree86-dbg?  Install the
package and tell debconf you want to use that X server.  Then restart the X
server and try to reproduce the bug (should be easy).  If it doesn't crash,
let us know.  If it does crash, become root, enable core dumps ("ulimit -c
unlimited" in bash), start the X server as root, and reproduce the crash:

# startx $(which x-terminal-emulator) -- :1

(If no X server is running at DISPLAY=:0, you can leave off the "-- :1".)

This will launch the X server running a lone terminal client with no window
manager.  Run the client that provokes the crash from the terminal prompt.
If the X server crashes, it should leave a core dump in /etc/X11.
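
To check whether a core file was actually written, something like the
following sketch may help.  The function name find_core is hypothetical, and
the "core.*" pattern assumes a kernel configured to append the process ID to
core file names; on many systems the file is simply called "core".

```shell
#!/bin/sh
# Hypothetical helper: list core files in a directory (default /etc/X11,
# the location mentioned above).  Returns non-zero if none are found.
find_core() {
    dir="${1:-/etc/X11}"
    found=1
    for f in "$dir"/core "$dir"/core.*; do
        if [ -f "$f" ]; then
            ls -l "$f"   # show size and timestamp of each core file
            found=0
        fi
    done
    return "$found"
}
```

If "find_core" prints nothing after a crash, the usual suspect is that
"ulimit -c unlimited" was not run in the same shell that started the X
server.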

We then run the GNU Debugger, GDB, on the core file and executable.  We're
interested in a backtrace of execution.  The X server has a signal handler
in it so it can do things like exit gracefully (restoring the text console,
and so forth), so we're not actually interested in all the stack frames --
just those "above" the signal handler.

Here's an example GDB session I logged after provoking an artificial server
crash (with "kill -SEGV").

  % gdb $(which XFree86-debug) core
  GNU gdb 6.1-debian
  Copyright 2004 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain conditions.
  Type "show copying" to see the conditions.
  There is absolutely no warranty for GDB.  Type "show warranty" for details.
  This GDB was configured as "i386-linux"...Using host libthread_db library "/lib/libthread_db.so.1".

  Core was generated by `/usr/X11R6/bin/X :1'.
  Program terminated with signal 6, Aborted.
  Reading symbols from /usr/lib/libfreetype.so.6...done.
  Loaded symbols for /usr/lib/libfreetype.so.6
  Reading symbols from /usr/lib/libz.so.1...done.
  Loaded symbols for /usr/lib/libz.so.1
  Reading symbols from /lib/libm.so.6...done.
  Loaded symbols for /lib/libm.so.6
  Reading symbols from /lib/libc.so.6...done.
  Loaded symbols for /lib/libc.so.6
  Reading symbols from /lib/ld-linux.so.2...done.
  Loaded symbols for /lib/ld-linux.so.2
  #0  0x400f2721 in kill () from /lib/libc.so.6
  (gdb) bt
  #0  0x400f2721 in kill () from /lib/libc.so.6
  #1  0x400f24c5 in raise () from /lib/libc.so.6
  #2  0x400f39e8 in abort () from /lib/libc.so.6
  #3  0x08464b8c in ddxGiveUp () at xf86Init.c:1173
  #4  0x08464c6b in AbortDDX () at xf86Init.c:1224
  #5  0x08508bd7 in AbortServer () at utils.c:436
  #6  0x0850a563 in FatalError (f=0x8a26ea0 "Caught signal %d.  Server aborting\n") at utils.c:1421
  #7  0x0847fbf5 in xf86SigHandler (signo=11) at xf86Events.c:1198
  #8  <signal handler called>
  #9  0x40199dd2 in select () from /lib/libc.so.6
  #10 0x401f8550 in ?? () from /lib/libc.so.6
  #11 0x400164a0 in ?? () from /lib/ld-linux.so.2
  #12 0xbffff8f0 in ?? ()
  #13 0x08502520 in WaitForSomething (pClientsReady=0xbffff944) at WaitFor.c:350
  #14 0x084cff54 in Dispatch () at dispatch.c:379
  #15 0x084e763c in main (argc=2, argv=0xbffffe04, envp=0xbffffe10) at main.c:469
  (gdb) bt full -7
  #9  0x40199dd2 in select () from /lib/libc.so.6
  No symbol table info available.
  #10 0x401f8550 in ?? () from /lib/libc.so.6
  No symbol table info available.
  #11 0x400164a0 in ?? () from /lib/ld-linux.so.2
  No symbol table info available.
  #12 0xbffff8f0 in ?? ()
  No symbol table info available.
  #13 0x08502520 in WaitForSomething (pClientsReady=0xbffff944) at WaitFor.c:350
          i = 2
          waittime = {tv_sec = 118, tv_usec = 580000}
          wt = (struct timeval *) 0xbffff910
          timeout = 599999
          standbyTimeout = 1199999
          suspendTimeout = 1799999
          offTimeout = 2399999
          clientsReadable = {fds_bits = {0 <repeats 32 times>}}
          clientsWritable = {fds_bits = {1, 146318192, -1073743800, 140704020, 147350456, 147350040, 2, 312, 1, 1075418973, -1073743800, 139461033, 147374816, 1, -1073743680, 9, 1073833120, -1073742332, 
      -1073743784, 139526463, 9, -1073743680, 1, 139458611, 147350456, 147350040, -1073743752, 139529154, 147339744, -1073743680, 1, 1074655182}}
          curclient = 147556952
          selecterr = 3
          nready = 0
          devicesReadable = {fds_bits = {1, 1, 6, 146327832, 147350508, 0, 315, 302, 9, 3, 315, 302, 9, 3, 0, 0, 146318192, 1075807568, -1073743880, 137843170, 146125816, 3, 313, 147556952, 0, 15066597, 3, 
      -1, 147350500, 1, 0, 146319268}}
          now = 16069
          someReady = 0
  #14 0x084cff54 in Dispatch () at dispatch.c:379
          clientReady = (int *) 0xbffff944
          result = 0
          client = 0x8c8c2e0
          nready = -1
          icheck = (HWEventQueuePtr *) 0x8b45c68
          start_tick = 940
  #15 0x084e763c in main (argc=2, argv=0xbffffe04, envp=0xbffffe10) at main.c:469
          i = 1
          j = 2
          k = 2
          error = -1073742332
          xauthfile = 0xbfffffba "/root/.Xauthority"
          alwaysCheckForInput = {0, 1}
  (gdb) quit

In the example above, you can see I used "bt full -7" to get the
"outermost" seven stack frames, complete with local variable information,
where available.
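
If you would rather not type the commands interactively, GDB can read them
from a file.  The sketch below writes such a file; the function name
write_bt_commands and the file name bt.gdb are hypothetical, and you may need
a number other than 7 depending on how many frames lie outside the signal
handler in your own backtrace.

```shell
#!/bin/sh
# Hypothetical helper: write the two GDB commands used in the session above
# to a command file (default name bt.gdb).
write_bt_commands() {
    cmdfile="${1:-bt.gdb}"
    printf '%s\n' 'bt' 'bt full -7' > "$cmdfile"
}

# Assumed usage, run as root in the directory holding the core file:
#   write_bt_commands bt.gdb
#   gdb -batch -x bt.gdb "$(which XFree86-debug)" core > backtrace.txt 2>&1
```

The resulting backtrace.txt can then be mailed to the bug address as is.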

If you could send us something similar, that would be very helpful.

