I've observed that most of these X server crashes have a stack backtrace
that looks like this:
#0 0x00007fde06264107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007fde062654e8 in __GI_abort () at abort.c:89
#2 0x00007fde089e49e1 in OsAbort () at ../../os/utils.c:1361
#3 0x00007fde0883f86e in ddxGiveUp (error=EXIT_ERR_ABORT) at ../../../../hw/xfree86/common/xf86Init.c:1088
#4 0x00007fde0883f996 in AbortDDX (error=EXIT_ERR_ABORT) at ../../../../hw/xfree86/common/xf86Init.c:1132
#5 0x00007fde089ee370 in AbortServer () at ../../os/log.c:783
#6 0x00007fde089ee8b1 in FatalError (f=0x7fde08a1de68 "Caught signal %d (%s). Server aborting\n") at ../../os/log.c:924
#7 0x00007fde089e1653 in OsSigHandler (signo=11, sip=0x7fff53cb6530, unused=0x7fff53cb6400) at ../../os/osinit.c:147
#8 <signal handler called>
#9 0x00007fde0889cd87 in xf86CursorSetCursor (pDev=0x7fde0a9bb260, pScreen=0x7fde0a8d2110, pCurs=0x7fde0ac7b280, x=772, y=596) at ../../../../hw/xfree86/ramdac/xf86Cursor.c:332
#10 0x00007fde0889c9e6 in xf86CursorEnableDisableFBAccess (pScrn=0x7fde0a894420, enable=1) at ../../../../hw/xfree86/ramdac/xf86Cursor.c:232
#11 0x00007fde00be4ef2 in ?? () from /usr/lib/xorg/modules/drivers/nvidia_drv.so
#12 0x00007fde00bdc791 in ?? () from /usr/lib/xorg/modules/drivers/nvidia_drv.so
#13 0x00007fde0883afdb in xf86VTEnter () at ../../../../hw/xfree86/common/xf86Events.c:581
#14 0x00007fde0883b0c8 in xf86VTSwitch () at ../../../../hw/xfree86/common/xf86Events.c:633
#15 0x00007fde0883a5ad in xf86Wakeup (blockData=0x0, err=-1, pReadmask=0x7fde08c85500 <LastSelectMask>) at ../../../../hw/xfree86/common/xf86Events.c:291
#16 0x00007fde087e5d0e in WakeupHandler (result=-1, pReadmask=0x7fde08c85500 <LastSelectMask>) at ../../dix/dixutils.c:423
#17 0x00007fde089d70f0 in WaitForSomething (pClientsReady=0x7fde0abc4dd0) at ../../os/WaitFor.c:229
#18 0x00007fde087d5edd in Dispatch () at ../../dix/dispatch.c:361
#19 0x00007fde087e4f70 in dix_main (argc=14, argv=0x7fff53cb6ef8, envp=0x7fff53cb6f70) at ../../dix/main.c:296
#20 0x00007fde087c5fc8 in main (argc=14, argv=0x7fff53cb6ef8, envp=0x7fff53cb6f70) at ../../dix/stubmain.c:34
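For anyone reading the trace: the frame directly below "<signal handler called>" is where the signal was raised, so the fault here is in frame #9 (xf86CursorSetCursor), and signo=11 in frame #7 is SIGSEGV. Below is a minimal Python sketch that pulls that frame out of a saved gdb backtrace. The helper name faulting_frame and the regular expression are my own and are not part of gdb or Xorg. The sample is a shortened excerpt of the trace above.

```python
import re

# Matches the start of a gdb frame line: "#N", an optional "0xADDR in ", then a token.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+)")

def faulting_frame(backtrace):
    """Return (frame_number, function_name) for the first frame listed
    after '<signal handler called>', or None if that marker is absent."""
    seen_handler = False
    for line in backtrace.splitlines():
        line = line.strip()
        match = FRAME_RE.match(line)
        if not match:
            continue  # not a frame line
        if "<signal handler called>" in line:
            seen_handler = True
            continue
        if seen_handler:
            return int(match.group(1)), match.group(2)
    return None

# Shortened excerpt of the backtrace above (arguments elided for brevity).
SAMPLE = """\
#7 0x00007fde089e1653 in OsSigHandler (signo=11) at ../../os/osinit.c:147
#8 <signal handler called>
#9 0x00007fde0889cd87 in xf86CursorSetCursor () at ../../../../hw/xfree86/ramdac/xf86Cursor.c:332
#10 0x00007fde0889c9e6 in xf86CursorEnableDisableFBAccess () at ../../../../hw/xfree86/ramdac/xf86Cursor.c:232
"""

if __name__ == "__main__":
    print(faulting_frame(SAMPLE))
```

Running this over several saved crash dumps is a quick way to check whether they all fault in the same function.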
I don't know what the closed-source nvidia_drv.so does in frames #11 and #12.
But I applied the appended brute-force patch to the function in frame #10
(xf86CursorEnableDisableFBAccess) to see what happens and, lo and behold,
no crashes after switching users a hundred times and two days of normal work!
This patch may introduce a small memory leak - I don't know. But the
machine doesn't freeze any more!
@Aaron: Do you still think this is a bug in Xorg?