Hi Thomas :)

Quoting Thomas Schwinge (2014-09-12 09:16:32)
> In my QEMU Debian GNU/Hurd instance, I do some stuff (GDB testsuite, for
> example), and then issue a »sudo shutdown -r now«.  This doesn't
> complete, but hangs like this:
>
>       PID  UID   PPID  PGrp  Sess TH  Vmem   RSS %CPU     User   System Args
>     [...]
>     28386    0      1 28386 28386  2  147M 1.67M  0.0  0:00.00  0:00.01 /bin/sh /etc/init.d/rc 6
>     28391    0  28386 28386 28386  2  146M 1000K  0.0  0:00.01  0:00.00 /lib/startpar/startpar -p 4 -t 20 -T 3 -M stop -P 2 -R 6
>     28413    0  28391 28386 28386  2  147M 1.72M  0.0  0:00.00  0:00.00 /bin/sh /etc/init.d/sendsigs stop
>     28418    0  28413 28386 28386  2  146M  668K  0.0  0:00.00  0:00.00 sync
>     [...]
>
> The sync is hanging; confirmed with a manually run »syncfs -s« which also
> hangs.

Right, I also see this.

> What seems suspicious is the following:
>
>       814    0      5     1     1 15  141M  984K 99.8  0:00.23 11:47.19 /hurd/console
>
> ..., and indeed if I »kill -KILL« that one, some time later (?), the
> shutdown proceeds.

My /hurd/console isn't chewing CPU time, though I do see the kernel
complaining about it deallocating an invalid port.  I'll see if I can
track this down.
I do see some process crashing, however, and the crash server failing hard:

    Thread 1 (Thread 4931.1):
    #0  0x010b1a4c in mach_msg_trap ()
        at /usr/src/glibc-2.19/build-tree/hurd-i386-libc/mach/mach_msg_trap.S:2
    #1  0x010b222e in __mach_msg (msg=msg@entry=0x1604920, option=option@entry=3,
        send_size=send_size@entry=48, rcv_size=rcv_size@entry=32, rcv_name=65,
        timeout=timeout@entry=0, notify=notify@entry=0) at msg.c:110
    #2  0x0127a889 in __msg_sig_post (process=11, signal=6, sigcode=sigcode@entry=0, refport=1)
        at /usr/src/glibc-2.19/build-tree/hurd-i386-libc/hurd/RPC_msg_sig_post.c:143
    #3  0x010f04a9 in kill_port (refport=<optimized out>, msgport=<optimized out>)
        at ../sysdeps/mach/hurd/kill.c:67
    #4  kill_pid (pid=pid@entry=4931) at ../sysdeps/mach/hurd/kill.c:104
    #5  0x010f0761 in __kill (pid=4931, sig=sig@entry=6) at ../sysdeps/mach/hurd/kill.c:138
    #6  0x010efb94 in raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
    #7  0x010f2116 in abort () at abort.c:89
    #8  0x0112d05c in __libc_message (do_abort=do_abort@entry=2,
        fmt=fmt@entry=0x1210082 "*** %s ***: %s terminated\n")
        at ../sysdeps/posix/libc_fatal.c:175
    #9  0x011d44e0 in __fortify_fail (msg=msg@entry=0x121006a "stack smashing detected")
        at fortify_fail.c:31
    #10 0x011d449a in __stack_chk_fail () at stack_chk_fail.c:28
    #11 0x0804be88 in dump_core (task=2, signo=1410542606, sigcode=650000, sigerror=0)
        at ../../exec/elfcore.c:566
    #12 0x00000019 in ?? ()
    Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Curious, does your /servers/crash point to crash-dump-core as well?

By killing the crash server I managed to get my shutdown process going
again, though it got stuck again.  Using the kernel debugger (at this
point, sysvinit had successfully killed all my shells) I could see that
indeed another crash server had been spawned.

Cheers,
Justus