
Re: Still want to believe.



Uhm, this mail was sitting in my outbox for a couple of months. Perhaps
somebody else can make something of it; this start-stop-daemon bug
is really annoying...

On Thu, Nov 20, 2003 at 06:56:44PM +0100, Santiago Vila wrote:
> > sshd dumps a core
> 
> Not for me. I got segfaults from dpkg trying to upgrade it from a
> previous version. I wrote a dummy start-stop-daemon which did
> nothing, like this: "#!/bin/sh", put it in /local/bin, and managed to
> upgrade it. After this it has always worked flawlessly.

This is not only a problem with ssh; it generally happens when --pidfile
points to a file containing a nonexistent PID:

raptor:/home/mbanck# echo 53 > /var/run/true.pid
raptor:/home/mbanck# /sbin/start-stop-daemon --start --quiet --pidfile /var/run/true.pid --exec /usr/bin/true 
Segmentation fault

This is the backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x0102c593 in proc_stat_list_pid_proc_stat () from /lib/libps.so.0.3
(gdb) bt full
#0  proc_stat_list_pid_proc_stat (pp=0x0, pid=53) at ../../libps/proclist.c:199
        nprocs = 16874284
        procs = (struct proc_stat **) 0x35
#1  0x08049bad in pid_is_cmd (pid=0, name=0x0) at ../../utils/start-stop-daemon.c:663
        pstat = (struct proc_stat *) 0x35
#2  0x08049d1c in check (pid=53) at ../../utils/start-stop-daemon.c:705
No locals.
#3  0x08049d8e in do_pidfile (name=0x1018063 "/var/run/true.pid") at ../../utils/start-stop-daemon.c:716
        f = (FILE *) 0x804c8d0
        pid = 53
#4  0x08049edb in do_findprocs () at ../../utils/start-stop-daemon.c:935
No locals.
#5  0x0804a591 in main (argc=0, argv=0x1017bf4) at ../../utils/start-stop-daemon.c:1176
        devnull_fd = -1
        tty_fd = -1
(gdb) p pp
$12 = (struct proc_stat_list *) 0x0
(gdb) p *pp
Cannot access memory at address 0x0

Unlike on Linux, sshd on the Hurd exits after being started at bootup. I
got this in the syslog:

Nov 23 02:21:54 raptor sshd[100]: socket: Protocol family not supported
Nov 23 02:21:54 raptor sshd[100]: debug1: Bind to port 22 on 0.0.0.0.
Nov 23 02:21:54 raptor sshd[100]: Server listening on 0.0.0.0 port 22.

PID 100 no longer exists, though (I can still ssh in fine). On the other
hand, /var/run/sshd.pid still says '100'.
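
For illustration, here is how a stale pidfile like this one could be
detected portably. This is only a minimal sketch, assuming kill() with
signal 0 behaves on the Hurd as POSIX describes (I have not verified
that); read_pidfile and pid_is_running are hypothetical helper names and
are not taken from start-stop-daemon.c:

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: return the PID stored in `path`,
 * or -1 if the file cannot be read or holds no positive number. */
pid_t read_pidfile(const char *path)
{
    FILE *f = fopen(path, "r");
    long pid;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &pid) != 1 || pid <= 0)
        pid = -1;
    fclose(f);
    return (pid_t)pid;
}

/* Hypothetical helper: 1 if a process with this PID exists, else 0.
 * Signal 0 delivers nothing; kill() only performs its checks. */
int pid_is_running(pid_t pid)
{
    if (pid <= 0)
        return 0;
    if (kill(pid, 0) == 0)
        return 1;
    /* EPERM: the process exists but we may not signal it.
     * ESRCH: no such process, i.e. the pidfile is stale. */
    return errno == EPERM;
}
```

With the sshd case above, pid_is_running(read_pidfile("/var/run/sshd.pid"))
would be expected to return 0 once PID 100 is gone.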

It seems start-stop-daemon does not set up 'pp' correctly. I tried
various fixes, but wasn't really sure what the correct behaviour should
be; start-stop-daemon.c is too sparsely documented for me
:-/
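
To make the failure concrete: frame #0 of the backtrace is entered with
pp=0x0, so any access through it faults. Below is a minimal sketch of a
defensive lookup. The types and the function (fake_proc_stat,
fake_proc_stat_list, lookup_pid) are made-up stand-ins, not the real
libps structures or API. Whether the right fix is a guard like this or
making sure the list is initialised before check() runs, I cannot say:

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-ins for illustration only; NOT the libps types. */
struct fake_proc_stat {
    pid_t pid;
};

struct fake_proc_stat_list {
    size_t nprocs;
    struct fake_proc_stat **procs;
};

/* Return the entry for `pid`, or NULL if it is not in the list.
 * A NULL list is treated as "no such process" instead of being
 * dereferenced, which is what crashes in the backtrace. */
struct fake_proc_stat *
lookup_pid(const struct fake_proc_stat_list *pp, pid_t pid)
{
    size_t i;

    if (pp == NULL || pp->procs == NULL)
        return NULL;
    for (i = 0; i < pp->nprocs; i++) {
        if (pp->procs[i] != NULL && pp->procs[i]->pid == pid)
            return pp->procs[i];
    }
    return NULL;
}
```

A caller would then report "not running" when lookup_pid() returns NULL,
rather than segfaulting on an uninitialised list.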

Apparently somebody already took a look at this, as the code in question
is clearly Hurd-specific and inside a #ifdef(OSHURD). The ChangeLog
does not seem to cover this. Does anybody know what's going on and what
should be done?


Michael


