
Re: trivfs



> Shell 1:
> neal@hurd:~/test $ settrans -ac tt trivial_translator
> neal@hurd:~/test $ echo "foo" > tt
> neal@hurd:~/test $ settrans -g tt
> Hang!

But can you C-z settrans?  What if you start it with & and then use ps to
see what is happening?  As a rule (this should be in a FAQ if it's not), if
ps hangs then C-c it and add -M to the switches.
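
A minimal sketch of that routine, with `sleep 30' standing in for the
command that may hang (in this thread, `settrans -g tt').  The `ps -p PID'
form is the POSIX spelling and is an assumption here; the variable names are
made up for illustration.

```shell
#!/bin/sh
# Stand-in for the command that may hang; substitute the real one.
sleep 30 &
job_pid=$!

# The shell is still usable, so inspect the job instead of waiting on it.
# On the Hurd, interrupt ps and add -M if ps itself hangs.
ps -p "$job_pid" || true

# kill -0 delivers no signal; it only checks that the process still exists.
if kill -0 "$job_pid" 2>/dev/null; then
    job_state=running
else
    job_state=gone
fi
echo "job $job_pid is $job_state"

# Clean up the stand-in job.
kill -KILL "$job_pid"
wait "$job_pid" 2>/dev/null || true
```

Because the job runs in the background, the prompt comes back at once and
the same shell can still be used to look around.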

> Shell 2:
> neal@hurd:~ $ kill 52
> Hang!
> 
> No control c, no control z.

Ah yes.  This is a known problem that I should fix (probably will in glibc
2.2): sending a signal (other than SIGKILL) to a botched process
(precisely, one whose signal thread is not responding to the msg_sig_post
RPC) will wedge the caller of `kill' (the function, in this case that's
/bin/kill or bash) so it can't be interrupted properly.  You can always use
kill -9 (SIGKILL), because that terminates the task directly rather than 
sending it a message.  Also, once you have terminated the offending
process, anyone blocked in kill ought to wake up.
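
The difference can be seen in a rough sketch that runs in any POSIX shell.
A child that ignores SIGTERM stands in for the botched process.  This is
only an analogy: it does not reproduce an unresponsive signal thread or the
wedged sender, it just shows that SIGTERM needs the target's cooperation and
SIGKILL does not.

```shell
#!/bin/sh
# Child that ignores SIGTERM, standing in for a process that will not
# act on ordinary signals.
sh -c 'trap "" TERM; while :; do sleep 1; done' &
pid=$!
sleep 1                       # give the child time to install its trap

kill -TERM "$pid"             # ordinary signal: the child ignores it
sleep 1
if kill -0 "$pid" 2>/dev/null; then
    term_survived=yes
else
    term_survived=no
fi

kill -KILL "$pid"             # SIGKILL cannot be caught or ignored
status=0
wait "$pid" 2>/dev/null || status=$?

# A wait status above 128 means the child was terminated by a signal.
echo "survived SIGTERM: $term_survived, wait status after SIGKILL: $status"
```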

> Shell 3:
> I had an extra terminal open this time and was able to
> kill the login shells and then:
> neal@hurd:~ $ kill -9 52
> neal@hurd:~ $

This was the right thing to do.  Always have an extra terminal, and use ps
to examine the state of the world from your extra terminal while things are
hung.

> Reading symbols from /lib/ld.so...done.
> 
> [1]+  Stopped                 gdb trivial_translator 52

That is gdb crashing.  If you do a `jobs', it ought to show you the actual
signal on which it stopped (the fact that it doesn't here is a bash
nuisance, unless it really was SIGTSTP which seems unlikely).  Which gdb
version is it?  I believe we are usually harassing Mark Kettenis about
gdb, so maybe he can look into this for you.  To see where it is crashing,
you can always gdb gdb and attach to the crashing gdb process.

If you want first to work around the gdb problem, then here are some things
to try.  Start gdb with only one argument, so it doesn't attach
immediately.  Then do `set auto-solib-add 0' and then do `attach PID'.
Hopefully you will be able to look at the state a little, at least to do
`info registers' and `backtrace'.  You can then do `info shared' and `shared LIB'
to load symbols of the shared libraries you need one or a few at a time, on
the assumption that something about loading their symbols is what is crashing.
(It's also the case that gdb will behave a bit differently as to what
inferior memory references it tries to make depending on whether or not it has
symbols loaded for the spot where the inferior crashes, so it might be one
of these references that causes gdb to die and so you can see the state
without symbols and not crash.)
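
One way to keep that sequence repeatable is to put it in a gdb command file
and pass it with -x.  The sketch below only writes the file and prints the
command to run.  PID 52 and the program name are taken from the session
quoted in this thread; the file name is made up for illustration.

```shell
#!/bin/sh
pid=52                        # PID from the quoted session; use your own
cmds=attach-nosyms.gdb        # hypothetical file name

# Turn off automatic shared-library symbol loading before attaching, then
# look at the basic state and list the libraries that could be loaded later.
cat > "$cmds" <<EOF
set auto-solib-add 0
attach $pid
info registers
backtrace
info sharedlibrary
EOF

# gdb gets only the program name here, so it does not attach on startup;
# the attach happens from the command file.
echo "run: gdb -x $cmds trivial_translator"
```

If that much works, symbols can then be loaded by hand with `shared LIB',
one library at a time, until the one that makes gdb crash is found.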

> neal@hurd:~/test $ fg
> gdb trivial_translator 52
> ---Type <return> to continue, or q <return> to quit---
> 
> GDB does not respond to either a return, a `q', a control c or
> control z; It must be killed from another shell.

In all cases of wedgitude, please show us the output of ps, ps -l, and ps -lT
on the offending process(es).  It may or may not be enlightening.

