Re: The crippled resurrection of said etch.
On Thu, Oct 26, 2006 at 10:25:37AM -0400, Matthew Krauss wrote:
> Hi, I haven't been following this too closely so I may be missing
> something, but this message caught my eye. I'm not sure how experienced
> you are, so I will try to be very explicit -- if I tell you things you
> think are obvious, please forgive me.
Thanks. No forgiveness needed. If anything, I appreciate this level of
detail. I've found that, when explaining things to others, it's hard to
guess the proper level of detail.
> email@example.com wrote:
> >On Wed, Oct 25, 2006 at 12:34:02PM -0700, Andrew Sackville-West wrote:
> >>firstname.lastname@example.org wrote:
> >>>Four more reboots, one successful.
> >>>It seems to be a problem starting gdm.
> Hmm... It sounds like a race condition, obviously.
I *thought* that was a possibility.
> From what I have
> read in this thread, I would guess that there is a very good probability
> that you have an old startup script lying around from a package that
> has been otherwise removed or upgraded.
Is there any way to search for such stray files? There were some bugs
in upgrade scripts a few months ago when X underwent two revolutions in a
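
One partial answer to the stray-files question, as a sketch rather than a
tested recipe: packages that were removed but not purged keep their
configuration files, init scripts included, and "dpkg -l" lists them with
status "rc". The helper below (leftover_config is a name made up here)
filters that listing down to the package names.

```shell
# leftover_config: read `dpkg -l` output on stdin and print the names of
# packages in state "rc" (removed, configuration files still present).
leftover_config() {
    awk '$1 == "rc" { print $2 }'
}
```

Run it as "dpkg -l | leftover_config". This only covers files that dpkg
still knows about; for a single suspicious file, "dpkg -S /path/to/file"
reports which package, if any, claims it.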
> >>it could be an X problem or a gdm problem, but probably I'd guess X.
> >>>It tells me it's starting gdm,
> >>>then that it's not starting kdm because it's not the default,
> >>>then that it's not starting (presumably another *dm) because it's
> >>>not the default
> >>that's normal. X does sanity checks to make sure you're not starting more
> >>than one session manager or whatever.
> >I know that. I just thought that the last message before the crash
> >might be a clue to what went wrong -- such as an unfortunate race
> >condition between gdm and whatever thing decides not to start the other.
> >But I admit this is unlikely.
I thought it unlikely because, as far as I know, these *dm startup
scripts check whether they are default *before* they start anything up.
> >>>then the black screen of death, preventing me from reading which other
> >>>*dm it was considering.
> >>are you locked up hard at that point or can you switch to a vt?
> >Locked up hard. Though I suppose I should try ssh-ing in.
> When you say "black screen of death" I assume you mean a kernel panic?
The screen goes completely black. No text visible.
If I recall correctly, a kernel panic usually puts a kernel panic
message on the bottom of the screen. But of course, perhaps it's not
displaying the kernel logging screen when it dies.
> If so, ssh-ing won't work.
Therefore worth a try. Give us a further clue whether it might be a
> Also, notably, a kernel panic should *never*
> happen (theoretically!) -- it is always the result of either a kernel
> bug or a hardware failure. No user-space program should be able to
> cause a kernel panic.
> What I would try to isolate the problem is:
> 1. Reboot in to single user mode.
Which I do by specifying "etch 1" at the lilo boot prompt.
It works. The on-screen messages call it maintenance mode, though.
I presume that's the same mode.
> 2. Log in as root.
Will do the rest later in the day when my users are gone.
> 3. Try starting X alone:
> $ X 2>&1 | less
> 3a. If X starts, you may kill it with ctrl-alt-backspace;
> 3b. If X does not start, you have the output to debug;
> 3c. If you get a kernel panic, you know you have serious X problems.
> 4. Next try starting gdm directly:
> $ /etc/init.d/gdm start
> 4a. If gdm starts, there is probably a problem in your startup scripts;
> 4b. If gdm does not start, you can check the logs under /var/log/gdm/
> 4c. If you get a kernel panic, you know you have serious gdm problems.
> In the case of 4a., where you have a problem in your startup scripts:
> 5. Kill gdm -- use ctrl-alt-F1 to return to your terminal, and issue:
> $ /etc/init.d/gdm stop
> 6. Switch to the default runlevel's rc directory and ls it:
> $ cd /etc/rc2.d
> $ ls
> See all the links named S##*
> .. where ## is a number
> .. and * is the rest of the name?
> At startup, these are all started in the order of the ## numbers.
> Scripts with the same number as gdm start at the same time.
> These are good candidates for a race condition.
> For instance, I have:
> You probably have all of these, plus:
> .. and others?
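
To pick out the links that share gdm's number without reading the whole
listing by eye, something like the following might do. It is only a
sketch: same_seq is a hypothetical helper name, and it assumes the usual
S##name link naming described above.

```shell
# same_seq DIR NAME: print every S## link in DIR that has the same
# two-digit sequence number as the link for NAME (e.g. S99gdm).
same_seq() {
    dir=$1
    name=$2
    for link in "$dir"/S[0-9][0-9]"$name"; do
        # Skip the unexpanded pattern when NAME has no link in DIR.
        [ -e "$link" ] || [ -L "$link" ] || continue
        base=${link##*/}      # strip the directory part
        seq=${base#S}         # drop the leading S
        seq=${seq%"$name"}    # drop the name, leaving the two digits
        for peer in "$dir"/S"$seq"*; do
            echo "${peer##*/}"
        done
    done
}
```

For the case discussed here the call would be "same_seq /etc/rc2.d gdm".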
> 7. Try starting up the scripts with the same number as gdm in various
> orders. Consider which ones sound likely to be the problem. For
> instance, you have guessed that another *dm is your problem, so try
> starting first xdm and then gdm, then the other way around. If you make
> a crash, congratulations!
> Oh, to start a script, e.g. S99gdm, use:
> $ ./S99gdm start
> S99rc.local actually runs /etc/rc.local which might have anything in it,
> so that is worth looking in to. You should probably look at
> /etc/rc.local and see what it is doing.
> Scripts with other numbers are possible too -- just less likely -- so
> you may want to try them if you don't find the problem in the "good
> candidates" first.
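
Since the machine locks up hard when this goes wrong, it may help to
record each step of the step 7 experiment on disk before running it. A
minimal sketch, assuming the scripts accept "start" as shown above;
start_in_order and the log path are names invented for illustration.

```shell
# start_in_order LOG DIR NAME...: start each DIR/NAME in the order given.
# A line is appended to LOG and flushed with sync before each script runs,
# so after a hard lockup the log shows which script was started last.
start_in_order() {
    log=$1
    dir=$2
    shift 2
    for name in "$@"; do
        echo "starting $name" >> "$log"
        sync
        "$dir/$name" start >> "$log" 2>&1
        echo "$name exited with status $?" >> "$log"
    done
}
```

For example, "start_in_order /root/dm-test.log /etc/rc2.d S99xdm S99gdm",
then the same with the two names swapped after the next reboot.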
> Hopefully helpful,
I think it will be, when I get the machine to myself again.
> >>>Could it be that the *dm is interfering with gdm starting up?
> >>>Maybe it's whatever it does *after* trying its hand with the *dm's
> >>> that is the culprit? Anyone know what that is?
> >>>Should I try making another *dm the default?
> >>>Should I try purging the other *dm's?
> >>>Should I try purging gdm?
> >>>Should I try running a general update of everything just in case?
> >>as Andre said, /etc/init.d/gdm stop.
> >>then I'd get rid of the links for the moment so you can actually work on
> >>the thing: update-rc.d -f gdm remove && update-rc.d -f kdm remove and so
> >>forth. Then you can use startx as a user and see what happens.
> >Might be easier just to do this in maintenance mode, which doesn't start
> >the things in the first place.
> >There's a point -- in the two-Debian philosophy of system maintenance,
> >is there any way of using, say, aptitude running on one system to
> >install, uninstall, configure and so forth the other?
> >It suddenly struck me as potentially useful. Doesn't the installer do
> >something like this, starting from a RAMdisk?
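
One common way to get this effect is chroot: mount the other system's
root partition from the working system and run aptitude inside it. The
sketch below is untested on this setup; the device and mount point in
the usage line are placeholders, and run and other_system are helper
names made up here. With DRY_RUN=1 it only prints what it would do.

```shell
# run COMMAND...: execute the command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# other_system DEVICE MOUNTPOINT COMMAND...: mount the other install's
# root filesystem, run COMMAND inside it via chroot, then unmount again.
other_system() {
    dev=$1
    mnt=$2
    shift 2
    run mkdir -p "$mnt"
    run mount "$dev" "$mnt"
    run mount -t proc proc "$mnt/proc"   # many tools expect /proc inside the chroot
    run chroot "$mnt" "$@"
    run umount "$mnt/proc"
    run umount "$mnt"
}
```

A dry run would look like
"DRY_RUN=1 other_system /dev/hda2 /mnt/other aptitude update"; dropping
DRY_RUN=1 and running as root performs the steps for real.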
> >-- hendrik