
Re: The crippled resurrection of said etch.

On Thu, Oct 26, 2006 at 10:25:37AM -0400, Matthew Krauss wrote:
> Hi,  I haven't been following this too closely so I may be missing 
> something, but this message caught my eye.  I'm not sure how experienced 
> you are, so I will try to be very explicit -- if I tell you things you 
> think are obvious, please forgive me.
> hendrik@topoi.pooq.com wrote:
> >On Wed, Oct 25, 2006 at 12:34:02PM -0700, Andrew Sackville-West wrote:
> >  
> >>hendrik@topoi.pooq.com wrote:
> >>
> >>    
> >>>Four more reboots, one successful.
> >>>It seems to be a problem starting gdm.
> >>>      
> Hmm... It sounds like a race condition, obviously.  From what I have 
> read in this thread, I would guess that there is a very good probability 
> that you have an old startup script lying around from a package that 
> has been otherwise removed or upgraded.
> >>it could be an X problem or a gdm problem, but probably I'd guess X.
> >>    
> >>>It tells me it's starting gdm,
> >>>then that it's not starting kdm because it's not the default,
> >>>then that it's not starting (presumably another *dm) because it's 
> >>>not the default
> >>>      
> >>that's normal. X does sanity checks to make sure you're not starting more
> >>than one session manager or whatever.
> >>    
> >I know that.  I just thought that the last message before the crash 
> >might be a clue to what went wrong -- such as an unfortunate race 
> >condition between gdm and whatever thing decides not to start the other.
> >But I admit this is unlikely.
> >
> >  
> >>>then the black screen of death, preventing me from reading which other 
> >>>*dm it was considering.
> >>>      
> >>are you locked up hard at that point or can you switch to a vt? 
> >>ctrl-alt-fx?
> >>    
> >
> >Locked up hard.  Though I suppose I should try ssh-ing in.
> >  
> When you say "black screen of death" I assume you mean a kernel panic?  
> If so, ssh-ing won't work.  Also, notably, a kernel panic should *never* 
> happen (theoretically!) -- it is always the result of either a kernel 
> bug or a hardware failure.  No user-space program should be able to 
> cause a kernel panic.
> What I would try to isolate the problem is:
> 1. Reboot into single-user mode.
> 2. Log in as root.
> 3. Try starting X alone:
>    $ X 2>&1 | less
>    3a. If X starts, you may kill it with ctrl-alt-backspace;

X starts.
Killed it with ctrl-alt-backspace

>    3b. If X does not start, you have the output to debug;
>    3c. If you get a kernel panic, you know you have serious X problems.
> 4. Next try starting gdm directly:
>    $ /etc/init.d/gdm start
>    4a. If gdm starts, there is probably a problem in your startup scripts;
>    4b. If gdm does not start, you can check the logs under /var/log/gdm/
>    4c. If you get a kernel panic, you know you have serious gdm problems.

gdm starts.  A cursor blinks in the upper left of a black screen, then 
the cursor disappears, leaving the black screen of death.
Did a hard reset to reboot.  No trace of a log file.
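After a reboot like this, one way to look for traces is to list whatever was written under the log directory shortly before the reset. The following is only a sketch: the helper name recent_logs and its defaults are made up for illustration, and it assumes a find that supports -mmin (GNU find does). Likely places to look are /var/log/gdm/, mentioned above, and, as an assumption about a typical X.org setup, /var/log/Xorg.0.log. Writes that were never flushed to disk before the hard reset may simply be missing.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under a log directory that were
# modified within the last N minutes, so that anything written just
# before a hard reset is easy to spot.
# Defaults (/var/log, 30 minutes) are illustrative, not from the thread.
recent_logs() {
    dir=${1:-/var/log}
    minutes=${2:-30}
    # -mmin -N matches files modified less than N minutes ago
    find "$dir" -type f -mmin "-$minutes" 2>/dev/null | sort
}
```

For example, running "recent_logs /var/log 15" right after rebooting would show which logs, if any, were touched during the failed start.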

try /etc/init.d/kdm start

it refuses; kdm is not default.

dpkg-reconfigure kdm
and make kdm the default.


/etc/init.d/kdm start

acts just like /etc/init.d/gdm did before -- black screen of death

reset to reboot.

This time, after the usual environment checking, it starts a maintenance 
shell with a shorter path:


As a result, lots of commands don't work.  Supplying the path 


fails.  Its first error message reports that it cannot execute the 
'locale' command.  True enough. 'locale' is not on the path.
This looks like a bug in dpkg-reconfigure -- shouldn't scripts that are 
executed as root specify their command names a little more explicitly?

Something is wrong with the maintenance shell this time -- why has its 
$PATH suddenly changed?
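A possible workaround while in the maintenance shell is to put the usual system directories back on $PATH by hand before running dpkg-reconfigure. This is a minimal sketch, not a diagnosis of why the path changed: the function names add_system_paths and where_is are hypothetical, and the directory list is an assumption about a typical Debian layout.

```shell
#!/bin/sh
# Hypothetical helper: append the usual system directories to PATH if
# they are missing. The directory list is an assumption about a typical
# Debian layout, not something taken from the maintenance shell.
add_system_paths() {
    for d in /usr/local/sbin /usr/local/bin /usr/sbin /usr/bin /sbin /bin; do
        case ":$PATH:" in
            *":$d:"*) ;;                      # already present, leave it
            *) PATH="${PATH:+$PATH:}$d" ;;    # otherwise append it
        esac
    done
    export PATH
}

# Hypothetical helper: report where a command would be found, or say
# that it is not on PATH.
where_is() {
    command -v "$1" || echo "$1: not found on PATH"
}
```

Calling add_system_paths and then "where_is locale" would show whether 'locale' becomes reachable once the path is restored.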

-- hendrik

> In the case of 4a., where you have a problem in your startup scripts:
> 5. Kill gdm -- use ctrl-alt-F1 to return to your terminal, and issue:
>    $ /etc/init.d/gdm stop
> 6. Switch to the default runlevel's rc directory and ls it:
>    $ cd /etc/rc2.d
>    $ ls
>    See all the links named S##*
>        .. where ## is a number
>        .. and * is the rest of the name?
>    At startup, these are all started in the order of the ## numbers.
>    Scripts with the same number as gdm start at the same time.
>    These are good candidates for a race condition.
>    For instance, I have:
>        S99gdm
>        S99rc.local
>        S99rmnologin
>        S99stop-bootlogd
>    You probably have all of these, plus:
>       S99xdm
>       S99kdm
>    .. and others?
> 7. Try starting up the scripts with the same number as gdm in various 
> orders. Consider which ones sound likely to be the problem.  For 
> instance, you have guessed that another *dm is your problem, so try 
> starting first xdm and then gdm, then the other way around.  If you make 
> a crash, congratulations!
> Oh, to start a script, i.e. S99gdm, use:
>    $ ./S99gdm start
> S99rc.local actually runs /etc/rc.local which might have anything in it, 
> so that is worth looking into. You should probably look at 
> /etc/rc.local and see what it is doing.
> Scripts with other numbers are possible too -- just less likely -- so 
> you may want to try them if you don't find the problem in the "good 
> candidates" first.
> Hopefully helpful,
> Matthew
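The inspection in steps 6 and 7 above, finding the links that share gdm's sequence number, could be sketched as a small shell function. The name same_seq_links is made up for illustration. As far as I know, a stock sysv-rc setup runs links with the same number one after another in the order the shell glob returns them, rather than truly at the same time, so the listing order is worth noting too; treat that as an assumption to check against your own /etc/init.d/rc.

```shell
#!/bin/sh
# Hypothetical helper: print the S-links in an rc directory that share a
# sequence number with the link for a given service, in the order the
# shell glob returns them.
same_seq_links() {
    rcdir=$1
    service=$2
    for link in "$rcdir"/S[0-9][0-9]"$service"; do
        # Skip the unexpanded pattern when no such link exists
        [ -e "$link" ] || [ -L "$link" ] || continue
        name=${link##*/}           # e.g. S99gdm
        seq=${name%"$service"}     # e.g. S99
        for peer in "$rcdir"/"$seq"*; do
            echo "${peer##*/}"
        done
    done
}
```

For example, "same_seq_links /etc/rc2.d gdm" would list S99gdm together with the other S99 links, which are the candidates to try in different orders.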
> >  
> >>>Could it be that the *dm is interfering with gdm starting up?
> >>>Maybe it's whatever it does *after* trying its hand with the *dm's 
> >>>  that is the culprit?  Anyone know what that is?
> >>>Should I try making another *dm the default?
> >>>Should I try purging the other *dm's?
> >>>Should I try purging gdm?
> >>>Should I try running a general update of everything just in case?
> >>>
> >>>      
> >>as Andre said, /etc/init.d/gdm stop.
> >>
> >>then I'd get rid of the links for the moment so you can actually work on
> >>the thing: update-rc.d gdm -f remove && update-rc.d kdm -f remove and so
> >>forth. Then you can use startx as a user and see what happens.
> >>    
> >
> >Might be easier just to do this in maintenance mode, which doesn't start 
> >the things in the first place.
> >
> >There's a point -- in the two-Debian philosophy of system maintenance, 
> >is there any way of using, say, aptitude running on one system to 
> >install, uninstall, configure and so forth the other?
> >It suddenly struck me as potentially useful.  Doesn't the installer do 
> >something like this, starting from a RAMdisk?
> >
> >-- hendrik
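One common approach to the question above is to mount the other system's root filesystem and run its own aptitude inside a chroot. The sketch below only prints the commands rather than running them, since the real ones need root and the right device. The function name plan_chroot, the device and the mount point are all placeholders, not values from this thread.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the commands for working on
# another installed system from the running one via chroot.
# The device and mount point are placeholders supplied by the caller.
plan_chroot() {
    dev=$1
    mnt=$2
    echo "mount $dev $mnt"
    echo "mount --bind /proc $mnt/proc"
    echo "mount --bind /dev $mnt/dev"
    echo "chroot $mnt aptitude"
    echo "umount $mnt/dev $mnt/proc $mnt"
}
```

Reviewing the printed commands first, then running them by hand as root, keeps a mistyped device name from doing any damage.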
> >
> >
> >  
