
Bug#627019: several kernel hangs before getting to login





On Friday, December 23, 2011 6:54 PM, Jonathan Nieder wrote:
>Hi Will,
>
>Will Set wrote:
>
>> I was able to take three pictures of the boot messages by scrolling up the
>> boot buffer from the login prompt, while booting 3.2.0-rc4-686-pae
>> to illustrate what I did my best to explain yesterday.
>>
>> I'll also attach the dmesg.udev-2 and acpidump-udev-2
>
>Thanks.
>
>If I understand you correctly, udev 175-2 segfaults at boot.

No, not always a segfault.

Sometimes udev just hangs, leaving the machine without keyboard access.
And it's way too early in the boot process to get normal network connectivity.

Other times the kernel will panic.
And when the kernel panics I'm not able to save any data from the boot buffer
other than the screen full of data showing when the boot buffer finishes
sending the trace data to the buffer.

Boot also fails in at least one other way,
where I can see a "udev settle" message and messages showing the /sys directory structure.
When this type of failure happens I am able to log in and use the system console.
But if I start the X server under these conditions I have no keyboard or mouse.

These failures have not changed much since I initially reported this.
But I have seen the failures so many times now that I'm a bit less confused by them.


> udev 175-3 does _not_ segfault,

No, udev 175-3 also segfaults, iirc,
but I have not yet re-upgraded udev to 175-3 to test exactly what it shows.

>but the boot fails in some way unless you
>add processor.nocst=1 to the kernel command line. 

Yes.
Adding processor.nocst=1 has always worked for me on all affected kernels I've tested so far.

But the boot fails consistently when using udev 175-3 from unstable with 3.2.0-rc4-686-pae
and without processor.nocst=1 added to the boot command.
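
For anyone repeating these tests, it helps to confirm which boots actually had the workaround applied. Here is a minimal sketch, assuming Python is available on the test machine, that checks a kernel command line for processor.nocst=1. The function names and the sample command-line strings are made up for illustration; only /proc/cmdline and the parameter itself come from the real system.

```python
from typing import Optional


def has_kernel_param(cmdline: str, name: str, value: Optional[str] = None) -> bool:
    """Return True if `name` (optionally set to `value`) is on the command line."""
    for token in cmdline.split():
        # Split "key=value" tokens; bare flags such as "ro" have no "=".
        key, sep, val = token.partition("=")
        if key != name:
            continue
        if value is None or (sep == "=" and val == value):
            return True
    return False


def read_running_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line of the currently running kernel (Linux only)."""
    with open(path) as f:
        return f.read().strip()


# Illustrative command lines, not taken from the affected machine.
with_workaround = "root=/dev/sda1 ro processor.nocst=1"
clean = "root=/dev/sda1 ro quiet"

print(has_kernel_param(with_workaround, "processor.nocst", "1"))
print(has_kernel_param(clean, "processor.nocst", "1"))
```

On a booted system, has_kernel_param(read_running_cmdline(), "processor.nocst", "1") would report whether that particular boot used the workaround.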

>Which is already
>weird, since the only advertised changes in 175-3 were a fix to the
>systemd service file and a fix to udev rules for Xen support.  Based
>on the kernel log you sent, you are not using systemd, and I assume
>you're not using Xen.

Please understand that a failed boot always appears, at least from what I can see here,
to have something to do with udev.

>
>This is on the machine with a D865GBF motherboard.

No.
This report is, and always will be, about the Intel D865GRH mobo.
My other mobo is an Intel D865PERLK.

There is another Debian user who has an Intel D865GBF mobo
with a very similar Debian bug report filed.

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=631597
[    9.132009] Pid: 311, comm: modprobe Tainted: G      D 2.6.39-2-686-pae #1                  /D865GBF

And this user has also filed a bug report upstream after Ben requested he do so.

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=631597#15

>
>Anyway, you were able to take advantage of this situation to get an acpidump.
>
>Are these results reproducible?

Yes, but the failure is not consistently the same one.
I had three failed boot attempts today while testing with a clean kernel command line,
i.e. processor.nocst=1 was not added to the command line on any of my 4 boot attempts today.
The fourth time the machine booted to a usable state.

>

I hope you can find some clues in this email that will make this issue less weird.
As always, I'll do my best to get timely responses back to you, even though I have been busy
elsewhere recently.
I've not had my usual amount of time to devote to testing and learning about the kernel.

Best Regards,
Will

>Hope that helps,
>Jonathan
>




