
Re: Pinpointing faulty kernel driver?



On Thu, 05 Apr 2012 22:40:15 +0200, lskovlun wrote:

> On Apr 5, 2012 16:12 "Camaleón" <noelamac@gmail.com> wrote:
> 
> 
> 
>> You can send the output to another machine using a serial cable and
>> instructing the kernel to dump the message there. I had to do this years
>> ago to debug a kernel crash from a VM but to be sincere, I doubt I can
>> nowadays repeat that milestone, I don't remember the steps :-)
>>
>> Look, Ubuntu has some good doc about this:
>>
>> Kernel Debugging Tricks
>> <https://wiki.ubuntu.com/Kernel/KernelDebuggingTricks>
>>
>>
>>
> Thank you for this reference! It is indeed an awesome reference. I tried
> getting netconsole to work, because I don't have a serial-to-USB cable.
> I will be looking into this problem over the next few days. Perhaps I
> will have to buy such a cable.
> 
> There is also, according to the InitramfsDebug Debian wiki page, a way
> to get log data in /dev/.initramfs/initramfs.debug. I can not make this
> work either, for some reason.
> 
> I tried the boot_delay option, but the delay seems to revert to full
> speed at a certain point in the boot sequence, so it is no use. 

To get verbose logs, remember that you have to remove "quiet" from the kernel 
command line, so based on your logs below, it should be something like:

vmlinuz-3.1.0-1-amd64 root=UUID=b34744e5-2c76-44f3-a7b1-a2fed3ec430e boot_delay=1000
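
If you prefer not to edit the line by hand each time, the same change can be
sketched in a few lines of Python. This is only an illustration: the function
name verbose_cmdline is made up, and the "ro quiet" at the end of the example
string is my assumption about what your original line contains.

```python
def verbose_cmdline(cmdline, boot_delay_ms=1000):
    """Return the kernel command line without "quiet" and with boot_delay set.

    Hypothetical helper: it only rewrites a string, it does not touch
    the boot loader configuration.
    """
    kept = [
        param
        for param in cmdline.split()
        # Drop "quiet" and any earlier boot_delay so that it is set only once.
        if param != "quiet" and not param.startswith("boot_delay=")
    ]
    kept.append(f"boot_delay={boot_delay_ms}")
    return " ".join(kept)


if __name__ == "__main__":
    # "ro quiet" is assumed here; replace it with what your boot entry has.
    original = (
        "vmlinuz-3.1.0-1-amd64 "
        "root=UUID=b34744e5-2c76-44f3-a7b1-a2fed3ec430e ro quiet"
    )
    print(verbose_cmdline(original))
```

The result still has to be typed or pasted into the boot loader entry.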

> There is odd behavior when the crash happens: The visible screen area seems 
> to "scroll back" so that when the hang occurs, the timestamps show
> approximately 64.xxxxxx seconds (but actually, the crash occurs at
> 68.xxxxxx seconds).
> 
> On a hunch, I tried setting acpi=off - and now the new kernel boots! 

That may indeed be relevant, but I can't tell exactly what it means, nor
which of the features that ACPI enables is not to the liking of the 
updated kernel.

I would open a bug report in Debian explaining the issue and adding this 
information along with the workaround you've found. Kernel hackers will 
help you to diagnose the problem.

> Of course, this is a sub-optimal solution, so I got dmesg listings from
> both the old and the new kernel.

Have you tried "rootdelay=9"? This was recommended for Squeeze with specific 
hard disk layouts or hardware, but it may also help here.
 
> The diff is here: http://pastebin.com/L4YXTmJh
> 
> I would be very grateful if you'd take a look.

I will, but that goes beyond my knowledge :-)

There are many differences between the two boot logs, which is normal, but it 
seems the nouveau driver is completely missing when you boot with no ACPI, is 
that right? What VGA driver is loaded?
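
Because every dmesg line starts with a timestamp, a plain diff flags almost
every line even when the message itself is the same. Here is a rough Python
sketch that strips the timestamps before comparing and lists the lines that
mention a given driver. The function names and the two sample logs are made
up for illustration; they are not taken from your pastebin.

```python
import difflib
import re

# Matches a leading dmesg timestamp such as "[   68.123456] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def strip_timestamps(lines):
    """Remove the leading "[ seconds ]" prefix from each log line."""
    return [TIMESTAMP.sub("", line.rstrip("\n")) for line in lines]


def diff_logs(old_lines, new_lines):
    """Unified diff of two boot logs, ignoring timestamps."""
    return list(
        difflib.unified_diff(
            strip_timestamps(old_lines),
            strip_timestamps(new_lines),
            "old", "new", lineterm="",
        )
    )


def mentions(lines, needle="nouveau"):
    """Return the log lines (without timestamps) that contain needle."""
    return [l for l in strip_timestamps(lines) if needle in l.lower()]


if __name__ == "__main__":
    # Made-up sample lines, only to show the shape of the input.
    with_acpi = ["[    1.000000] sample line A\n",
                 "[   12.500000] nouveau: sample line\n"]
    without_acpi = ["[    1.200000] sample line A\n"]
    print("\n".join(diff_logs(with_acpi, without_acpi)))
    print(mentions(without_acpi))
```

With real logs you would read each file with readlines() and pass the two
lists in.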

You can find additional tips to read (and experiment with) here:

Debug: How to Isolate Linux ACPI Issues
http://www.lesswatts.org/projects/acpi/debug.php

Greetings,

-- 
Camaleón

