
Re: System freeze (Update, Close to the reason)



On Sun, Nov 26, 2006 at 05:07:40PM +0100, Hans-J. Ullrich wrote:
> Dear maintainers, 
> 
> I have some news about the system freezes that have been puzzling us all, 
> which will hopefully help.
> 
> This is what I found out (after many, many freezes):
> 
> 1. Always(!), shortly before the freeze, the broadcom module hangs and is 
> reloaded. Please keep this in mind for the explanation below; it is 
> important.
> 
> Now take a look at my kern.log and read my comments on it carefully. The 
> comments are marked with "xxx".
> 
> ----------  snip ------
> Nov 26 16:23:07 protheus2 kernel: APIC error on CPU0: 40(40)
> Nov 26 16:28:07 protheus2 kernel: APIC error on CPU0: 40(40)
> Nov 26 16:33:09 protheus2 kernel: APIC error on CPU0: 40(40)
> 
> xxx You see the APIC errors, but they have occurred before without a crash!
> 
> Nov 26 16:33:21 protheus2 kernel: NETDEV WATCHDOG: eth0: transmit timed out
> 
> xxx Look, the network is going down!
> 
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Controller RESET (TX timeout) ...
> Nov 26 16:33:21 protheus2 kernel: ACPI: PCI interrupt for device 0000:06:05.0 
> disabled
> Nov 26 16:33:21 protheus2 kernel: PCI: Enabling device 0000:06:05.0 (0000 -> 
> 0002)
> Nov 26 16:33:21 protheus2 kernel: ACPI: PCI Interrupt 0000:06:05.0[A] -> GSI 
> 21 (level, low) -> IRQ 169
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Chip ID 0x4318, rev 0x2
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Number of cores: 4
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Core 0: ID 0x800, rev 0xd, vendor 
> 0x4243, enabled
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Core 1: ID 0x812, rev 0x9, vendor 
> 0x4243, enabled
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Core 2: ID 0x804, rev 0xc, vendor 
> 0x4243, enabled
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Core 3: ID 0x80d, rev 0x7, vendor 
> 0x4243, enabled
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: PHY connected
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Detected PHY: Version: 3, Type 2, 
> Revision 7
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Detected Radio: ID: 8205017f 
> (Manuf: 17f Ver: 2050 Rev: 8)
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Radio turned off
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Radio turned off
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Controller restarted
> 
> xxx As you can see, the module is loaded again!
> Nov 26 16:35:51 protheus2 kernel: klogd 1.4.1#20, log source = /proc/kmsg 
> started.
> 
> xxx And the last message is from the next boot (I had to do a hard reset).
> 
> --------- snap ---------------
> 
> O.k., you might now think the freeze is caused by the broadcom module.
> It is NOT! I will explain why:
> 
> When I was on the console, the broadcom module was reloaded, as you can see 
> above. Now here is the interesting part: 
> 
> X was started and the machine was running. Only kdm was started. I switched 
> to X (ALT + F7) and the mouse still moved, I could ssh into the machine, and 
> everything else worked fine, too. 
> 
> So I waited a few minutes, and nothing bad happened.
> 
> I switched from X to the console and back to X. All went fine!
> 
> Now the important thing: I wanted to start a window manager. I decided NOT 
> to choose KDE (as I have kismet configured in the applet bar), but a small 
> window manager: XFCE. As soon as I started it, the machine hung at once.
> 
> I could reproduce this behaviour several times. 
> 
> What can we conclude from this? IMO there must be a connection between the 
> freezing and the window manager itself. 
> 
> This excludes the kernel itself, Xorg, the fglrx driver, the broadcom module 
> and many other things. Please look at all the mails on this topic: you will 
> find many descriptions that rule out many things and support my 
> observations. 

It's not clear to me how you have excluded the kernel.

I use icewm and the framebuffer X server, and I get freezes too.  
Presumably this rules out both XFCE and icewm as the cause -- unless they 
share code or otherwise have the same bug.  I have the problem when I use 
XDMCP to log into a remote machine -- in that case only server-side 
X-related stuff is running on my machine.  I don't have the problem when 
I log into my machine remotely from elsewhere using XDMCP -- then only 
client-side X-related stuff is running on it.  So presumably it's 
server-side X.  That pretty well rules out window managers and such.

You have the problem with fglrx and ATI; I have it with the fb server 
and nvidia.  What is left?  What do we still have in common?
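
To compare notes across machines, something like the following could pull 
out what the kernel logged shortly before each forced reboot.  It is only 
a rough sketch under assumptions: it expects the syslog line layout seen 
in the quoted kern.log, treats a "klogd ..." line as the marker for a 
fresh boot, and the function names, the five-minute window and the 
default year are my own choices (syslog timestamps carry no year).

```python
import re
from datetime import datetime, timedelta

# Syslog-style kernel line: "Nov 26 16:33:21 host kernel: message"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d{1,2}\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+(.*)$"
)

# Substrings taken from the quoted log; extend as needed.
SUSPECT_MARKERS = ("APIC error", "NETDEV WATCHDOG", "Controller RESET")


def parse(lines, year=2006):
    """Yield (timestamp, message) for kernel lines; other lines are skipped."""
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m:
            # Assumption: the year is supplied by the caller, since the
            # log itself does not record it.
            ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            yield ts, m.group(2)


def events_before_boot(lines, window=timedelta(minutes=5), year=2006):
    """For each klogd start, collect suspect messages within `window` before it."""
    entries = list(parse(lines, year))
    report = []
    for boot_ts, msg in entries:
        # Assumption: a klogd banner means the machine has just booted.
        if not msg.startswith("klogd"):
            continue
        suspects = [
            (ts, text)
            for ts, text in entries
            if boot_ts - window <= ts < boot_ts
            and any(marker in text for marker in SUSPECT_MARKERS)
        ]
        report.append((boot_ts, suspects))
    return report
```

Run over each person's kern.log, e.g. 
events_before_boot(open("/var/log/kern.log")), it gives one list of 
candidate messages per boot, which makes it easier to see whether the 
same messages precede the freeze on machines with different hardware.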

-- hendrik

> 
> I hope this message will help to find the bug.
> 
> Best regards
> 
> Hans