Re: System freeze (Update, Close to the reason)
On Sun, Nov 26, 2006 at 05:07:40PM +0100, Hans-J. Ullrich wrote:
> Dear maintainers,
>
> I have some news on the system freezes that have been puzzling us all, which
> will hopefully help.
>
> This is what I found out (after many many freezes)
>
> 1. Always(!), shortly before the freeze, the broadcom module hangs and gets
> reloaded. Please keep this in mind for the explanation that follows; it is
> important.
>
> Now take a look at my kern.log and read my comments on it carefully. They
> are marked with "xxx".
>
> ---------- snip ------
> Nov 26 16:23:07 protheus2 kernel: APIC error on CPU0: 40(40)
> Nov 26 16:28:07 protheus2 kernel: APIC error on CPU0: 40(40)
> Nov 26 16:33:09 protheus2 kernel: APIC error on CPU0: 40(40)
>
> xxx You see the APIC errors, but they also occurred several times earlier without a crash!
>
> Nov 26 16:33:21 protheus2 kernel: NETDEV WATCHDOG: eth0: transmit timed out
>
> xxx Look, the network is going down !!!
>
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Controller RESET (TX timeout) ...
> Nov 26 16:33:21 protheus2 kernel: ACPI: PCI interrupt for device 0000:06:05.0 disabled
> Nov 26 16:33:21 protheus2 kernel: PCI: Enabling device 0000:06:05.0 (0000 -> 0002)
> Nov 26 16:33:21 protheus2 kernel: ACPI: PCI Interrupt 0000:06:05.0[A] -> GSI 21 (level, low) -> IRQ 169
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Chip ID 0x4318, rev 0x2
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Number of cores: 4
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Core 0: ID 0x800, rev 0xd, vendor 0x4243, enabled
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Core 1: ID 0x812, rev 0x9, vendor 0x4243, enabled
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Core 2: ID 0x804, rev 0xc, vendor 0x4243, enabled
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Core 3: ID 0x80d, rev 0x7, vendor 0x4243, enabled
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: PHY connected
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Detected PHY: Version: 3, Type 2, Revision 7
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Detected Radio: ID: 8205017f (Manuf: 17f Ver: 2050 Rev: 8)
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Radio turned off
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Radio turned off
> Nov 26 16:33:21 protheus2 kernel: bcm43xx: Controller restarted
>
> xxx As you can see, the module is loaded again !
> Nov 26 16:35:51 protheus2 kernel: klogd 1.4.1#20, log source = /proc/kmsg started.
>
> xxx And the last message is from the next boot (I had to do a hard reset)
>
> --------- snap ---------------
>
> O.k., you might now think the freeze is caused by the broadcom module.
> It is NOT!!! I will explain why:
>
> When I was on the console, the broadcom module was loaded again, as you see above.
> Now here is the interesting part:
>
> X was started and the machine was running. Only kdm was started. I switched to
> X (ALT + F7) and the mouse kept moving, I could ssh into the machine, and
> everything else worked fine, too.
>
> So I waited a few minutes, and nothing bad happened.
>
> I switched from X to console and back to X. All went fine !
>
> Now the important thing: I wanted to start a window manager. I decided NOT to
> choose KDE (as I have kismet configured in the applet bar), but a small
> window manager: XFCE. As soon as I started it, the machine hung at once.
>
> I could reproduce this behaviour several times.
>
> What can we conclude from this? IMO there must be a connection between the
> freezing and the window manager itself.
>
> This excludes the kernel itself, Xorg, the fglrx driver, the broadcom module
> and many other things. Please look at all the mails on this topic: you will
> find many descriptions that rule out many things and make my observations
> seem plausible.
It's not clear to me how you have excluded the kernel.
I use icewm, and the framebuffer X server, and get freezes. Presumably
this excludes both XFCE and icewm - unless they share code or otherwise
have the same bug. I have the problem when I use XDMCP to log into a
remote machine -- thus only server-side X-related stuff is running. I
don't have the problem when I log into my machine remotely from elsewhere
using XDMCP -- then only client-side X-related stuff is running. So
presumably it's server-side X. That pretty well rules out window
managers and such.
You have the problem with fglrx and ATI, I have it with the fb server
and nvidia. What is left? What do we still have in common?
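
One way to look for what we have in common would be to compare what each
of our kern.log files shows in the minutes before every restart. Below is a
rough sketch of that idea, not a tested tool. It assumes syslog-style lines
like the ones quoted above, with any lines wrapped by the mailer already
unwrapped, and it treats a "klogd ... started" line as the marker of a reboot.
The function name, the 5-minute window and the year argument are made up for
illustration.

```python
import re
from datetime import datetime, timedelta

# Month abbreviations as they appear in syslog timestamps.
MONTHS = {m: i for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches lines such as:
# "Nov 26 16:33:21 protheus2 kernel: NETDEV WATCHDOG: eth0: transmit timed out"
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d+) (\d\d):(\d\d):(\d\d) \S+ kernel: (.*)$")


def events_before_restarts(lines, window_minutes=5, year=2006):
    """For each klogd start, list the kernel messages logged shortly before it.

    Returns a list of (restart_time, [(time, message), ...]) tuples.
    syslog timestamps carry no year, so the caller has to supply one.
    """
    parsed = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group(1) not in MONTHS:
            continue  # skip non-kernel entries and anything unparseable
        mon, day, hh, mm, ss, msg = m.groups()
        stamp = datetime(year, MONTHS[mon], int(day), int(hh), int(mm), int(ss))
        parsed.append((stamp, msg))

    window = timedelta(minutes=window_minutes)
    result = []
    for i, (stamp, msg) in enumerate(parsed):
        if not msg.startswith("klogd"):
            continue  # only klogd start lines are treated as reboot markers
        recent = [(t, text) for t, text in parsed[:i]
                  if timedelta(0) <= stamp - t <= window
                  and not text.startswith("klogd")]
        result.append((stamp, recent))
    return result


if __name__ == "__main__":
    # /var/log/kern.log is the usual Debian location; adjust as needed.
    with open("/var/log/kern.log", errors="replace") as fh:
        for restart, recent in events_before_restarts(fh):
            print("restart at", restart)
            for t, text in recent:
                print("   ", t.time(), text)
```

If several of us ran something like this and the same message kept showing up
just before the restarts on otherwise different hardware, that would be a
candidate for the common factor.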
-- hendrik
>
> I hope this message will help to find the bug.
>
> Best regards
>
> Hans