Bug#442877: forcedeth kernel panic
On Mon, Sep 17, 2007 at 05:49:04PM +0200, Petr Stehlik wrote:
> Package: linux-image-2.6.22-2-amd64
> Version: 2.6.22-4
>
> A freshly built machine based on the ASUS M2N32 WS Pro motherboard, with
> two on-board GBit network adapters driven by the forcedeth driver and
> running 64-bit Etch, crashes reliably with stock Debian kernels 2.6.18
> (Etch), 2.6.21 (backports.org) and also 2.6.22 (Sid backported to Etch by
> apt-get -b source). The crash occurs under high network load generated
> by tserv from the dbench package, within about 20 minutes of a tserv run
> against this machine (which is running tserv_srv, as it is to be a Samba
> server).
>
> Before it crashes it fills the kernel log with the following messages
> that may or may not be related to the crash:
>
> Sep 17 14:51:27 harapes kernel: eth0: too many iterations (6) in nv_nic_irq.
> Sep 17 14:51:58 harapes last message repeated 1026 times
> Sep 17 14:52:59 harapes last message repeated 2063 times
> Sep 17 14:54:00 harapes last message repeated 2055 times
> Sep 17 14:55:01 harapes last message repeated 2044 times
>
> I wrote that it may not be related because I have here an older
> nForce4-based machine that is running tserv against the crashing server,
> and it also fills its log with the same messages - but fortunately it
> does not crash...
>
> The kernel panic for 2.6.22-2 looks as follows (hand-copied from a
> screenshot made with a digital camera) and is fatal - even SysRq doesn't
> work.
>
> Call Trace:
> <IRQ> :forcedeth: nv_nic_irq_optimized+0x89/0x22c
> handle_IRQ_event+0x25/0x53
> __do_softirq+0x55/0xc3
> handle_edge_irq+0xe4/0x127
> do_IRQ+0x6c/0xd5
> default_idle+0x0/0x3d
> ret_from_intr+0x0/0xa
> <EOI> default_idle+0x29/0x3d
> cpu_idle+0x8b/0xae
>
> Code: 8a 83 84 00 00 00 83 e0 f3 83 c8 04 88 83 84 00 00 00 83 7b
> RIP :forcedeth:nv_rx_process_optimized+0xe6/0x380
> Kernel panic - not syncing: Aiee, killing interrupt handler!
>
> As I said, this crash is reliable; I managed to kill the machine several
> times in a row. Right now, though, I am testing a different setup - the
> forcedeth driver loaded with the "optimization_mode=1" parameter - and so
> far (65 minutes of tserv run) it hasn't crashed...

Does this error still occur with more recent kernel versions?

Cheers,
Moritz
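
For anyone retesting, it may help to quantify how fast the "too many iterations" warning floods the log. The following is a rough sketch, not part of the original report: it parses the five syslog lines quoted above and derives a messages-per-second rate from syslog's "last message repeated N times" summaries. The function name repeat_rates and the year 2007 (taken from the date of the report, since syslog timestamps carry no year) are assumptions made for illustration.

```python
import re
from datetime import datetime

# The syslog excerpt quoted in the report, copied verbatim.
LOG = """\
Sep 17 14:51:27 harapes kernel: eth0: too many iterations (6) in nv_nic_irq.
Sep 17 14:51:58 harapes last message repeated 1026 times
Sep 17 14:52:59 harapes last message repeated 2063 times
Sep 17 14:54:00 harapes last message repeated 2055 times
Sep 17 14:55:01 harapes last message repeated 2044 times
"""

REPEAT = re.compile(r"last message repeated (\d+) times")


def repeat_rates(log_text, year=2007):
    """Return (count, seconds, per_second) for each 'repeated' summary line.

    The interval is measured from the previous log line's timestamp.
    The year is an assumption, as syslog timestamps do not include one.
    """
    rates = []
    prev = None
    for line in log_text.splitlines():
        # The classic syslog timestamp occupies the first 15 characters.
        stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        match = REPEAT.search(line)
        if match and prev is not None:
            count = int(match.group(1))
            seconds = (stamp - prev).total_seconds()
            rates.append((count, seconds, count / seconds))
        prev = stamp
    return rates


if __name__ == "__main__":
    rates = repeat_rates(LOG)
    for count, seconds, per_second in rates:
        print(f"{count} messages in {seconds:.0f} s = {per_second:.1f}/s")
    total = sum(r[0] for r in rates)
    span = sum(r[1] for r in rates)
    print(f"overall: {total} messages in {span:.0f} s = {total / span:.1f}/s")
```

On the quoted excerpt this works out to a steady rate of roughly 33 to 34 warnings per second over the 214 seconds shown. Comparing that figure between kernel versions, or between the default setup and optimization_mode=1, might indicate whether the warning flood and the panic change together.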