
Bug#501742: marked as done (linux-image-2.6.26-1-amd64: Random hangs/slowness and forcedeth problem)



Your message dated Tue, 1 Sep 2009 19:18:27 +0200
with message-id <20090901171827.GA22152@inutil.org>
and subject line Re: linux-image-2.6.26-1-amd64: Random hangs/slowness and forcedeth problem
has caused the Debian Bug report #501742,
regarding linux-image-2.6.26-1-amd64: Random hangs/slowness and forcedeth problem
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
501742: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=501742
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: linux-image-2.6.26-1-amd64
Version: 2.6.26-5
Severity: important

On versions before 2.6.26 I have been getting lots of messages like
this:

  eth0: too many iterations (6) in nv_nic_irq

Apart from filling up the log, there has been no noticeable impact on the
system.

After upgrading to 2.6.26, the system started to misbehave. It would
work for a few hours, and then it would slow down to the point where a
simple command could take several minutes to complete.  Finally, it
would become totally unresponsive leaving the reset button as the only
option.

Browsing through the bug reports, it looked like the hpet problem, so I
tried booting with hpet=disable. With this kernel option the system
worked for an hour and then the network stopped working with this
message in the log:

  eth0: too many iterations (6) in nv_nic_irq.
  NETDEV WATCHDOG: eth0: transmit timed out
  eth0: Got tx_timeout. irq: 00000032
  eth0: Ring at 7d084000
  eth0: Dumping tx registers
  <register dump>
  eth0: Dumping tx ring
  <more dumps>
  eth0: tx_timeout: dead entries
  ------------[ cut here ]------------
  WARNING: at net/sched/sch_generic.c:222 dev_watchdog+0xa6/0xfb()
  Modules linked in: xt_limit xt_state ipt_REJECT xt_tcpudp
  ipt_MASQUERADE iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4
  nf_conntrack iptable_filter ip_tables x_tables video output ac battery
  nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc ipv6 it87 hwmon_vid
  loop parport_pc parport snd_hda_intel pcspkr k8temp usblp snd_pcm
  snd_timer snd soundcore snd_page_alloc i2c_nforce2 i2c_core button
  evdev ext3 jbd mbcache raid1 md_mod ide_cd_mod cdrom sd_mod
  ide_pci_generic jmicron usb_storage amd74xx ide_core floppy ahci
  ohci1394 ieee1394 forcedeth ata_generic sata_nv libata scsi_mod
  ehci_hcd dock ohci_hcd thermal processor fan thermal_sys
  Pid: 0, comm: swapper Not tainted 2.6.26-1-amd64 #1
  Call Trace:
  <IRQ>  [<ffffffff80234878>] warn_on_slowpath+0x51/0x7a
  [<ffffffffa009bf69>] :forcedeth:reg_delay+0x40/0x8a
  [<ffffffffa009cb2f>] :forcedeth:nv_drain_tx+0xb4/0x186
  [<ffffffffa00a11c7>] :forcedeth:nv_tx_timeout+0x1fb/0x2a4
  [<ffffffff803cbd6a>] dev_watchdog+0x0/0xfb
  [<ffffffff803cbe10>] dev_watchdog+0xa6/0xfb
  [<ffffffff803cbd6a>] dev_watchdog+0x0/0xfb
  [<ffffffff8023c861>] run_timer_softirq+0x16a/0x1e2
  [<ffffffff80248bef>] ktime_get+0xc/0x41
  [<ffffffff8023922f>] __do_softirq+0x5c/0xd1
  [<ffffffff8020d29c>] call_softirq+0x1c/0x28
  [<ffffffff8020f37c>] do_softirq+0x3c/0x81
  [<ffffffff8023918f>] irq_exit+0x3f/0x83
  [<ffffffff8021a9eb>] smp_apic_timer_interrupt+0x8c/0xa4
  [<ffffffff8020b0a3>] default_idle+0x0/0x49
  [<ffffffff8020ccc2>] apic_timer_interrupt+0x72/0x80
  <EOI>  [<ffffffff8021a797>] lapic_next_event+0x0/0x13
  [<ffffffff8021eb20>] native_safe_halt+0x2/0x3
  [<ffffffff8021eb20>] native_safe_halt+0x2/0x3
  [<ffffffff8020b0cd>] default_idle+0x2a/0x49
  [<ffffffff8020ac79>] cpu_idle+0x89/0xb3
  ---[ end trace 314e3fb7eb127ca0 ]---

I don't know whether the behaviours with and without hpet=disable are
symptoms of the same problem or two different bugs.

The other network interface on this MB (Asus M2N-SLI Deluxe) also uses
forcedeth, but doesn't report any problems.

This is a production server/firewall, and I wasn't able to take any more
downtime, so when hpet=disable didn't work, I reverted to a previous
kernel (2.6.24-7). Apart from the "normal" error messages ("too many
iterations...") the system has been stable for three days now. 
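
For anyone comparing kernels on similar hardware, here is a minimal sketch
of how the two messages quoted above could be counted per interface in a
saved kernel log. It is an illustration only: the function name, the
pattern labels and the sample lines are assumptions, and the regular
expressions are based solely on the log excerpts in this report.

```python
import re
from collections import Counter

# Patterns derived from the log excerpts quoted in this report.
# The dictionary keys are illustrative labels, not kernel terminology.
PATTERNS = {
    "too_many_iterations": re.compile(
        r"(\w+): too many iterations \(\d+\) in nv_nic_irq"
    ),
    "tx_timeout": re.compile(
        r"NETDEV WATCHDOG: (\w+): transmit timed out"
    ),
}


def count_forcedeth_events(lines):
    """Count matching messages per (event, interface) pair."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(name, match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample input; in practice, pass the lines of a log file.
    sample = [
        "eth0: too many iterations (6) in nv_nic_irq.",
        "NETDEV WATCHDOG: eth0: transmit timed out",
        "eth0: Got tx_timeout. irq: 00000032",
    ]
    for (event, iface), n in sorted(count_forcedeth_events(sample).items()):
        print(f"{iface} {event}: {n}")
```

Running the same count against logs from 2.6.24 and 2.6.26 would show
whether the "too many iterations" rate changed between the two kernels.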

-- Package-specific info:

-- System Information:
Debian Release: lenny/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.24-1-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-image-2.6.26-1-amd64 depends on:
ii  debconf [debconf-2.0]         1.5.22     Debian configuration management sy
ii  initramfs-tools [linux-initra 0.92j      tools for generating an initramfs
ii  module-init-tools             3.4-1      tools for managing Linux kernel mo

linux-image-2.6.26-1-amd64 recommends no packages.

Versions of packages linux-image-2.6.26-1-amd64 suggests:
ii  grub                          0.97-47    GRand Unified Bootloader (Legacy v
pn  linux-doc-2.6.26              <none>     (no description available)

-- debconf information:
  linux-image-2.6.26-1-amd64/postinst/create-kimage-link-2.6.26-1-amd64: true
  shared/kernel-image/really-run-bootloader: true
  linux-image-2.6.26-1-amd64/postinst/kimage-is-a-directory:
  linux-image-2.6.26-1-amd64/preinst/bootloader-initrd-2.6.26-1-amd64: true
  linux-image-2.6.26-1-amd64/postinst/old-initrd-link-2.6.26-1-amd64: true
  linux-image-2.6.26-1-amd64/preinst/initrd-2.6.26-1-amd64:
  linux-image-2.6.26-1-amd64/postinst/old-system-map-link-2.6.26-1-amd64: true
  linux-image-2.6.26-1-amd64/postinst/depmod-error-initrd-2.6.26-1-amd64: false
  linux-image-2.6.26-1-amd64/preinst/overwriting-modules-2.6.26-1-amd64: true
  linux-image-2.6.26-1-amd64/preinst/elilo-initrd-2.6.26-1-amd64: true
  linux-image-2.6.26-1-amd64/postinst/bootloader-error-2.6.26-1-amd64:
  linux-image-2.6.26-1-amd64/preinst/abort-install-2.6.26-1-amd64:
  linux-image-2.6.26-1-amd64/preinst/lilo-initrd-2.6.26-1-amd64: true
  linux-image-2.6.26-1-amd64/postinst/depmod-error-2.6.26-1-amd64: false
  linux-image-2.6.26-1-amd64/prerm/removing-running-kernel-2.6.26-1-amd64: true
  linux-image-2.6.26-1-amd64/prerm/would-invalidate-boot-loader-2.6.26-1-amd64: true
  linux-image-2.6.26-1-amd64/postinst/bootloader-test-error-2.6.26-1-amd64:
  linux-image-2.6.26-1-amd64/preinst/abort-overwrite-2.6.26-1-amd64:
  linux-image-2.6.26-1-amd64/postinst/old-dir-initrd-link-2.6.26-1-amd64: true
  linux-image-2.6.26-1-amd64/preinst/lilo-has-ramdisk:
  linux-image-2.6.26-1-amd64/preinst/failed-to-move-modules-2.6.26-1-amd64:



--- End Message ---
--- Begin Message ---
Version: 2.6.28-1

On Mon, Dec 29, 2008 at 06:01:57PM +0100, Moritz Muehlenhoff wrote:
> Per Foreby wrote:
> > Sorry, but I gave up and replaced the MB, so the test platform is no
> > longer available. The new hardware with a realtek nic is running the
> > latest 2.6.26 without any problems.
> 
> Ok, I'm leaving the bug open, in case someone else owns the hardware.

Marking as fixed in 2.6.28. If anyone has the hardware to test the
referenced patch, we can add it to a point release for Lenny.
 
Cheers,
       Moritz


--- End Message ---
