Bug#777683: Disabling TSO may avoid the problem
I seem to be seeing the same problem on:
3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1 (2015-05-24) x86_64 GNU/Linux
with:
00:19.0 Ethernet controller: Intel Corporation 82579V Gigabit Network Connection (rev 05)
Subsystem: Intel Corporation Device 2002
Flags: bus master, fast devsel, latency 0, IRQ 44
Memory at fe600000 (32-bit, non-prefetchable) [size=128K]
Memory at fe628000 (32-bit, non-prefetchable) [size=4K]
I/O ports at f080 [size=32]
Capabilities: [c8] Power Management version 2
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [e0] PCI Advanced Features
Kernel driver in use: e1000e
The hang messages started after I rebooted into Jessie's kernel. Previously
the machine had been perfectly happy for years with Wheezy's kernel. The
machine has a second Realtek NIC that continues to work normally.
After a few days the messages increased in frequency and the network
interface stopped working altogether. After a reboot the interface
worked again, but the messages came back:
[ 291.030117] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <88>
TDT <8d>
next_to_use <8d>
next_to_clean <86>
buffer_info[next_to_clean]:
time_stamp <fffff592>
next_to_watch <88>
jiffies <fffff709>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 293.030124] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <88>
TDT <8d>
next_to_use <8d>
next_to_clean <86>
buffer_info[next_to_clean]:
time_stamp <fffff592>
next_to_watch <88>
jiffies <fffff8fd>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 295.030062] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <88>
TDT <8d>
next_to_use <8d>
next_to_clean <86>
buffer_info[next_to_clean]:
time_stamp <fffff592>
next_to_watch <88>
jiffies <fffffaf1>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 295.041303] ------------[ cut here ]------------
[ 295.041315] WARNING: CPU: 0 PID: 0 at /build/linux-QZaPpC/linux-3.16.7-ckt11/net/sched/sch_generic.c:264 dev_watchdog+0x236/0x240()
[ 295.041317] NETDEV WATCHDOG: eth-office (e1000e): transmit queue 0 timed out
[ 295.041319] Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc hid_generic usbhid hid x86_pkg_temp_thermal intel_powerclamp intel_rapl kvm_intel kvm iTCO_wdt iTCO_vendor_support ppdev evdev crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 ftdi_sio snd_hda_codec_hdmi usbserial lrw gf128mul glue_helper snd_hda_codec_realtek ablk_helper snd_hda_codec_generic psmouse i915 cryptd video snd_hda_intel drm_kms_helper drm pcspkr serio_raw parport_pc snd_hda_controller parport shpchp snd_hda_codec i2c_algo_bit snd_hwdep nuvoton_cir rc_core lpc_ich snd_pcm snd_timer mfd_core mei_me mei i2c_i801 i2c_core snd soundcore processor thermal_sys button w83627ehf hwmon_vid coretemp loop autofs4 ext4 crc16 mbcache jbd2 dm_mod raid1 md_mod sg sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul
[ 295.041380] crct10dif_common crc32c_intel ahci libahci libata scsi_mod xhci_hcd ehci_pci ehci_hcd r8169 mii e1000e usbcore ptp usb_common pps_core
[ 295.041393] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt11-1
[ 295.041395] Hardware name: /DH67BL, BIOS BLH6710H.86A.0156.2012.0615.1908 06/15/2012
[ 295.041396] 0000000000000009 ffffffff8150b405 ffff88031f203e28 ffffffff81067797
[ 295.041399] 0000000000000000 ffff88031f203e78 0000000000000001 0000000000000000
[ 295.041402] ffff88030ec78000 ffffffff810677fc ffffffff81777fb8 ffffffff00000030
[ 295.041404] Call Trace:
[ 295.041406] <IRQ> [<ffffffff8150b405>] ? dump_stack+0x41/0x51
[ 295.041417] [<ffffffff81067797>] ? warn_slowpath_common+0x77/0x90
[ 295.041420] [<ffffffff810677fc>] ? warn_slowpath_fmt+0x4c/0x50
[ 295.041425] [<ffffffff81074777>] ? mod_timer+0x127/0x1e0
[ 295.041430] [<ffffffff8143eb96>] ? dev_watchdog+0x236/0x240
[ 295.041433] [<ffffffff8143e960>] ? dev_graft_qdisc+0x70/0x70
[ 295.041436] [<ffffffff81072ae1>] ? call_timer_fn+0x31/0x100
[ 295.041439] [<ffffffff8143e960>] ? dev_graft_qdisc+0x70/0x70
[ 295.041442] [<ffffffff81074119>] ? run_timer_softirq+0x209/0x2f0
[ 295.041445] [<ffffffff8106c641>] ? __do_softirq+0xf1/0x290
[ 295.041448] [<ffffffff8106ca15>] ? irq_exit+0x95/0xa0
[ 295.041451] [<ffffffff81514455>] ? smp_apic_timer_interrupt+0x45/0x60
[ 295.041455] [<ffffffff8151253d>] ? apic_timer_interrupt+0x6d/0x80
[ 295.041456] <EOI> [<ffffffff81074a26>] ? get_next_timer_interrupt+0x1d6/0x250
[ 295.041465] [<ffffffff813ddf9f>] ? cpuidle_enter_state+0x4f/0xc0
[ 295.041468] [<ffffffff813ddf98>] ? cpuidle_enter_state+0x48/0xc0
[ 295.041472] [<ffffffff810a7fa8>] ? cpu_startup_entry+0x2f8/0x400
[ 295.041475] [<ffffffff81903071>] ? start_kernel+0x492/0x49d
[ 295.041478] [<ffffffff81902a04>] ? set_init_arg+0x4e/0x4e
[ 295.041480] [<ffffffff81902120>] ? early_idt_handlers+0x120/0x120
[ 295.041483] [<ffffffff8190271f>] ? x86_64_start_kernel+0x14d/0x15c
[ 295.041485] ---[ end trace aaf46f7eeccba58f ]---
[ 295.041502] e1000e 0000:00:19.0 eth-office: Reset adapter unexpectedly
[ 298.763518] e1000e: eth-office NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 548.999305] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <f3>
TDT <2f>
next_to_use <2f>
next_to_clean <f3>
buffer_info[next_to_clean]:
time_stamp <10000f073>
next_to_watch <f3>
jiffies <10000f2f7>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 550.999203] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <f3>
TDT <2f>
next_to_use <2f>
next_to_clean <f3>
buffer_info[next_to_clean]:
time_stamp <10000f073>
next_to_watch <f3>
jiffies <10000f4eb>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 552.999218] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <f3>
TDT <2f>
next_to_use <2f>
next_to_clean <f3>
buffer_info[next_to_clean]:
time_stamp <10000f073>
next_to_watch <f3>
jiffies <10000f6df>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 554.010452] e1000e 0000:00:19.0 eth-office: Reset adapter unexpectedly
[ 557.732375] e1000e: eth-office NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 1695.979614] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <5c>
TDT <f1>
next_to_use <f1>
next_to_clean <5c>
buffer_info[next_to_clean]:
time_stamp <1000550c4>
next_to_watch <5c>
jiffies <100055318>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 1697.979546] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <5c>
TDT <f1>
next_to_use <f1>
next_to_clean <5c>
buffer_info[next_to_clean]:
time_stamp <1000550c4>
next_to_watch <5c>
jiffies <10005550c>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 1699.979599] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <5c>
TDT <f1>
next_to_use <f1>
next_to_clean <5c>
buffer_info[next_to_clean]:
time_stamp <1000550c4>
next_to_watch <5c>
jiffies <100055700>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 1701.979440] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <5c>
TDT <f1>
next_to_use <f1>
next_to_clean <5c>
buffer_info[next_to_clean]:
time_stamp <1000550c4>
next_to_watch <5c>
jiffies <1000558f4>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 1702.986675] e1000e 0000:00:19.0 eth-office: Reset adapter unexpectedly
[ 1706.728573] e1000e: eth-office NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 1810.976512] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <b2>
TDT <b9>
next_to_use <b9>
next_to_clean <b0>
buffer_info[next_to_clean]:
time_stamp <10005c0be>
next_to_watch <b2>
jiffies <10005c366>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 1812.976588] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <b2>
TDT <b9>
next_to_use <b9>
next_to_clean <b0>
buffer_info[next_to_clean]:
time_stamp <10005c0be>
next_to_watch <b2>
jiffies <10005c55a>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 1814.976378] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <b2>
TDT <b9>
next_to_use <b9>
next_to_clean <b0>
buffer_info[next_to_clean]:
time_stamp <10005c0be>
next_to_watch <b2>
jiffies <10005c74e>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 1816.976471] e1000e 0000:00:19.0 eth-office: Detected Hardware Unit Hang:
TDH <b2>
TDT <b9>
next_to_use <b9>
next_to_clean <b0>
buffer_info[next_to_clean]:
time_stamp <10005c0be>
next_to_watch <b2>
jiffies <10005c942>
next_to_watch.status <0>
MAC Status <40080083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 1816.986750] e1000e 0000:00:19.0 eth-office: Reset adapter unexpectedly
[ 1820.769572] e1000e: eth-office NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
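A rough reading of the first two dumps, as a sketch only: the jiffies field advances by a fixed amount between reports that the kernel timestamps put about 2 seconds apart, which gives an implied tick rate, and from that the age of the stuck buffer. This assumes the reports are exactly 2 s apart and that time_stamp records when the buffer was queued, which is my understanding of the driver but not something the log itself confirms.

```shell
#!/bin/sh
# Values copied from the first two hang reports above.
time_stamp=0xfffff592   # buffer_info[next_to_clean] time_stamp (same in both)
jiffies_1=0xfffff709    # report at [ 291.030117]
jiffies_2=0xfffff8fd    # report at [ 293.030124]

# Jiffies elapsed between the two consecutive reports.
per_report=$(( $jiffies_2 - $jiffies_1 ))

# Implied tick rate, ASSUMING the reports are exactly 2 s apart.
hz=$(( per_report / 2 ))

# Age of the stuck buffer at the first report, in milliseconds.
stuck_ms=$(( ($jiffies_1 - $time_stamp) * 1000 / hz ))

echo "jiffies per report: $per_report"
echo "implied HZ: $hz"
echo "buffer age at first report: $stuck_ms ms"
```

The time_stamp value never changes within an episode while jiffies keeps climbing, so the same transmit buffer is stuck across all the reports until the adapter is reset.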
At that point, as a stab in the dark, I ran:
ethtool -K eth-office tso off
and the network has been reliable since, with no further messages
(about 24 hours so far).
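For anyone wanting to try the same workaround, here is a minimal sketch for checking the current TSO state before changing it. The helper name tso_state is made up for illustration, and it assumes the top-level "tcp-segmentation-offload:" line that recent ethtool versions print for `ethtool -k`. A setting made with `ethtool -K` does not survive a reboot, so the comments note one way to reapply it from /etc/network/interfaces.

```shell
#!/bin/sh
# Hypothetical helper: reads `ethtool -k` output on stdin and prints
# the top-level TSO state ("on" or "off").
tso_state() {
    awk '/^tcp[- ]segmentation[- ]offload:/ {
        sub(/^[^:]*:[ \t]*/, "")   # drop the "name: " prefix
        print $1                   # first word, ignoring any "[fixed]" suffix
        exit
    }'
}

# Interface name from the report; pass another name as the first argument.
iface="${1:-eth-office}"

if command -v ethtool >/dev/null 2>&1; then
    state=$(ethtool -k "$iface" 2>/dev/null | tso_state)
    echo "TSO on $iface: ${state:-unknown}"
    # To disable it (needs root; lost on reboot), uncomment:
    # ethtool -K "$iface" tso off
fi

# To reapply at boot with ifupdown, a post-up line in the interface's
# stanza in /etc/network/interfaces should do:
#     post-up /sbin/ethtool -K eth-office tso off
```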
Mike.