
Bug#777683: Network hang and data corruption with e1000e driver



Control: retitle -1 linux-image-3.16: Network hang and data corruption with e1000e driver
Control: found -1 3.16.7-ckt11-1+deb8u2

After upgrading the original system to jessie, the problem still
appears when I boot kernel 3.16 from jessie, but not when I boot
kernel 3.2 from wheezy.
The network hangs sporadically, and files transferred while the bug
was occurring contain large blocks of corrupted data (several kB).
The logs from jessie are attached below.

On Jun 30, Mike Crowe wrote:
> At that point as a stab in the dark I ran:
>
>  ethtool -K eth-office tso off
>
> and the network has been reliable and no such messages have
> appeared since (about 24 hours.)

Thanks for sharing, Mike! I'll try that.

Regards

Uwe


00:19.0 Ethernet controller [0200]: Intel Corporation 82566DM Gigabit Network Connection [8086:104a] (rev 02)
        Subsystem: Hewlett-Packard Company Device [103c:2800]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin B routed to IRQ 41
        Region 0: Memory at f0500000 (32-bit, non-prefetchable) [size=128K]
        Region 1: Memory at f0525000 (32-bit, non-prefetchable) [size=4K]
        Region 2: I/O ports at 2000 [size=32]
        Capabilities: [c8] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee0300c  Data: 4152
        Kernel driver in use: e1000e


Aug 04 06:10:30 c1 kernel: e1000e 0000:00:19.0 lan: Detected Hardware Unit Hang:
                             TDH                  <31>
                             TDT                  <5b>
                             next_to_use          <5b>
                             next_to_clean        <31>
                           buffer_info[next_to_clean]:
                             time_stamp           <17cb9f3>
                             next_to_watch        <36>
                             jiffies              <17cbcc1>
                             next_to_watch.status <0>
                           MAC Status             <802a3>
                           PHY Status             <792d>
                           PHY 1000BASE-T Status  <3800>
                           PHY Extended Status    <3000>
                           PCI Status             <10>
Aug 04 06:10:32 c1 kernel: e1000e 0000:00:19.0 lan: Detected Hardware Unit Hang:
                             TDH                  <31>
                             TDT                  <5b>
                             next_to_use          <5b>
                             next_to_clean        <31>
                           buffer_info[next_to_clean]:
                             time_stamp           <17cb9f3>
                             next_to_watch        <36>
                             jiffies              <17cbeb5>
                             next_to_watch.status <0>
                           MAC Status             <802a3>
                           PHY Status             <792d>
                           PHY 1000BASE-T Status  <3800>
                           PHY Extended Status    <3000>
                           PCI Status             <10>
Aug 04 06:10:33 c1 kernel: ------------[ cut here ]------------
Aug 04 06:10:33 c1 kernel: WARNING: CPU: 0 PID: 0 at /build/linux-M5bqDz/linux-3.16.7-ckt11/net/sched/sch_generic.c:264 dev_watchdog+0x1e8/0x200()
Aug 04 06:10:33 c1 kernel: NETDEV WATCHDOG: lan (e1000e): transmit queue 0 timed out
Aug 04 06:10:33 c1 kernel: Modules linked in: tcp_diag inet_diag authenc xfrm4_mode_transport xfrm6_mode_tunnel xfrm4_mode_tunnel binfmt_misc xt_recent xt_TCPMSS xt_multiport xt_policy xt_nat xt_conntrack xt_tcpudp ipt_REJECT xt_LOG iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conn
Aug 04 06:10:33 c1 kernel:  hmac crypto_null af_key xfrm_algo ses enclosure usb_storage iTCO_wdt iTCO_vendor_support kvm_intel kvm hp_wmi ppdev sparse_keymap rfkill snd_hda_codec_realtek snd_hda_codec_generic sata_sil tg3 psmouse libphy evdev snd_hda_intel snd_hda_controller snd_hda_codec sg snd_hwdep serio_raw snd_pc
Aug 04 06:10:33 c1 kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-4-686-pae #1 Debian 3.16.7-ckt11-1+deb8u2
Aug 04 06:10:33 c1 kernel: Hardware name: Hewlett-Packard HP Compaq dc7700p Convertible Minitower/0A58h, BIOS 786E1 v01.15 08/05/2008
Aug 04 06:10:33 c1 kernel:  c15adc04 f4809ef4 c14773af f4809f04 c1056c74 c15adbc8 f4809f20 00000000
Aug 04 06:10:33 c1 kernel:  c15adc04 00000108 c13bd268 c13bd268 00000009 f34a8000 fe834532 ffffff25
Aug 04 06:10:33 c1 kernel:  f4809f0c c1056cc3 00000009 f4809f04 c15adbc8 f4809f20 f4809f40 c13bd268
Aug 04 06:10:33 c1 kernel: Call Trace:
Aug 04 06:10:33 c1 kernel:  [<c14773af>] ? dump_stack+0x3e/0x4e
Aug 04 06:10:33 c1 kernel:  [<c1056c74>] ? warn_slowpath_common+0x84/0xa0
Aug 04 06:10:33 c1 kernel:  [<c13bd268>] ? dev_watchdog+0x1e8/0x200
Aug 04 06:10:33 c1 kernel:  [<c13bd268>] ? dev_watchdog+0x1e8/0x200
Aug 04 06:10:33 c1 kernel:  [<c1056cc3>] ? warn_slowpath_fmt+0x33/0x40
Aug 04 06:10:33 c1 kernel:  [<c13bd268>] ? dev_watchdog+0x1e8/0x200
Aug 04 06:10:33 c1 kernel:  [<c1061120>] ? call_timer_fn+0x30/0xf0
Aug 04 06:10:33 c1 kernel:  [<c108b9be>] ? rebalance_domains+0x14e/0x250
Aug 04 06:10:33 c1 kernel:  [<c13bd080>] ? dev_graft_qdisc+0x70/0x70
Aug 04 06:10:33 c1 kernel:  [<c1062476>] ? run_timer_softirq+0x176/0x240
Aug 04 06:10:33 c1 kernel:  [<c13bd080>] ? dev_graft_qdisc+0x70/0x70
Aug 04 06:10:33 c1 kernel:  [<c105b623>] ? __do_softirq+0xc3/0x230
Aug 04 06:10:33 c1 kernel:  [<c147d2e7>] ? nmi_stack_correct+0x2f/0x34
Aug 04 06:10:33 c1 kernel:  [<c105b560>] ? cpu_callback+0x160/0x160
Aug 04 06:10:33 c1 kernel:  [<c105b560>] ? cpu_callback+0x160/0x160
Aug 04 06:10:33 c1 kernel:  [<c1011652>] ? do_softirq_own_stack+0x22/0x30
Aug 04 06:10:33 c1 kernel:  <IRQ>  [<c105b9ad>] ? irq_exit+0x8d/0xa0
Aug 04 06:10:33 c1 kernel:  [<c147d508>] ? smp_apic_timer_interrupt+0x38/0x50
Aug 04 06:10:33 c1 kernel:  [<c147cc3c>] ? apic_timer_interrupt+0x34/0x3c
Aug 04 06:10:33 c1 kernel:  [<c1018142>] ? mwait_idle+0x62/0xa0
Aug 04 06:10:33 c1 kernel:  [<c101893e>] ? arch_cpu_idle+0xe/0x10
Aug 04 06:10:33 c1 kernel:  [<c1090bf3>] ? cpu_startup_entry+0x303/0x3b0
Aug 04 06:10:33 c1 kernel:  [<c166dbf4>] ? start_kernel+0x3d8/0x3dd
Aug 04 06:10:33 c1 kernel:  [<c166d625>] ? set_init_arg+0x45/0x45
Aug 04 06:10:33 c1 kernel: ---[ end trace 05653a3e75849f99 ]---
Aug 04 06:10:33 c1 kernel: e1000e 0000:00:19.0 lan: Reset adapter unexpectedly
Aug 04 06:10:36 c1 kernel: e1000e: lan NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 04 06:10:44 c1 kernel: e1000e 0000:00:19.0 lan: Detected Hardware Unit Hang:
                             TDH                  <4>
                             TDT                  <25>
                             next_to_use          <25>
                             next_to_clean        <4>
                           buffer_info[next_to_clean]:
                             time_stamp           <17cc7d9>
                             next_to_watch        <9>
                             jiffies              <17cca6d>
                             next_to_watch.status <0>
                           MAC Status             <802a3>
                           PHY Status             <792d>
                           PHY 1000BASE-T Status  <3800>
                           PHY Extended Status    <3000>
                           PCI Status             <10>
Aug 04 06:10:46 c1 kernel: e1000e 0000:00:19.0 lan: Detected Hardware Unit Hang:
                             TDH                  <4>
                             TDT                  <25>
                             next_to_use          <25>
                             next_to_clean        <4>
                           buffer_info[next_to_clean]:
                             time_stamp           <17cc7d9>
                             next_to_watch        <9>
                             jiffies              <17ccc61>
                             next_to_watch.status <0>
                           MAC Status             <802a3>
                           PHY Status             <792d>
                           PHY 1000BASE-T Status  <3800>
                           PHY Extended Status    <3000>
                           PCI Status             <10>
Aug 04 06:10:48 c1 kernel: e1000e 0000:00:19.0 lan: Detected Hardware Unit Hang:
                             TDH                  <4>
                             TDT                  <25>
                             next_to_use          <25>
                             next_to_clean        <4>
                           buffer_info[next_to_clean]:
                             time_stamp           <17cc7d9>
                             next_to_watch        <9>
                             jiffies              <17cce55>
                             next_to_watch.status <0>
                           MAC Status             <802a3>
                           PHY Status             <792d>
                           PHY 1000BASE-T Status  <3800>
                           PHY Extended Status    <3000>
                           PCI Status             <10>
Aug 04 06:10:50 c1 kernel: e1000e 0000:00:19.0 lan: Detected Hardware Unit Hang:
                             TDH                  <4>
                             TDT                  <25>
                             next_to_use          <25>
                             next_to_clean        <4>
                           buffer_info[next_to_clean]:
                             time_stamp           <17cc7d9>
                             next_to_watch        <9>
                             jiffies              <17cd049>
                             next_to_watch.status <0>
                           MAC Status             <802a3>
                           PHY Status             <792d>
                           PHY 1000BASE-T Status  <3800>
                           PHY Extended Status    <3000>
                           PCI Status             <10>
Aug 04 06:10:51 c1 kernel: e1000e 0000:00:19.0 lan: Reset adapter unexpectedly
Aug 04 06:10:54 c1 kernel: e1000e: lan NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Aug 04 06:10:58 c1 kernel: e1000e 0000:00:19.0 lan: Detected Hardware Unit Hang:
...
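
For anyone reading the dumps above, here is a minimal sketch of how the hex fields can be turned into numbers: how many TX descriptors sit between TDH and TDT, and how long the oldest one has been waiting. The function names, the ring size of 256 and HZ=250 are my assumptions, not values taken from the driver. The HZ value is only a guess based on the jiffies counter advancing by 500 between two dumps logged about two seconds apart.

```python
import re

# First dump from the log above, copied verbatim (fields are hex).
SAMPLE_DUMP = """\
Aug 04 06:10:30 c1 kernel: e1000e 0000:00:19.0 lan: Detected Hardware Unit Hang:
                             TDH                  <31>
                             TDT                  <5b>
                             next_to_use          <5b>
                             next_to_clean        <31>
                           buffer_info[next_to_clean]:
                             time_stamp           <17cb9f3>
                             next_to_watch        <36>
                             jiffies              <17cbcc1>
                             next_to_watch.status <0>
                           MAC Status             <802a3>
                           PHY Status             <792d>
                           PHY 1000BASE-T Status  <3800>
                           PHY Extended Status    <3000>
                           PCI Status             <10>
"""

# A field line looks like: "<spaces>NAME<spaces><HEX>"
FIELD_RE = re.compile(r"^\s*(.+?)\s+<([0-9a-fA-F]+)>\s*$")


def parse_hang_dump(text):
    """Return {field name: integer value} for every 'NAME <hex>' line."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def summarize(fields, ring_size=256, hz=250):
    """Derive pending descriptors and stall time.

    ring_size and hz are ASSUMPTIONS; adjust them for the actual system.
    """
    # Descriptors queued by the driver (tail) but not yet fetched (head),
    # modulo the ring size in case the tail has wrapped around.
    pending = (fields["TDT"] - fields["TDH"]) % ring_size
    # Age of the oldest unfinished buffer; masked to 32 bits because
    # jiffies is a 32-bit counter on this 686-pae kernel.
    stalled_jiffies = (fields["jiffies"] - fields["time_stamp"]) & 0xFFFFFFFF
    return {
        "pending_descriptors": pending,
        "stalled_jiffies": stalled_jiffies,
        "stalled_seconds": stalled_jiffies / hz,
    }


if __name__ == "__main__":
    for key, value in summarize(parse_hang_dump(SAMPLE_DUMP)).items():
        print(f"{key}: {value}")
```

In all the dumps shown, TDH and next_to_clean stay fixed while only jiffies grows, so the head pointer is not advancing between the reports.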

