
Bug#941450: linux-image-4.19.0-6-amd64: e1000e driver periodically resets card



Package: linux-image-4.19.0-6-amd64
Version: 4.19.67-2+deb10u1

This issue looks like it might be related to #657689, or at least the
symptoms are similar.

Periodically, my network link goes down for ~35 seconds at a time.  I
have not been able to determine what traffic pattern or other set of
circumstances causes this, and it doesn't happen on a regular schedule.
It could happen dozens of times an hour, then not happen for many
hours.

Here is the most recent sample of the error from my kernel log:

> [1816352.315998] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
>                    TDH                  <f>
>                    TDT                  <47>
>                    next_to_use          <47>
>                    next_to_clean        <f>
>                  buffer_info[next_to_clean]:
>                    time_stamp           <11b0fc627>
>                    next_to_watch        <10>
>                    jiffies              <11b0fc789>
>                    next_to_watch.status <0>
>                  MAC Status             <80083>
>                  PHY Status             <796d>
>                  PHY 1000BASE-T Status  <7800>
>                  PHY Extended Status    <3000>
>                  PCI Status             <10>
> [1816354.293145] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
>                    TDH                  <f>
>                    TDT                  <47>
>                    next_to_use          <47>
>                    next_to_clean        <f>
>                  buffer_info[next_to_clean]:
>                    time_stamp           <11b0fc627>
>                    next_to_watch        <10>
>                    jiffies              <11b0fc978>
>                    next_to_watch.status <0>
>                  MAC Status             <80083>
>                  PHY Status             <796d>
>                  PHY 1000BASE-T Status  <7800>
>                  PHY Extended Status    <3000>
>                  PCI Status             <10>
> [1816356.313154] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
>                    TDH                  <f>
>                    TDT                  <47>
>                    next_to_use          <47>
>                    next_to_clean        <f>
>                  buffer_info[next_to_clean]:
>                    time_stamp           <11b0fc627>
>                    next_to_watch        <10>
>                    jiffies              <11b0fcb71>
>                    next_to_watch.status <0>
>                  MAC Status             <80083>
>                  PHY Status             <796d>
>                  PHY 1000BASE-T Status  <7800>
>                  PHY Extended Status    <3000>
>                  PCI Status             <10>
> [1816358.293101] e1000e 0000:00:1f.6 enp0s31f6: Detected Hardware Unit Hang:
>                    TDH                  <f>
>                    TDT                  <47>
>                    next_to_use          <47>
>                    next_to_clean        <f>
>                  buffer_info[next_to_clean]:
>                    time_stamp           <11b0fc627>
>                    next_to_watch        <10>
>                    jiffies              <11b0fcd60>
>                    next_to_watch.status <0>
>                  MAC Status             <80083>
>                  PHY Status             <796d>
>                  PHY 1000BASE-T Status  <7800>
>                  PHY Extended Status    <3000>
>                  PCI Status             <10>
> [1816358.420838] e1000e 0000:00:1f.6 enp0s31f6: Reset adapter unexpectedly
> [1816358.421364] br0: port 1(enp0s31f6) entered disabled state
> [1816362.315470] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> [1816362.315560] br0: port 1(enp0s31f6) entered blocking state
> [1816362.315565] br0: port 1(enp0s31f6) entered listening state
> [1816377.364603] br0: port 1(enp0s31f6) entered learning state
> [1816392.468385] br0: port 1(enp0s31f6) entered forwarding state
> [1816392.468388] br0: topology change detected, propagating
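
For anyone reading the hang reports above, here is a rough sketch of how the numbers relate. All four reports carry the same time_stamp while jiffies keeps advancing, so the same transmit buffer stays pending the whole time. The values are copied from the log; HZ = 250 is my assumption about this kernel's tick rate (it is consistent with consecutive reports being about 2 seconds apart in the log and 495 or 505 jiffies apart), and the function names are just illustrative.

```python
# Values copied from the four "Detected Hardware Unit Hang" reports above.
HZ = 250  # assumption: timer ticks per second (CONFIG_HZ) for this kernel build

# buffer_info[next_to_clean].time_stamp, identical in all four reports
TIME_STAMP = 0x11b0fc627

REPORTS = [
    # (kernel log time in seconds, jiffies value printed in the report)
    (1816352.315998, 0x11b0fc789),
    (1816354.293145, 0x11b0fc978),
    (1816356.313154, 0x11b0fcb71),
    (1816358.293101, 0x11b0fcd60),
]

TDH, TDT = 0x0f, 0x47  # transmit descriptor head and tail from the reports


def stall_seconds(jiffies, time_stamp=TIME_STAMP, hz=HZ):
    """Seconds between the buffer's time stamp and the report, at hz ticks/s."""
    return (jiffies - time_stamp) / hz


def pending_descriptors(head, tail):
    """Descriptors between head and tail, assuming the ring has not wrapped."""
    return tail - head


if __name__ == "__main__":
    for log_time, jiffies in REPORTS:
        age = stall_seconds(jiffies)
        print(f"[{log_time:.6f}] oldest pending buffer is {age:.3f} s old")
    print(f"descriptors between TDH and TDT: {pending_descriptors(TDH, TDT)}")
```

Under that HZ assumption the pending buffer's age grows from roughly 1.4 to 7.4 seconds across the four reports, while TDH and TDT never move.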

I did not really notice this issue until I placed the interface into a
bridge (which it shares with qemu-kvm virtual interfaces to bridge them
to the physical network).  Placing it into the bridge adds 30 seconds of
bridge listening/learning time (about 15 seconds in each state in the
log above), which exacerbates the issue and makes it much easier to
notice.

To clarify, this driver issue causes the physical interface to be down
for ~5 seconds only.  However, during that time the physical link
appears down to the bridge, which then adds another 30 seconds of
downtime after the physical link is restored.
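
The breakdown described above can be checked against the timestamps in the log excerpt. This is only a sketch: the log lines are copied from the excerpt, but the helper names are made up for illustration, and the outage is measured from the "Reset adapter unexpectedly" line, so it does not count the ~6 seconds of hang reports before the reset.

```python
import re

# State-change lines copied from the kernel log excerpt above.
LOG = """\
[1816358.420838] e1000e 0000:00:1f.6 enp0s31f6: Reset adapter unexpectedly
[1816358.421364] br0: port 1(enp0s31f6) entered disabled state
[1816362.315470] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[1816362.315560] br0: port 1(enp0s31f6) entered blocking state
[1816362.315565] br0: port 1(enp0s31f6) entered listening state
[1816377.364603] br0: port 1(enp0s31f6) entered learning state
[1816392.468385] br0: port 1(enp0s31f6) entered forwarding state
"""

LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse(log):
    """Return (timestamp_seconds, message) pairs, one per kernel log line."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            events.append((float(match.group(1)), match.group(2)))
    return events


def first_time(events, needle):
    """Timestamp of the first message that contains needle."""
    for timestamp, message in events:
        if needle in message:
            return timestamp
    raise ValueError(f"no log line contains {needle!r}")


def outage_breakdown(log=LOG):
    """Split the outage into the driver reset and the two bridge states."""
    events = parse(log)
    reset = first_time(events, "Reset adapter unexpectedly")
    link_up = first_time(events, "NIC Link is Up")
    listening = first_time(events, "entered listening state")
    learning = first_time(events, "entered learning state")
    forwarding = first_time(events, "entered forwarding state")
    return {
        "link_down": link_up - reset,        # driver reset until link is up
        "listening": learning - listening,   # bridge listening state
        "learning": forwarding - learning,   # bridge learning state
        "total": forwarding - reset,         # reset until traffic is forwarded
    }


if __name__ == "__main__":
    for phase, seconds in outage_breakdown().items():
        print(f"{phase:>10}: {seconds:6.2f} s")
```

For this sample, the link itself is down for about 4 seconds after the reset, the bridge then spends about 15 seconds in each of the listening and learning states, and the total from reset to forwarding is about 34 seconds.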

Additional information:

> $ uname -a
> Linux chowie-desktop 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2 (2019-08-28) x86_64 GNU/Linux

> $ dpkg -l linux-image-\* | grep ^i
> ii  linux-image-4.19.0-5-amd64          4.19.37-5+deb10u2 amd64        Linux 4.19 for 64-bit PCs (signed)
> ii  linux-image-4.19.0-6-amd64          4.19.67-2+deb10u1 amd64        Linux 4.19 for 64-bit PCs (signed)
> ii  linux-image-amd64                   4.19+105+deb10u1  amd64        Linux for 64-bit PCs (meta-package)

lspci -vv output for the card:

> 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V (rev 31)
> 	Subsystem: Gigabyte Technology Co., Ltd Ethernet Connection (2) I219-V
> 	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Latency: 0
> 	Interrupt: pin A routed to IRQ 122
> 	Region 0: Memory at df200000 (32-bit, non-prefetchable) [size=128K]
> 	Capabilities: [c8] Power Management version 3
> 		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> 		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
> 	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> 		Address: 00000000fee08004  Data: 4023
> 	Capabilities: [e0] PCI Advanced Features
> 		AFCap: TP+ FLR+
> 		AFCtrl: FLR-
> 		AFStatus: TP-
> 	Kernel driver in use: e1000e
> 	Kernel modules: e1000e

-- 
Chris Howie
http://www.chrishowie.com
http://en.wikipedia.org/wiki/User:Crazycomputers

If you correspond with me on a regular basis, please read this document:
http://www.chrishowie.com/email-preferences/

PGP fingerprint: 2B7A B280 8B12 21CC 260A DF65 6FCE 505A CF83 38F5
