
Bug#945912: marked as done (Kernel 5.3 e100e Detected Hardware Unit Hang)



Your message dated Sun, 27 Nov 2022 17:13:10 +0100 (CET)
with message-id <20221127161310.D8F8EBE2DE0@eldamar.lan>
and subject line Closing this bug (BTS maintenance for src:linux bugs)
has caused the Debian Bug report #945912,
regarding Kernel 5.3 e100e Detected Hardware Unit Hang
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
945912: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=945912
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---

Package: linux-image-5.3.0-0.bpo.2-amd64
Severity: important

Dear Maintainer,

   * What led up to the situation?

I installed kernels 5.2 and 5.3 on two physical hosts in a KVM virtual cluster, each host with two bonded Ethernet ports. After an unpredictable interval, anywhere from right after boot to several hours later, the e1000e driver reports "Detected Hardware Unit Hang" on eth1, followed by a NETDEV WATCHDOG transmit timeout warning and an adapter reset that disables eth1 in the bond. This has happened on both hosts.


   * What exactly did you do (or not do) that was effective (or
     ineffective)?

The problem does not occur with kernel 4.19, so I reverted to kernel 4.19.


   * What was the outcome of this action?

With kernel 4.19 the e1000e hang no longer occurs.


   * What outcome did you expect instead?

I expected networking to work in kernels 5.2 and 5.3 as reliably as it does in 4.19.

 

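The hang occurs on eth1, which uses the e1000e driver. According to the lspci output and the boot journal below, eth1 is the Intel I217-LM at 00:19.0. The Intel I210 at 02:00.0 is eth0, uses the igb driver, and is not the interface reporting the hang.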


# lspci | grep I2

00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-LM (rev 05)
02:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
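
To double-check which adapter sits behind eth1, the lspci output above can be joined on the PCI address with the driver probe lines from the boot journal. The Python below is only a sketch: the function names and the regular expression are my own, and it assumes the two probe-line formats that appear in this report (igb puts a colon after the PCI address, e1000e does not).

```python
import re

# lspci output as quoted in this report.
LSPCI = """\
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-LM (rev 05)
02:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
"""

# Two probe lines copied from the boot journal in this report.
JOURNAL = """\
Nov 30 04:47:57 vhost002 kernel: igb 0000:02:00.0: eth0: (PCIe:2.5Gb/s:Width x1) d0:50:99:c0:38:b6
Nov 30 04:47:57 vhost002 kernel: e1000e 0000:00:19.0 eth1: (PCI Express:2.5GT/s:Width x1) d0:50:99:c0:38:b7
"""

# Captures: driver name, PCI address without the 0000: domain, interface name.
PROBE_RE = re.compile(
    r"kernel: (\w+) 0000:([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]):? (eth\d+):"
)


def pci_devices(lspci_text):
    """Map PCI address -> device description from lspci output."""
    devices = {}
    for line in lspci_text.splitlines():
        if not line.strip():
            continue
        addr, _, rest = line.partition(" ")
        devices[addr] = rest.split(": ", 1)[1]
    return devices


def interface_map(journal_text, lspci_text):
    """Map interface name -> (driver, PCI address, device description)."""
    devices = pci_devices(lspci_text)
    return {
        iface: (driver, addr, devices.get(addr, "unknown"))
        for driver, addr, iface in PROBE_RE.findall(journal_text)
    }


if __name__ == "__main__":
    for iface, info in sorted(interface_map(JOURNAL, LSPCI).items()):
        print(iface, *info, sep=" | ")
```

With the lines from this report, eth0 resolves to igb on the I210 and eth1 resolves to e1000e on the I217-LM.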

 

# Lines from the journal during boot

 

Nov 30 04:47:57 vhost002 kernel: e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
Nov 30 04:47:57 vhost002 kernel: e1000e: Copyright(c) 1999 - 2015 Intel Corporation.

Nov 30 04:47:57 vhost002 kernel: e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode

Nov 30 04:47:57 vhost002 kernel: igb: Intel(R) Gigabit Ethernet Network Driver - version 5.6.0-k
Nov 30 04:47:57 vhost002 kernel: igb: Copyright (c) 2007-2014 Intel Corporation.

Nov 30 04:47:57 vhost002 kernel: igb 0000:02:00.0: PHY reset is blocked due to SOL/IDER session.

Nov 30 04:47:57 vhost002 kernel: igb 0000:02:00.0: added PHC on eth0
Nov 30 04:47:57 vhost002 kernel: igb 0000:02:00.0: Intel(R) Gigabit Ethernet Network Connection
Nov 30 04:47:57 vhost002 kernel: igb 0000:02:00.0: eth0: (PCIe:2.5Gb/s:Width x1) d0:50:99:c0:38:b6
Nov 30 04:47:57 vhost002 kernel: igb 0000:02:00.0: eth0: PBA No: 001300-000
Nov 30 04:47:57 vhost002 kernel: igb 0000:02:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)

Nov 30 04:47:57 vhost002 kernel: e1000e 0000:00:19.0 0000:00:19.0 (uninitialized): registered PHC clock

Nov 30 04:47:57 vhost002 kernel: e1000e 0000:00:19.0 eth1: (PCI Express:2.5GT/s:Width x1) d0:50:99:c0:38:b7
Nov 30 04:47:57 vhost002 kernel: e1000e 0000:00:19.0 eth1: Intel(R) PRO/1000 Network Connection
Nov 30 04:47:57 vhost002 kernel: e1000e 0000:00:19.0 eth1: MAC: 11, PHY: 12, PBA No: FFFFFF-0FF

Nov 30 04:47:58 vhost002 kernel: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Nov 30 04:47:58 vhost002 kernel: bonding: bond0 is being created...
Nov 30 04:47:58 vhost002 systemd-udevd[387]: Could not generate persistent MAC address for bond0: No such file or directory

Nov 30 04:47:59 vhost002 kernel: igb 0000:02:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Nov 30 04:47:59 vhost002 kernel: bond0: (slave eth0): Enslaving as a backup interface with an up link

Nov 30 04:47:59 vhost002 kernel: bond0: (slave eth1): Enslaving as a backup interface with a down link
Nov 30 04:47:59 vhost002 kernel: bond0: Warning: No 802.3ad response from the link partner for any adapters in the bond
Nov 30 04:47:59 vhost002 kernel: bond0: (slave eth0): link status definitely up, 1000 Mbps full duplex
Nov 30 04:47:59 vhost002 kernel: bond0: active interface up!

Nov 30 04:47:59 vhost002 systemd-udevd[445]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Nov 30 04:47:59 vhost002 systemd-udevd[445]: Could not generate persistent MAC address for kvmbr0: No such file or directory
Nov 30 04:47:59 vhost002 ifup[781]: Waiting for a max of 0 seconds for bond0 to become available.
Nov 30 04:47:59 vhost002 kernel: bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
Nov 30 04:47:59 vhost002 kernel: kvmbr0: port 1(bond0) entered blocking state
Nov 30 04:47:59 vhost002 kernel: kvmbr0: port 1(bond0) entered disabled state
Nov 30 04:47:59 vhost002 kernel: device bond0 entered promiscuous mode
Nov 30 04:47:59 vhost002 kernel: device eth0 entered promiscuous mode
Nov 30 04:47:59 vhost002 kernel: device eth1 entered promiscuous mode
Nov 30 04:47:59 vhost002 kernel: kvmbr0: port 1(bond0) entered blocking state
Nov 30 04:47:59 vhost002 kernel: kvmbr0: port 1(bond0) entered forwarding state
Nov 30 04:47:59 vhost002 ifup[781]: Waiting for kvmbr0 to get ready (MAXWAIT is 2 seconds).
Nov 30 04:47:59 vhost002 avahi-daemon[711]: Joining mDNS multicast group on interface kvmbr0.IPv4 with address 192.168.0.237.
Nov 30 04:47:59 vhost002 avahi-daemon[711]: New relevant interface kvmbr0.IPv4 for mDNS.
Nov 30 04:47:59 vhost002 avahi-daemon[711]: Registering new address record for 192.168.0.237 on kvmbr0.IPv4.
Nov 30 04:47:59 vhost002 avahi-daemon[711]: Registering new address record for 192.168.0.249 on kvmbr0.IPv4.
Nov 30 04:48:01 vhost002 avahi-daemon[711]: Joining mDNS multicast group on interface kvmbr0.IPv6 with address fe80::d250:99ff:fec0:38b6.
Nov 30 04:48:01 vhost002 avahi-daemon[711]: New relevant interface kvmbr0.IPv6 for mDNS.
Nov 30 04:48:01 vhost002 avahi-daemon[711]: Registering new address record for fe80::d250:99ff:fec0:38b6 on kvmbr0.*.
Nov 30 04:48:02 vhost002 systemd[1]: Started Raise network interfaces.
Nov 30 04:48:02 vhost002 systemd[1]: Reached target Network.

Nov 30 04:48:02 vhost002 systemd[1]: Reached target Network is Online.

 

Nov 30 04:52:03 vhost002 kernel: e1000e 0000:00:19.0 eth1: Detected Hardware Unit Hang:
                                   TDH                  <5f>
                                   TDT                  <74>
                                   next_to_use          <74>
                                   next_to_clean        <5e>
                                 buffer_info[next_to_clean]:
                                   time_stamp           <ffffcc73>
                                   next_to_watch        <5f>
                                   jiffies              <ffffcd80>
                                   next_to_watch.status <0>
                                 MAC Status             <40080083>
                                 PHY Status             <796d>
                                 PHY 1000BASE-T Status  <3800>
                                 PHY Extended Status    <3000>
                                 PCI Status             <10>

Nov 30 04:52:05 vhost002 kernel: e1000e 0000:00:19.0 eth1: Detected Hardware Unit Hang:
                                   TDH                  <5f>
                                   TDT                  <74>
                                   next_to_use          <74>
                                   next_to_clean        <5e>
                                 buffer_info[next_to_clean]:
                                   time_stamp           <ffffcc73>
                                   next_to_watch        <5f>
                                   jiffies              <ffffcf78>
                                   next_to_watch.status <0>
                                 MAC Status             <40080083>
                                 PHY Status             <796d>
                                 PHY 1000BASE-T Status  <3800>
                                 PHY Extended Status    <3000>
                                 PCI Status             <10>

Nov 30 04:52:07 vhost002 kernel: e1000e 0000:00:19.0 eth1: Detected Hardware Unit Hang:
                                   TDH                  <5f>
                                   TDT                  <74>
                                   next_to_use          <74>
                                   next_to_clean        <5e>
                                 buffer_info[next_to_clean]:
                                   time_stamp           <ffffcc73>
                                   next_to_watch        <5f>
                                   jiffies              <ffffd170>
                                   next_to_watch.status <0>
                                 MAC Status             <40080083>
                                 PHY Status             <796d>
                                 PHY 1000BASE-T Status  <3800>
                                 PHY Extended Status    <3000>
                                 PCI Status             <10>
Nov 30 04:52:07 vhost002 kernel: ------------[ cut here ]------------
Nov 30 04:52:07 vhost002 kernel: NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
Nov 30 04:52:07 vhost002 kernel: WARNING: CPU: 5 PID: 0 at net/sched/sch_generic.c:448 dev_watchdog+0x253/0x260
Nov 30 04:52:07 vhost002 kernel: Modules linked in: vhost_net vhost tap tun ocfs2 quota_tree ocfs2_dlmfs ocfs2_stack_o2cb veth ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue sctp configfs nf_tables nfnetlink bridge stp llc bonding fuse intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel btrfs aesni_intel aes_x86_64 crypto_simd zstd_compress xhci_pci iTCO_wdt cryptd glue_helper zstd_decompress xhci_hcd ehci_pci iTCO_vendor_support intel_cstate igb e1000e ehci_hcd watchdog intel_uncore usbcore ptp mei_me lpc_ich intel_rapl_perf pcspkr sg mei dca usb_common i2c_i801 pps_core mfd_core ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad evdev drbd sunrpc lru_cache ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 raid10 raid456 libcrc32c crc32c_generic async_raid6_recov async_memcpy async_pq async_xor xor async_tx sd_mod raid6_pq raid1 raid0 multipath linear md_mod ast drm_vram_helper
Nov 30 04:52:07 vhost002 kernel:  i2c_algo_bit ttm drm_kms_helper ahci libahci libata drm mxm_wmi scsi_mod crc32c_intel wmi button
Nov 30 04:52:07 vhost002 kernel: CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.3.0-0.bpo.2-amd64 #1 Debian 5.3.9-2~bpo10+1
Nov 30 04:52:07 vhost002 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./EPC612D4I, BIOS P2.30 05/09/2017
Nov 30 04:52:07 vhost002 kernel: RIP: 0010:dev_watchdog+0x253/0x260
Nov 30 04:52:07 vhost002 kernel: Code: 48 85 c0 75 e4 eb 9d 4c 89 ef c6 05 c6 76 a9 00 01 e8 d1 1f fb ff 89 d9 4c 89 ee 48 c7 c7 a0 d4 10 bb 48 89 c2 e8 f6 bc a1 ff <0f> 0b e9 7c ff ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 41 57 41 56
Nov 30 04:52:07 vhost002 kernel: RSP: 0018:ffffb43880200e88 EFLAGS: 00010282
Nov 30 04:52:07 vhost002 kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Nov 30 04:52:07 vhost002 kernel: RDX: 0000000000040400 RSI: 00000000000000f6 RDI: 0000000000000300
Nov 30 04:52:07 vhost002 kernel: RBP: ffff9b49a680845c R08: 00000000000004a9 R09: 0000000000000004
Nov 30 04:52:07 vhost002 kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff9b52e7d9a680
Nov 30 04:52:07 vhost002 kernel: R13: ffff9b49a6808000 R14: ffff9b49a6808480 R15: 0000000000000001
Nov 30 04:52:07 vhost002 kernel: FS:  0000000000000000(0000) GS:ffff9b57bf940000(0000) knlGS:0000000000000000
Nov 30 04:52:07 vhost002 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 30 04:52:07 vhost002 kernel: CR2: 00007f07aa0c67f8 CR3: 00000006be40a004 CR4: 00000000001626e0
Nov 30 04:52:07 vhost002 kernel: Call Trace:
Nov 30 04:52:07 vhost002 kernel:  <IRQ>
Nov 30 04:52:07 vhost002 kernel:  ? pfifo_fast_enqueue+0x140/0x140
Nov 30 04:52:07 vhost002 kernel:  call_timer_fn+0x2d/0x130
Nov 30 04:52:07 vhost002 kernel:  run_timer_softirq+0x19e/0x410
Nov 30 04:52:07 vhost002 kernel:  ? tick_sched_timer+0x37/0x70
Nov 30 04:52:07 vhost002 kernel:  ? __hrtimer_run_queues+0x110/0x280
Nov 30 04:52:07 vhost002 kernel:  ? recalibrate_cpu_khz+0x10/0x10
Nov 30 04:52:07 vhost002 kernel:  ? ktime_get+0x3a/0xa0
Nov 30 04:52:07 vhost002 kernel:  __do_softirq+0xdf/0x2e5
Nov 30 04:52:07 vhost002 kernel:  irq_exit+0xa3/0xb0
Nov 30 04:52:07 vhost002 kernel:  smp_apic_timer_interrupt+0x74/0x130
Nov 30 04:52:07 vhost002 kernel:  apic_timer_interrupt+0xf/0x20
Nov 30 04:52:07 vhost002 kernel:  </IRQ>
Nov 30 04:52:07 vhost002 kernel: RIP: 0010:cpuidle_enter_state+0xbc/0x450
Nov 30 04:52:07 vhost002 kernel: Code: e8 49 66 ae ff 80 7c 24 13 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 67 03 00 00 31 ff e8 8b 87 b4 ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 89 d1 01 00 00 c7 45 10 00 00 00 00 48 83 c4 18 44 89
Nov 30 04:52:07 vhost002 kernel: RSP: 0018:ffffb438800efe78 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
Nov 30 04:52:07 vhost002 kernel: RAX: ffff9b57bf96a580 RBX: ffffffffbb2b92c0 RCX: 000000000000001f
Nov 30 04:52:07 vhost002 kernel: RDX: 0000003afb87d7a0 RSI: 0000000024924b32 RDI: 0000000000000000
Nov 30 04:52:07 vhost002 kernel: RBP: ffffd4387fb78c00 R08: 0000000000000002 R09: 0000000000029e00
Nov 30 04:52:07 vhost002 kernel: R10: 0002d707d159f557 R11: ffff9b57bf9694e4 R12: 0000000000000004
Nov 30 04:52:07 vhost002 kernel: R13: ffffffffbb2b9458 R14: 0000000000000004 R15: 0000000000000000
Nov 30 04:52:07 vhost002 kernel:  ? cpuidle_enter_state+0x97/0x450
Nov 30 04:52:07 vhost002 kernel:  cpuidle_enter+0x29/0x40
Nov 30 04:52:07 vhost002 kernel:  do_idle+0x228/0x270
Nov 30 04:52:07 vhost002 kernel:  cpu_startup_entry+0x19/0x20
Nov 30 04:52:07 vhost002 kernel:  start_secondary+0x160/0x1b0
Nov 30 04:52:07 vhost002 kernel:  secondary_startup_64+0xa4/0xb0
Nov 30 04:52:07 vhost002 kernel: ---[ end trace e68756647ea68b2b ]---
Nov 30 04:52:07 vhost002 kernel: e1000e 0000:00:19.0 eth1: Reset adapter unexpectedly
Nov 30 04:52:07 vhost002 kernel: bond0: (slave eth1): speed changed to 0 on port 2
Nov 30 04:52:07 vhost002 kernel: bond0: (slave eth1): link status definitely down, disabling slave
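
The "Detected Hardware Unit Hang" dumps above are easier to compare once the hex fields are turned into numbers. The following Python is a rough sketch, not a definitive decoder. The field values are copied from the first dump. The ring size of 256 and HZ of 250 are assumptions about this driver and kernel build and should be checked against `ethtool -g eth1` and the kernel config. The status bit masks reflect my reading of the e1000e driver headers and only cover the link bits.

```python
import re

# Fields from the first hang dump in this report (04:52:03).
DUMP = """\
TDH                  <5f>
TDT                  <74>
next_to_use          <74>
next_to_clean        <5e>
time_stamp           <ffffcc73>
next_to_watch        <5f>
jiffies              <ffffcd80>
MAC Status             <40080083>
"""

RING_SIZE = 256  # assumption: default e1000e Tx ring size
HZ = 250         # assumption: CONFIG_HZ of this kernel build

# Link-related bits of the MAC status register (assumed from e1000e headers).
STATUS_FD = 0x01
STATUS_LU = 0x02
STATUS_SPEED_MASK = 0xC0
SPEEDS = {0x00: 10, 0x40: 100, 0x80: 1000}

FIELD_RE = re.compile(r"^\s*(.+?)\s+<([0-9a-f]+)>\s*$", re.MULTILINE)


def parse_dump(text):
    """Return {field name: integer value} for every 'name <hex>' line."""
    return {name: int(value, 16) for name, value in FIELD_RE.findall(text)}


def summarize(d):
    """Derive queue depth, stall time and link state from one dump."""
    # Descriptors handed to the NIC (tail) but not yet fetched (head).
    pending = (d["TDT"] - d["TDH"]) % RING_SIZE
    # Age of the oldest unfinished buffer, in 32-bit wrapping jiffies.
    stalled = (d["jiffies"] - d["time_stamp"]) & 0xFFFFFFFF
    status = d["MAC Status"]
    return {
        "pending_descriptors": pending,
        "stalled_jiffies": stalled,
        "stalled_seconds": stalled / HZ,
        "link_up": bool(status & STATUS_LU),
        "full_duplex": bool(status & STATUS_FD),
        "speed_mbps": SPEEDS.get(status & STATUS_SPEED_MASK),
    }


if __name__ == "__main__":
    for key, value in summarize(parse_dump(DUMP)).items():
        print(f"{key}: {value}")
```

Under those assumptions the first dump shows 21 descriptors queued between TDH and TDT and a buffer that has been waiting 269 jiffies, while the MAC still reports link up at 1000 Mb/s full duplex. TDH, TDT and time_stamp are identical in all three dumps and only jiffies advances, so the transmit ring made no progress between them.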

-- System Information:
Debian Release: 10.2
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: linux-image-5.3.0-0.bpo.2-amd64 (SMP w/12 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

 


--- End Message ---
--- Begin Message ---
Hi

This bug was filed for a very old kernel or the bug is old itself
without resolution.

If you can reproduce it with

- the current version in unstable/testing
- the latest kernel from backports

please reopen the bug, see https://www.debian.org/Bugs/server-control
for details.

Regards,
Salvatore

--- End Message ---
