
Bug#1060706: linux-image-6.1.0-17-amd64: intel i225 NIC loses PCIe link, network becomes unusable



On Friday, 9 February 2024 13:39:23 CET Arno Lehmann wrote:
> [Fr Feb  9 13:25:08 2024] CPU: 20 PID: 84300 Comm: kworker/20:0 Not
> tainted 6.5.0-0.deb12.4-amd64 #1  Debian 6.5.10-1~bpo12+1
> [Fr Feb  9 13:25:08 2024] Hardware name: ASUS System Product Name/ROG
> STRIX X670E-A GAMING WIFI, BIOS 1904 01/29/2024

I see you have (now) an up-to-date BIOS. Good.

> [Fr Feb  9 13:25:08 2024] Workqueue: events igc_watchdog_task [igc]
> [Fr Feb  9 13:25:08 2024] RIP: 0010:igc_rd32+0x8d/0xa0 [igc]
> [Fr Feb  9 13:25:08 2024] Code: 48 c7 c6 10 36 3a c0 e8 81 aa dd e6 48
> 8b bb 28 ff ff ff e8 05 12 b4 e6 84 c0 74 bc 89 ee 48 c7 c7 38 36 3a c0
> e8 c3 2e 53 e6 <0f> 0b eb aa b8 ff ff ff ff e9 15 0f 04 e7 0f 1f 44 00
> 00 90 90 90
> [Fr Feb  9 13:25:08 2024] RSP: 0018:ffffb034cc61bdd8 EFLAGS: 00010282
> [Fr Feb  9 13:25:08 2024] RAX: 0000000000000000 RBX: ffff97078f882cb8
> RCX: 0000000000000027
> [Fr Feb  9 13:25:08 2024] RDX: ffff97169e7213c8 RSI: 0000000000000001
> RDI: ffff97169e7213c0
> [Fr Feb  9 13:25:08 2024] RBP: 000000000000c030 R08: 0000000000000000
> R09: ffffb034cc61bc68
> [Fr Feb  9 13:25:08 2024] R10: 0000000000000003 R11: ffff9716dde3ac28
> R12: ffff97078f882000
> [Fr Feb  9 13:25:08 2024] R13: 0000000000000000 R14: ffff970784592d40
> R15: 000000000000c030
> [Fr Feb  9 13:25:08 2024] FS:  0000000000000000(0000)
> GS:ffff97169e700000(0000) knlGS:0000000000000000
> [Fr Feb  9 13:25:08 2024] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [Fr Feb  9 13:25:08 2024] CR2: 00007f5271155f80 CR3: 0000000434bc6000
> CR4: 0000000000750ee0
> [Fr Feb  9 13:25:08 2024] PKRU: 55555554
> [Fr Feb  9 13:25:08 2024] Call Trace:
> [Fr Feb  9 13:25:08 2024]  <TASK>
> [Fr Feb  9 13:25:08 2024]  ? igc_rd32+0x8d/0xa0 [igc]
> [Fr Feb  9 13:25:08 2024]  ? __warn+0x81/0x130
> [Fr Feb  9 13:25:08 2024]  ? igc_rd32+0x8d/0xa0 [igc]
> [Fr Feb  9 13:25:08 2024]  ? report_bug+0x171/0x1a0
> [Fr Feb  9 13:25:08 2024]  ? srso_alias_return_thunk+0x5/0x7f
> [Fr Feb  9 13:25:08 2024]  ? prb_read_valid+0x1b/0x30
> [Fr Feb  9 13:25:08 2024]  ? handle_bug+0x41/0x70
> [Fr Feb  9 13:25:08 2024]  ? exc_invalid_op+0x17/0x70
> [Fr Feb  9 13:25:08 2024]  ? asm_exc_invalid_op+0x1a/0x20
> [Fr Feb  9 13:25:08 2024]  ? igc_rd32+0x8d/0xa0 [igc]
> [Fr Feb  9 13:25:08 2024]  ? igc_rd32+0x8d/0xa0 [igc]
> [Fr Feb  9 13:25:08 2024]  igc_update_stats+0x8a/0x6d0 [igc]
> [Fr Feb  9 13:25:08 2024]  igc_watchdog_task+0x9d/0x4a0 [igc]
> [Fr Feb  9 13:25:08 2024]  process_one_work+0x1df/0x3e0
> [Fr Feb  9 13:25:08 2024]  worker_thread+0x51/0x390
> [Fr Feb  9 13:25:08 2024]  ? __pfx_worker_thread+0x10/0x10
> [Fr Feb  9 13:25:08 2024]  kthread+0xe5/0x120
> [Fr Feb  9 13:25:08 2024]  ? __pfx_kthread+0x10/0x10
> [Fr Feb  9 13:25:08 2024]  ret_from_fork+0x31/0x50
> [Fr Feb  9 13:25:08 2024]  ? __pfx_kthread+0x10/0x10
> [Fr Feb  9 13:25:08 2024]  ret_from_fork_asm+0x1b/0x30
> [Fr Feb  9 13:25:08 2024]  </TASK>
> [Fr Feb  9 13:25:08 2024] ---[ end trace 0000000000000000 ]---
>
> Can anybody suggest what information I can provide to tackle this?

I think it's best to take this issue upstream.

$ scripts/get_maintainer.pl drivers/net/ethernet/intel/igc/ returned this:
Jesse Brandeburg <jesse.brandeburg@intel.com> (supporter:INTEL ETHERNET DRIVERS)
Tony Nguyen <anthony.l.nguyen@intel.com> (supporter:INTEL ETHERNET DRIVERS)
"David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING DRIVERS)
Eric Dumazet <edumazet@google.com> (maintainer:NETWORKING DRIVERS)
Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING DRIVERS)
Paolo Abeni <pabeni@redhat.com> (maintainer:NETWORKING DRIVERS)
intel-wired-lan@lists.osuosl.org (moderated list:INTEL ETHERNET DRIVERS)
netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
linux-kernel@vger.kernel.org (open list)

To do that, I'd send an email to netdev@vger.kernel.org, as that is the open
mailing list for networking drivers. You can add others from the list above too.
In that email I recommend including the following info:
- Description of the problems: I'd focus on the NIC stuff, but do also mention
  the issue you encountered with NVMe.
- A list or table of the kernel versions in which you saw the problem.
  Try to give the upstream version, as the Debian version (6.1.0-17) is
  often not (that) useful for the upstream maintainers. `uname -a` will show
  both. Via https://tracker.debian.org/pkg/linux I found that 6.1.0-17 is
  upstream version 6.1.69, as the 6.1.69-1 upload had "Bump ABI to 17" at the
  end of the changelog.
  IIUC this is not a regression; mention that too.
- A/The stacktrace(s) you got. This usually allows the upstream maintainers
  to pinpoint where the problem lies.
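
To illustrate the version mapping mentioned above, here is a minimal Python
sketch that pulls the upstream version out of a `uname -v` style string. The
function name `upstream_version` and the regular expression are my own
assumptions, based only on the format of the string in the quoted log
("Debian 6.5.10-1~bpo12+1"); this is not an official Debian tool, and other
string formats may need a different pattern.

```python
import re
from typing import Optional


def upstream_version(uname_v: str) -> Optional[str]:
    """Return the upstream kernel version found in a Debian `uname -v` string.

    Assumption: the string contains 'Debian <upstream>-<debian revision>',
    as in the log quoted in this report. Returns None if nothing matches.
    """
    # Capture dotted digits after 'Debian ' up to the dash that starts
    # the Debian revision (e.g. '-1~bpo12+1').
    match = re.search(r"Debian (\d+(?:\.\d+)+)-", uname_v)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Version string taken from the quoted kernel log.
    print(upstream_version("#1  Debian 6.5.10-1~bpo12+1"))
```

For the string from the quoted log this yields 6.5.10, which is the version
the upstream maintainers will want to see.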

HTH
