Bug#939697: Kernel regression with b44 and kernels 5.x?
After doing some research online, I was able to find people experiencing
the same issues across other distros with the very same NIC on 5.x kernels:
https://bugzilla.redhat.com/show_bug.cgi?id=1709671 (Fedora)
https://bbs.archlinux.org/viewtopic.php?pid=1844324#p1844324 (Arch)
Playing with the swiotlb kernel parameter does not help: with values of
256, 512, and 4096 I get a null pointer dereference, no wired networking
at all, and a system that can only be rebooted through the Magic SysRq
key sequence. With swiotlb=force I get "can't perform DMA" errors
instead.
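
For anyone trying to reproduce this, it helps to confirm which swiotlb
value the running kernel was actually booted with. Below is a minimal
Python sketch that pulls the swiotlb= value out of a kernel command
line string. The function names are hypothetical, and it assumes the
command line is readable from /proc/cmdline, as on a typical Linux
system.

```python
from typing import Optional


def swiotlb_setting(cmdline: str) -> Optional[str]:
    """Return the value of the last swiotlb= parameter, or None if absent."""
    value = None
    for token in cmdline.split():
        # Later occurrences override earlier ones, so keep scanning.
        if token.startswith("swiotlb="):
            value = token[len("swiotlb="):]
    return value


def current_swiotlb_setting(path: str = "/proc/cmdline") -> Optional[str]:
    """Read the running kernel's command line (assumed path) and parse it."""
    with open(path) as f:
        return swiotlb_setting(f.read())
```

Calling current_swiotlb_setting() on a machine booted with one of the
values above should return that value as a string, and None when the
parameter was not passed at all.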
Apparently this is the offending kernel commit:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/kernel/dma/direct.c?h=v5.2.13&id=55897af63091ebc2c3f239c6a6666f748113ac50
...which was merged sometime after the 4.19 kernels and affects the
entire 5.x line. That commit introduced swiotlb handling into direct DMA
transactions (apparently as a simplification), and judging by the stack
dumps on the affected systems, reverting it should be enough to fix this
issue. As I've said, kernels prior to it (which include the 4.19 series
used on Stable) are unaffected.
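
To make the affected range explicit, here is a rough Python sketch that
classifies a kernel release string according to what is reported here:
5.x is treated as affected, 4.19 and earlier as unaffected, and anything
else as unknown, since I don't know exactly which release first carried
the commit. The function name is hypothetical and the boundaries are my
reading of this report, not something verified against the kernel
history.

```python
import re


def b44_regression_status(release: str) -> str:
    """Classify a kernel release string as 'affected', 'unaffected' or 'unknown'."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if not m:
        return "unknown"
    major, minor = int(m.group(1)), int(m.group(2))
    if major == 5:
        # The report says the entire 5.x line is impacted.
        return "affected"
    if (major, minor) <= (4, 19):
        # 4.19 and earlier predate the suspected commit.
        return "unaffected"
    # Releases between 4.19 and 5.0, or newer than 5.x, are not covered here.
    return "unknown"
```

On a live system the release string can be obtained with
platform.release() from the Python standard library and passed straight
to this function.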
Some users might blame the following commit on b44:
https://github.com/torvalds/linux/commit/0f0ed8282e5bfdc87cdd562e58f3d90d893e7ee5#diff-5ab1294594ceb973d7ba266e32b767ea
...but it seems irrelevant IMO, despite being the only actual change to
the b44 driver source merged this year. But then, I'm no kernel hacker.
Surprisingly, b44 and swiotlb do have some history of not playing
nicely together: https://groups.google.com/forum/#!topic/linux.kernel/GEx80ZCue1o
These wired Broadcom NICs are present on most 2005~2008-era Dell laptops
(most notably the Inspiron 6400/E1505/640m series, which are Core/Core 2
builds still widely used over a decade after their introduction),
some Dell Optiplex desktops, and some HP computers (IIRC the Compaq
nx6110 laptop also has a BCM4401 wired NIC). This bug not only breaks
wired networking, it can also completely cripple networking support on
said systems, causing NetworkManager to fail or hang and blocking you
from using any alternative NIC (be it wired or wireless, no matter the
chipset) on said computers.