Package: linux-image-4.9.0-6-amd64
Version: 4.9.82-1+deb9u3
Hi!
Here's a short problem description.
We have several Supermicro servers, all with the same configuration (hardware, kernels, packages, etc.). About a month ago, or maybe a bit later, all of these machines began crashing with kernel panics. I can't find any pattern in the failures, but they happen very often. Some machines panic a couple of times a day, though a machine usually crashes about once every 3 to 6 days. All of these machines handle intensive network and I/O workloads.
I saved the dmesg log from one of these machines after a crash (see the attachment).
As far as I can see, every machine probably has a problem with mlx4_en or GRO. I also see a list_add double add followed by list_del corruption. Is there anything I can do to get more detailed logs? What additional information do you need to diagnose the problem?