--- Begin Message ---
- To: submit@bugs.debian.org
- Subject: Kernel panic caused by aacraid module prevents normal boot
- From: Michael Gordon <m.gordon.zelenoborsky@gmail.com>
- Date: Sat, 6 Jul 2024 19:15:32 +0300
- Message-id: <CAGUOxdeYacXLUGRdrcBtHkgP0ghL76t5_bWj5BQVEXhJ-J_1Wg@mail.gmail.com>
Package: linux-image-amd64
Version: 6.7.12-1
Hardware: Huananzhi F8D-Plus (C612 Intel Chipset), 2*Xeon 2680v4, 128 GB RAM
Raid controller: Adaptec ASR-5805Z
Hello!
I've encountered an issue booting Debian Testing with the latest kernel. The system reports a kernel panic that appears to be caused by the aacraid module, then hangs completely at the fsck step. Information about the installed system is in the attached .txt file. Note that the hardware listed there is different, because the SSD with Debian Testing was removed from the server for investigation.
Using the kernel parameters "pci=nocrs single" I was able to boot into emergency mode and see the error messages related to kernel panic. Photos attached.
Pulling the RAID controller out of the PCI-E slot restores normal boot.
I tried several Linux distributions to test the hypothesis that this is a regression in recent kernels. The results are shown below.
Distribution | Kernel version | Result |
Debian Testing KDE | Linux 6.6.15-amd64 #1 Debian 6.6.15-2 | Kernel panic messages, boot hangs at fsck step. |
Kubuntu 24.04 LTS | Linux KubuntuPortable 6.8.0-36-generic #36-Ubuntu SMP PREEMPT_DYNAMIC Mon Jun 10 10:49:14 UTC 2024 x86_64 GNU/Linux | Kernel panic messages, boot hangs at fsck step. |
Fedora Workstation Live x86_64-40-1.14 | Linux localhost-live 6.8.5-301.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Apr 11 20:00:10 UTC 2024 x86_64 GNU/Linux | Boot hangs. |
Linux Mint 21.1 Xfce 64-bit | Linux mint 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 19:54:14 UTC 2022 x86_64 GNU/Linux | Normal boot, ext4 partition on the RAID5 array is accessible. |
Debian Stable KDE Live | Linux debian 6.1.0-22-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.94-1 (2024-06-21) x86_64 GNU/Linux | Normal boot, ext4 partition on the RAID5 array is accessible. |
Windows Server 2016 | NA | Normal boot. Adaptec Storage Manager reports optimal condition for the RAID5 array. No errors reported for HDDs. Ext4 partition on the array is accessible from Windows. |
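The results above bracket the suspected regression. As a minimal sketch (not part of the original report), the window can be derived by sorting the tested kernels by version. The helper names `upstream_triple` and `regression_window` are made up for illustration. The Ubuntu, Mint and Fedora entries use the version from the release string, which is only an approximation for distribution kernels that carry backports.

```python
import re

# Observations taken from the table above: (distribution, kernel version, booted normally?).
# For Debian Stable the upstream version 6.1.94 is taken from the "Debian 6.1.94-1" field,
# because 6.1.0-22 in the release string is only the ABI name.
OBSERVATIONS = [
    ("Debian Testing KDE", "6.6.15", False),
    ("Kubuntu 24.04 LTS", "6.8.0-36-generic", False),
    ("Fedora Workstation Live 40", "6.8.5-301.fc40.x86_64", False),
    ("Linux Mint 21.1 Xfce", "5.15.0-56-generic", True),
    ("Debian Stable KDE Live", "6.1.94", True),
]


def upstream_triple(version):
    """Return the leading major.minor.patch of a kernel version string as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {version!r}")
    return tuple(int(part) for part in match.groups())


def regression_window(observations):
    """Return (newest version that booted, oldest version that failed)."""
    good = [upstream_triple(ver) for _, ver, ok in observations if ok]
    bad = [upstream_triple(ver) for _, ver, ok in observations if not ok]
    return max(good), min(bad)


newest_good, oldest_bad = regression_window(OBSERVATIONS)
print("newest good:", ".".join(map(str, newest_good)))
print("oldest bad:", ".".join(map(str, oldest_bad)))
```

With the data above this reports 6.1.94 as the newest working kernel and 6.6.15 as the oldest failing one. The result is only meaningful when the newest good version is lower than the oldest bad one, as it is here.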
Expected behavior: normal boot
Actual result: system hangs during boot
As a workaround, I installed kernel 6.1.0 from the Bookworm repository.
Gordon Mikhail, bioinformatician
Genetics of Plant-Microbe Interactions lab | ARRIAM
Guar Physiological genetics lab | VIR
michaelgordon@Lab9-X99 ~> apt list --installed | grep "linux-image" (base)
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
linux-image-6.1.0-22-amd64/stable,now 6.1.94-1 amd64 [installed]
linux-image-6.6.15-amd64/now 6.6.15-2 amd64 [installed,local]
linux-image-6.7.12-amd64/now 6.7.12-1 amd64 [installed,local]
linux-image-6.8.12-amd64/now 6.8.12-1 amd64 [installed,local]
linux-image-amd64/now 6.8.12-1 amd64 [installed,upgradable to: 6.9.7-1]
michaelgordon@Lab9-X99 ~> apt show libc6 | grep ^Version (base)
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
Version: 2.38-13
michaelgordon@Lab9-X99 ~> neofetch (base)
michaelgordon@Lab9-X99
----------------------
OS: Debian GNU/Linux trixie/sid x86_64
Host: MS-7D31 1.0
Kernel: 6.7.12-amd64
Uptime: 42 secs
Packages: 2378 (dpkg)
Shell: fish 3.7.1
Resolution: 3840x2160
DE: Plasma 5.27.10
WM: kwin
Theme: [Plasma], Breeze [GTK2/3]
Icons: [Plasma], breeze [GTK2/3]
Terminal: konsole
Terminal Font: Hack 12
CPU: 12th Gen Intel i5-12600K (16) @ 4.900GHz
GPU: Intel AlderLake-S GT1
Memory: 1909MiB / 64099MiB
michaelgordon@Lab9-X99 ~> uname -a (base)
Linux Lab9-X99 6.7.12-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.7.12-1 (2024-04-24) x86_64 GNU/Linux
michaelgordon@Lab9-X99 ~ [1]> sudo modinfo aacraid (base)
filename: /lib/modules/6.7.12-amd64/kernel/drivers/scsi/aacraid/aacraid.ko.xz
version: 1.2.1[50983]-custom
license: GPL
description: Dell PERC2, 2/Si, 3/Si, 3/Di, Adaptec Advanced Raid Products, HP NetRAID-4M, IBM ServeRAID & ICP SCSI driver
author: Red Hat Inc and Adaptec
srcversion: 09ED6D97B3A3060995BB34B
alias: pci:v00009005d0000028Dsv*sd*bc*sc*i*
alias: pci:v00009005d0000028Csv*sd*bc*sc*i*
alias: pci:v00009005d0000028Bsv*sd*bc*sc*i*
alias: pci:v00009005d00000288sv*sd*bc*sc*i*
alias: pci:v00009005d00000286sv*sd*bc*sc*i*
alias: pci:v00009005d00000285sv*sd*bc*sc*i*
alias: pci:v00009005d00000285sv000017AAsd*bc*sc*i*
alias: pci:v00009005d00000285sv00001028sd*bc*sc*i*
alias: pci:v00001011d00000046sv0000103Csd000010C2bc*sc*i*
alias: pci:v00001011d00000046sv00009005sd00001364bc*sc*i*
alias: pci:v00001011d00000046sv00009005sd00000364bc*sc*i*
alias: pci:v00001011d00000046sv00009005sd00000365bc*sc*i*
alias: pci:v00009005d00000285sv00001028sd00000287bc*sc*i*
alias: pci:v00009005d00000286sv00009005sd000002A2bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd0000029Abc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000299bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000298bc*sc*i*
alias: pci:v00009005d00000286sv00001014sd00009540bc*sc*i*
alias: pci:v00009005d00000286sv00001014sd00009580bc*sc*i*
alias: pci:v00009005d00000285sv00001014sd00000312bc*sc*i*
alias: pci:v00009005d00000285sv00001014sd000002F2bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000297bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000296bc*sc*i*
alias: pci:v00009005d00000285sv0000103Csd00003227bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000294bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000293bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000292bc*sc*i*
alias: pci:v00009005d00000285sv00001028sd00000291bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000290bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd0000028Fbc*sc*i*
alias: pci:v00009005d00000285sv00009005sd0000028Ebc*sc*i*
alias: pci:v00009005d00000286sv00009005sd00000800bc*sc*i*
alias: pci:v00009005d00000200sv00009005sd00000200bc*sc*i*
alias: pci:v00009005d00000287sv00009005sd00000800bc*sc*i*
alias: pci:v00009005d00000286sv00009005sd000002A6bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd000002A5bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd000002A4bc*sc*i*
alias: pci:v00009005d00000286sv00009005sd000002A3bc*sc*i*
alias: pci:v00009005d00000286sv00009005sd000002A1bc*sc*i*
alias: pci:v00009005d00000286sv00009005sd000002A0bc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000029Fbc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000029Ebc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000029Dbc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000029Cbc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000029Bbc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000028Dbc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000028Cbc*sc*i*
alias: pci:v00009005d00000285sv00009005sd0000028Bbc*sc*i*
alias: pci:v00009005d00000285sv00009005sd0000028Abc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000289bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000288bc*sc*i*
alias: pci:v00009005d00000285sv000017AAsd00000287bc*sc*i*
alias: pci:v00009005d00000285sv000017AAsd00000286bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000287bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000285bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000286bc*sc*i*
alias: pci:v00009005d00000284sv00009005sd00000284bc*sc*i*
alias: pci:v00009005d00000283sv00009005sd00000283bc*sc*i*
alias: pci:v00001028d0000000Asv00001028sd00000121bc*sc*i*
alias: pci:v00001028d0000000Asv00001028sd0000011Bbc*sc*i*
alias: pci:v00001028d0000000Asv00001028sd00000106bc*sc*i*
alias: pci:v00001028d00000002sv00001028sd000000D9bc*sc*i*
alias: pci:v00001028d00000002sv00001028sd000000D1bc*sc*i*
alias: pci:v00001028d00000004sv00001028sd000000D0bc*sc*i*
alias: pci:v00001028d00000003sv00001028sd00000003bc*sc*i*
alias: pci:v00001028d00000002sv00001028sd00000002bc*sc*i*
alias: pci:v00001028d00000001sv00001028sd00000001bc*sc*i*
depends: scsi_mod,scsi_common
retpoline: Y
intree: Y
name: aacraid
vermagic: 6.7.12-amd64 SMP preempt mod_unload modversions
sig_id: PKCS#7
signer: Build time autogenerated kernel key
sig_key: 19:1B:46:85:E7:78:75:68:47:6B:25:14:BE:9C:B5:47:22:1D:59:C9
sig_hashalgo: sha256
signature: 77:D2:58:2A:86:D3:C2:1C:B1:60:C9:6D:B4:C3:EA:91:E6:8C:A0:97:
AD:B7:F8:70:47:C2:B3:75:B3:BE:1F:07:25:A2:AF:E5:2B:20:29:23:
17:67:24:A5:65:99:46:28:07:FF:8F:23:42:04:81:6F:53:8E:A7:1A:
93:6F:FD:62:7D:FF:54:FC:5C:D6:7A:65:BE:DA:5E:37:50:3F:41:51:
9C:3D:F2:E5:42:E9:4A:09:A2:39:29:CC:8F:F0:B8:4C:51:D3:3E:1A:
1F:75:73:8B:DE:E6:AD:AB:76:BE:FB:48:DD:9D:CE:3A:7B:38:94:76:
DC:EB:94:88:AB:46:E0:67:9C:28:C8:BC:8E:E6:5C:20:F4:EB:85:2E:
F1:01:B9:A9:35:F1:49:46:BF:2A:7D:56:7F:21:60:1C:97:F4:39:64:
2E:A1:7C:75:38:45:75:D1:52:BD:81:50:52:B1:E5:83:9D:8B:DC:5E:
B3:E6:85:E8:FE:BD:D4:B7:E0:45:5B:C8:6D:3E:0C:70:6A:74:6F:CF:
8C:BC:A9:8C:65:FB:FE:E7:46:F8:3E:BB:4E:5C:02:84:A4:57:D2:2F:
78:E4:EA:6F:E2:83:25:E9:A1:12:39:B6:4D:8B:91:95:1D:04:91:7C:
BF:A0:D4:3F:51:F9:1F:E4:4C:98:52:86:70:3D:65:B2:AF:D5:4A:21:
50:F0:0D:81:53:08:39:CE:B4:76:65:59:85:CF:E0:89:88:2A:E1:29:
07:C9:AB:1A:07:FC:E6:80:2F:E1:E6:72:60:8F:13:75:AD:69:30:59:
29:FC:8F:47:9F:05:39:31:E1:7B:B6:35:AD:34:88:C1:17:01:A0:11:
AF:2B:96:13:EF:08:BA:99:8B:27:D5:F2:C3:E8:61:97:14:52:C9:31:
9D:EF:66:BD:8E:CD:AF:78:F3:95:1D:A8:64:DC:52:4F:C0:00:F8:E4:
5B:39:11:87:39:FE:DC:F9:2A:3D:FD:6B:72:81:9F:70:FE:76:05:12:
0E:7A:09:BC:A5:75:C6:8E:D5:2B:EC:C1:9B:EA:AA:5A:C6:F4:53:80:
89:E5:8D:E5:7C:22:C7:E1:D2:94:48:75:25:E7:DC:74:13:C5:C9:42:
DB:3F:61:A4:94:94:3E:D5:06:F6:45:90:D7:2E:89:DC:82:A7:04:4D:
44:B3:56:2C:94:AC:BD:E8:3B:28:59:F2:E9:2D:AD:26:D6:63:B8:7C:
03:61:F5:90:3E:96:AC:39:EB:C9:AD:6C:E3:88:7F:F7:8C:3C:7A:D0:
41:99:43:2B:DB:47:05:F2:D7:9B:9D:DD:79:4F:95:7A:50:FE:7B:3C:
10:C8:DC:09:88:D7:44:78:44:20:2C:08
parm: aac_sync_mode:Force sync. transfer mode 0=off, 1=on (int)
parm: aac_convert_sgl:Convert non-conformable s/g list 0=off, 1=on (int)
parm: nondasd:Control scanning of hba for nondasd devices. 0=off, 1=on (int)
parm: cache:Disable Queue Flush commands:
bit 0 - Disable FUA in WRITE SCSI commands
bit 1 - Disable SYNCHRONIZE_CACHE SCSI command
bit 2 - Disable only if Battery is protecting Cache (int)
parm: dacmode:Control whether dma addressing is using 64 bit DAC. 0=off, 1=on (int)
parm: commit:Control whether a COMMIT_CONFIG is issued to the adapter for foreign arrays.
This is typically needed in systems that do not have a BIOS. 0=off, 1=on (int)
parm: msi:IRQ handling. 0=PIC(default), 1=MSI, 2=MSI-X) (int)
parm: startup_timeout:The duration of time in seconds to wait for adapter to have its kernel up and
running. This is typically adjusted for large systems that do not have a BIOS. (int)
parm: aif_timeout:The duration of time in seconds to wait for applications to pick up AIFs before
deregistering them. This is typically adjusted for heavily burdened systems. (int)
parm: aac_fib_dump:Dump controller fibs prior to IOP_RESET 0=off, 1=on (int)
parm: numacb:Request a limit to the number of adapter control blocks (FIB) allocated. Valid values are 512 and down. Default is to use suggestion from Firmware. (int)
parm: acbsize:Request a specific adapter control block (FIB) size. Valid values are 512, 2048, 4096 and 8192. Default is to use suggestion from Firmware. (int)
parm: update_interval:Interval in seconds between time sync updates issued to adapter. (int)
parm: check_interval:Interval in seconds between adapter health checks. (int)
parm: check_reset:If adapter fails health check, reset the adapter. a value of -1 forces the reset to adapters programmed to ignore it. (int)
parm: expose_physicals:Expose physical components of the arrays. -1=protect 0=off, 1=on (int)
parm: reset_devices:Force an adapter reset at initialization. (int)
parm: wwn:Select a WWN type for the arrays:
0 - Disable
1 - Array Meta Data Signature (default)
2 - Adapter Serial Number (int)
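The `alias:` lines in the modinfo output above are glob patterns that are matched against a PCI device's modalias string to decide whether aacraid should be loaded for it. The sketch below approximates that matching with Python's `fnmatch`. The modalias value in it is hypothetical and made up for illustration, since the report does not give the real PCI IDs of the ASR-5805Z. On a running system the actual string could be read from the device's `modalias` file under `/sys/bus/pci/devices/`.

```python
from fnmatch import fnmatchcase

# A few alias patterns copied from the modinfo output above.
ALIASES = [
    "pci:v00009005d00000285sv*sd*bc*sc*i*",
    "pci:v00009005d00000286sv00009005sd000002A2bc*sc*i*",
    "pci:v00001028d00000001sv00001028sd00000001bc*sc*i*",
]


def matching_aliases(modalias, aliases):
    """Return the alias patterns that glob-match a device's modalias string."""
    return [pattern for pattern in aliases if fnmatchcase(modalias, pattern)]


# HYPOTHETICAL modalias, invented for this example; not taken from the report.
example_modalias = "pci:v00009005d00000285sv00009005sd000002A5bc01sc04i00"

for pattern in matching_aliases(example_modalias, ALIASES):
    print("matched:", pattern)
```

For the hypothetical string only the first, wildcard-subsystem pattern matches, which shows why the module can bind to a whole family of controllers rather than one model.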
--- End Message ---
--- Begin Message ---
- To: Michael Gordon <m.gordon.zelenoborsky@gmail.com>
- Cc: Ben Hutchings <ben@decadent.org.uk>, 1075855@bugs.debian.org, 1075855-done@bugs.debian.org
- Subject: Re: Bug#1075855: Kernel panic caused by aacraid module prevents normal boot
- From: Salvatore Bonaccorso <carnil@debian.org>
- Date: Wed, 1 Jan 2025 07:59:54 +0100
- Message-id: <Z3Tn6kxcIxEMXacY@eldamar.lan>
- In-reply-to: <06ac71c4cc74ffb6ca24bdc9b3804bdc8dac887c.camel@decadent.org.uk>
- References: <CAGUOxdeYacXLUGRdrcBtHkgP0ghL76t5_bWj5BQVEXhJ-J_1Wg@mail.gmail.com> <01d1754a04ed173ba8cac7ab6097e00c20407053.camel@decadent.org.uk> <ZsmD2PPosJtgq8iM@eldamar.lan> <CAGUOxdcocKiYR9ZMb+jX61WP6gD=9mc+GG3EW4UO1gOi0rCoNw@mail.gmail.com> <CAGUOxdeYacXLUGRdrcBtHkgP0ghL76t5_bWj5BQVEXhJ-J_1Wg@mail.gmail.com> <06ac71c4cc74ffb6ca24bdc9b3804bdc8dac887c.camel@decadent.org.uk>
Hi Michael,
On Thu, Sep 05, 2024 at 01:33:12AM +0200, Ben Hutchings wrote:
> On Wed, 2024-09-04 at 11:18 +0300, Michael Gordon wrote:
> > Dear all,
> > I couldn't test Ben's patch yet, because the server machine had to be
> > online. Now it is possible to pull the server out of rack for several days.
> > I'll check if the patch is working and let you know.
>
> Remember that the patch I wrote is only intended to fix the crash (bug
> 2). The probe failure of aacraid (bug 1) is still a mystery.
>
> > Since I've never messed with linux kernel compilation before, I'm relying
> > on this guide https://passthroughpo.st/patch-kernel-debian . It suggests
> > using *patch -p1* instead of *debian/bin/test-patches*, *quilt* or
> > *dquilt*. Or maybe it is easier to add two lines by hand.
>
> The Debian Kernel Handbook
> <https://kernel-team.pages.debian.net/kernel-handbook/> is an official
> guide maintained by the Debian kernel team (mostly me), and I would
> recommend that over the blog you found.
>
> The problem with test-patches that I mentioned earlier has been fixed.
>
> > I've also noticed Greg's message <gregkh@linuxfoundation.org>
> > suggesting to apply this patch to all kernel versions including 5.15. Not
> > sure if the result of the patch could be observed on versions 5.15.*
> >
> [...]
>
> The crash bug was introduced way back in Linux 2.6.15, which is why my
> fix was applied to all supported versions.
Ben's patch to fix the crash bug was applied in 6.11-rc6 and down the
road to various stable series (in particular it is in v6.10.8 and
6.1.108).
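To check whether a given upstream kernel version already contains the crash fix, the versions named above can be turned into a small lookup. This is only a sketch under stated assumptions: the function name `has_crash_fix` is made up, the input must be an upstream version such as `6.1.94` (not a Debian package version), and for stable series not named in this message the answer is deliberately "unknown".

```python
import re

# Stable releases named in this message as containing the crash fix.
# Other stable series received it too, but their version numbers are not given here.
FIXED_IN_STABLE = {(6, 10): 8, (6, 1): 108}
MAINLINE_SERIES = (6, 11)  # the fix was applied in 6.11-rc6
MAINLINE_RC = 6


def has_crash_fix(upstream_version):
    """Return True or False when this message settles it, None when it does not."""
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?", upstream_version)
    if match is None:
        raise ValueError(f"not an upstream kernel version: {upstream_version!r}")
    series = (int(match.group(1)), int(match.group(2)))
    patch = int(match.group(3) or 0)
    rc = match.group(4)
    if series > MAINLINE_SERIES:
        return True  # every later mainline series descends from 6.11
    if series == MAINLINE_SERIES:
        return rc is None or int(rc) >= MAINLINE_RC
    if series in FIXED_IN_STABLE:
        return patch >= FIXED_IN_STABLE[series]
    return None  # older series not named in this message


for version in ("6.1.94", "6.1.108", "6.7.12", "6.10.8", "6.11-rc5"):
    print(version, has_crash_fix(version))
```

Note that 6.1.94, the Bookworm kernel used as a workaround, predates the fix, while 6.7.12 falls into the "unknown" case because its series is not mentioned here.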
Do you still see Bug #1?
I'm closing the bug, but please do reopen it and remove the
moreinfo tag if you still encounter the issue. In that case, please
attach updated information and fresh boot logs.
Regards,
Salvatore
--- End Message ---