--- Begin Message ---
- To: submit@bugs.debian.org
- Subject: Kernel panic caused by aacraid module prevents normal boot
- From: Michael Gordon <m.gordon.zelenoborsky@gmail.com>
- Date: Sat, 6 Jul 2024 19:15:32 +0300
- Message-id: <CAGUOxdeYacXLUGRdrcBtHkgP0ghL76t5_bWj5BQVEXhJ-J_1Wg@mail.gmail.com>
Package: linux-image-amd64
Version: 6.7.12-1
Hardware: Huananzhi F8D-Plus (C612 Intel Chipset), 2*Xeon 2680v4, 128 GB RAM
Raid controller: Adaptec ASR-5805Z
Hello!
I've encountered an issue booting Debian Testing with the latest kernel. The system reports a kernel panic caused by the aacraid module, then hangs completely during the fsck step. Information about the installed system is in the attached .txt file. Note that the hardware listed there is different, because the SSD with Debian Testing was removed from the server for investigation.
Using the kernel parameters "pci=nocrs single" I was able to boot into emergency mode and see the error messages related to the kernel panic. Photos attached.
Pulling the RAID controller out of the PCI-E slot restores normal boot.
I tried several Linux distributions to check the hypothesis that this is a regression in recent kernels. Results are shown below.
Distribution | Kernel version | Result
Debian Testing KDE | Linux 6.6.15-amd64 #1 Debian 6.6.15-2 | Kernel panic messages, boot hangs at fsck step.
Kubuntu 24.04 LTS | Linux KubuntuPortable 6.8.0-36-generic #36-Ubuntu SMP PREEMPT_DYNAMIC Mon Jun 10 10:49:14 UTC 2024 x86_64 GNU/Linux | Kernel panic messages, boot hangs at fsck step.
Fedora Workstation Live x86_64-40-1.14 | Linux localhost-live 6.8.5-301.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Apr 11 20:00:10 UTC 2024 x86_64 GNU/Linux | Boot hangs.
Linux Mint 21.1 Xfce 64-bit | Linux mint 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 19:54:14 UTC 2022 x86_64 GNU/Linux | Normal boot, ext4 partition on the RAID5 array is accessible.
Debian Stable KDE Live | Linux debian 6.1.0-22-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.94-1 (2024-06-21) x86_64 GNU/Linux | Normal boot, ext4 partition on the RAID5 array is accessible.
Windows Server 2016 | NA | Normal boot. Adaptec Storage Manager reports optimal condition for the RAID5 array. No errors reported for HDDs. Ext4 partition on the array is accessible from Windows.
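The results above can be condensed into a rough regression window. The following is only a minimal sketch: the version strings are copied from the uname output in the table, the good/bad labels and the function names are illustrative, and the Ubuntu-based kernels carry vendor patches, so their numbers do not map exactly onto upstream releases.

```shell
#!/usr/bin/env bash
# Kernels from the results table, as "<version> <good|bad>".
# "good" = normal boot, "bad" = panic messages or hang during boot.
tested_kernels() {
  cat <<'EOF'
6.6.15 bad
6.8.0 bad
6.8.5 bad
5.15.0 good
6.1.94 good
EOF
}

# Read "<version> <status>" lines on stdin, sort them by version
# (GNU sort -V ordering) and print the newest working and the
# oldest failing kernel.
regression_window() {
  local sorted
  sorted=$(sort -k1,1V)
  printf 'last good: %s\n' "$(awk '$2 == "good" {v = $1} END {print v}' <<<"$sorted")"
  printf 'first bad: %s\n' "$(awk '$2 == "bad" {print $1; exit}' <<<"$sorted")"
}

tested_kernels | regression_window
```

Under these assumptions the window is between 6.1.94 (boots) and 6.6.15 (fails), which is consistent with the 6.1.0 workaround below.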
Expected behavior: normal boot
Actual result: system hangs during boot
As a workaround, I installed kernel 6.1.0 from the Bookworm repository.
Gordon Mikhail, bioinformatician
Genetics of Plant-Microbe Interactions lab | ARRIAM
Guar Physiological genetics lab | VIR
michaelgordon@Lab9-X99 ~> apt list --installed | grep "linux-image" (base)
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
linux-image-6.1.0-22-amd64/stable,now 6.1.94-1 amd64 [installed]
linux-image-6.6.15-amd64/now 6.6.15-2 amd64 [installed,local]
linux-image-6.7.12-amd64/now 6.7.12-1 amd64 [installed,local]
linux-image-6.8.12-amd64/now 6.8.12-1 amd64 [installed,local]
linux-image-amd64/now 6.8.12-1 amd64 [installed,upgradable to: 6.9.7-1]
michaelgordon@Lab9-X99 ~> apt show libc6 | grep ^Version (base)
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
Version: 2.38-13
michaelgordon@Lab9-X99 ~> neofetch (base)
michaelgordon@Lab9-X99
----------------------
OS: Debian GNU/Linux trixie/sid x86_64
Host: MS-7D31 1.0
Kernel: 6.7.12-amd64
Uptime: 42 secs
Packages: 2378 (dpkg)
Shell: fish 3.7.1
Resolution: 3840x2160
DE: Plasma 5.27.10
WM: kwin
Theme: [Plasma], Breeze [GTK2/3]
Icons: [Plasma], breeze [GTK2/3]
Terminal: konsole
Terminal Font: Hack 12
CPU: 12th Gen Intel i5-12600K (16) @ 4.900GHz
GPU: Intel AlderLake-S GT1
Memory: 1909MiB / 64099MiB
michaelgordon@Lab9-X99 ~> uname -a (base)
Linux Lab9-X99 6.7.12-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.7.12-1 (2024-04-24) x86_64 GNU/Linux
michaelgordon@Lab9-X99 ~ [1]> sudo modinfo aacraid (base)
filename: /lib/modules/6.7.12-amd64/kernel/drivers/scsi/aacraid/aacraid.ko.xz
version: 1.2.1[50983]-custom
license: GPL
description: Dell PERC2, 2/Si, 3/Si, 3/Di, Adaptec Advanced Raid Products, HP NetRAID-4M, IBM ServeRAID & ICP SCSI driver
author: Red Hat Inc and Adaptec
srcversion: 09ED6D97B3A3060995BB34B
alias: pci:v00009005d0000028Dsv*sd*bc*sc*i*
alias: pci:v00009005d0000028Csv*sd*bc*sc*i*
alias: pci:v00009005d0000028Bsv*sd*bc*sc*i*
alias: pci:v00009005d00000288sv*sd*bc*sc*i*
alias: pci:v00009005d00000286sv*sd*bc*sc*i*
alias: pci:v00009005d00000285sv*sd*bc*sc*i*
alias: pci:v00009005d00000285sv000017AAsd*bc*sc*i*
alias: pci:v00009005d00000285sv00001028sd*bc*sc*i*
alias: pci:v00001011d00000046sv0000103Csd000010C2bc*sc*i*
alias: pci:v00001011d00000046sv00009005sd00001364bc*sc*i*
alias: pci:v00001011d00000046sv00009005sd00000364bc*sc*i*
alias: pci:v00001011d00000046sv00009005sd00000365bc*sc*i*
alias: pci:v00009005d00000285sv00001028sd00000287bc*sc*i*
alias: pci:v00009005d00000286sv00009005sd000002A2bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd0000029Abc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000299bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000298bc*sc*i*
alias: pci:v00009005d00000286sv00001014sd00009540bc*sc*i*
alias: pci:v00009005d00000286sv00001014sd00009580bc*sc*i*
alias: pci:v00009005d00000285sv00001014sd00000312bc*sc*i*
alias: pci:v00009005d00000285sv00001014sd000002F2bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000297bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000296bc*sc*i*
alias: pci:v00009005d00000285sv0000103Csd00003227bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000294bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000293bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000292bc*sc*i*
alias: pci:v00009005d00000285sv00001028sd00000291bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000290bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd0000028Fbc*sc*i*
alias: pci:v00009005d00000285sv00009005sd0000028Ebc*sc*i*
alias: pci:v00009005d00000286sv00009005sd00000800bc*sc*i*
alias: pci:v00009005d00000200sv00009005sd00000200bc*sc*i*
alias: pci:v00009005d00000287sv00009005sd00000800bc*sc*i*
alias: pci:v00009005d00000286sv00009005sd000002A6bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd000002A5bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd000002A4bc*sc*i*
alias: pci:v00009005d00000286sv00009005sd000002A3bc*sc*i*
alias: pci:v00009005d00000286sv00009005sd000002A1bc*sc*i*
alias: pci:v00009005d00000286sv00009005sd000002A0bc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000029Fbc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000029Ebc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000029Dbc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000029Cbc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000029Bbc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000028Dbc*sc*i*
alias: pci:v00009005d00000286sv00009005sd0000028Cbc*sc*i*
alias: pci:v00009005d00000285sv00009005sd0000028Bbc*sc*i*
alias: pci:v00009005d00000285sv00009005sd0000028Abc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000289bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000288bc*sc*i*
alias: pci:v00009005d00000285sv000017AAsd00000287bc*sc*i*
alias: pci:v00009005d00000285sv000017AAsd00000286bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000287bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000285bc*sc*i*
alias: pci:v00009005d00000285sv00009005sd00000286bc*sc*i*
alias: pci:v00009005d00000284sv00009005sd00000284bc*sc*i*
alias: pci:v00009005d00000283sv00009005sd00000283bc*sc*i*
alias: pci:v00001028d0000000Asv00001028sd00000121bc*sc*i*
alias: pci:v00001028d0000000Asv00001028sd0000011Bbc*sc*i*
alias: pci:v00001028d0000000Asv00001028sd00000106bc*sc*i*
alias: pci:v00001028d00000002sv00001028sd000000D9bc*sc*i*
alias: pci:v00001028d00000002sv00001028sd000000D1bc*sc*i*
alias: pci:v00001028d00000004sv00001028sd000000D0bc*sc*i*
alias: pci:v00001028d00000003sv00001028sd00000003bc*sc*i*
alias: pci:v00001028d00000002sv00001028sd00000002bc*sc*i*
alias: pci:v00001028d00000001sv00001028sd00000001bc*sc*i*
depends: scsi_mod,scsi_common
retpoline: Y
intree: Y
name: aacraid
vermagic: 6.7.12-amd64 SMP preempt mod_unload modversions
sig_id: PKCS#7
signer: Build time autogenerated kernel key
sig_key: 19:1B:46:85:E7:78:75:68:47:6B:25:14:BE:9C:B5:47:22:1D:59:C9
sig_hashalgo: sha256
signature: 77:D2:58:2A:86:D3:C2:1C:B1:60:C9:6D:B4:C3:EA:91:E6:8C:A0:97:
AD:B7:F8:70:47:C2:B3:75:B3:BE:1F:07:25:A2:AF:E5:2B:20:29:23:
17:67:24:A5:65:99:46:28:07:FF:8F:23:42:04:81:6F:53:8E:A7:1A:
93:6F:FD:62:7D:FF:54:FC:5C:D6:7A:65:BE:DA:5E:37:50:3F:41:51:
9C:3D:F2:E5:42:E9:4A:09:A2:39:29:CC:8F:F0:B8:4C:51:D3:3E:1A:
1F:75:73:8B:DE:E6:AD:AB:76:BE:FB:48:DD:9D:CE:3A:7B:38:94:76:
DC:EB:94:88:AB:46:E0:67:9C:28:C8:BC:8E:E6:5C:20:F4:EB:85:2E:
F1:01:B9:A9:35:F1:49:46:BF:2A:7D:56:7F:21:60:1C:97:F4:39:64:
2E:A1:7C:75:38:45:75:D1:52:BD:81:50:52:B1:E5:83:9D:8B:DC:5E:
B3:E6:85:E8:FE:BD:D4:B7:E0:45:5B:C8:6D:3E:0C:70:6A:74:6F:CF:
8C:BC:A9:8C:65:FB:FE:E7:46:F8:3E:BB:4E:5C:02:84:A4:57:D2:2F:
78:E4:EA:6F:E2:83:25:E9:A1:12:39:B6:4D:8B:91:95:1D:04:91:7C:
BF:A0:D4:3F:51:F9:1F:E4:4C:98:52:86:70:3D:65:B2:AF:D5:4A:21:
50:F0:0D:81:53:08:39:CE:B4:76:65:59:85:CF:E0:89:88:2A:E1:29:
07:C9:AB:1A:07:FC:E6:80:2F:E1:E6:72:60:8F:13:75:AD:69:30:59:
29:FC:8F:47:9F:05:39:31:E1:7B:B6:35:AD:34:88:C1:17:01:A0:11:
AF:2B:96:13:EF:08:BA:99:8B:27:D5:F2:C3:E8:61:97:14:52:C9:31:
9D:EF:66:BD:8E:CD:AF:78:F3:95:1D:A8:64:DC:52:4F:C0:00:F8:E4:
5B:39:11:87:39:FE:DC:F9:2A:3D:FD:6B:72:81:9F:70:FE:76:05:12:
0E:7A:09:BC:A5:75:C6:8E:D5:2B:EC:C1:9B:EA:AA:5A:C6:F4:53:80:
89:E5:8D:E5:7C:22:C7:E1:D2:94:48:75:25:E7:DC:74:13:C5:C9:42:
DB:3F:61:A4:94:94:3E:D5:06:F6:45:90:D7:2E:89:DC:82:A7:04:4D:
44:B3:56:2C:94:AC:BD:E8:3B:28:59:F2:E9:2D:AD:26:D6:63:B8:7C:
03:61:F5:90:3E:96:AC:39:EB:C9:AD:6C:E3:88:7F:F7:8C:3C:7A:D0:
41:99:43:2B:DB:47:05:F2:D7:9B:9D:DD:79:4F:95:7A:50:FE:7B:3C:
10:C8:DC:09:88:D7:44:78:44:20:2C:08
parm: aac_sync_mode:Force sync. transfer mode 0=off, 1=on (int)
parm: aac_convert_sgl:Convert non-conformable s/g list 0=off, 1=on (int)
parm: nondasd:Control scanning of hba for nondasd devices. 0=off, 1=on (int)
parm: cache:Disable Queue Flush commands:
bit 0 - Disable FUA in WRITE SCSI commands
bit 1 - Disable SYNCHRONIZE_CACHE SCSI command
bit 2 - Disable only if Battery is protecting Cache (int)
parm: dacmode:Control whether dma addressing is using 64 bit DAC. 0=off, 1=on (int)
parm: commit:Control whether a COMMIT_CONFIG is issued to the adapter for foreign arrays.
This is typically needed in systems that do not have a BIOS. 0=off, 1=on (int)
parm: msi:IRQ handling. 0=PIC(default), 1=MSI, 2=MSI-X) (int)
parm: startup_timeout:The duration of time in seconds to wait for adapter to have its kernel up and
running. This is typically adjusted for large systems that do not have a BIOS. (int)
parm: aif_timeout:The duration of time in seconds to wait for applications to pick up AIFs before
deregistering them. This is typically adjusted for heavily burdened systems. (int)
parm: aac_fib_dump:Dump controller fibs prior to IOP_RESET 0=off, 1=on (int)
parm: numacb:Request a limit to the number of adapter control blocks (FIB) allocated. Valid values are 512 and down. Default is to use suggestion from Firmware. (int)
parm: acbsize:Request a specific adapter control block (FIB) size. Valid values are 512, 2048, 4096 and 8192. Default is to use suggestion from Firmware. (int)
parm: update_interval:Interval in seconds between time sync updates issued to adapter. (int)
parm: check_interval:Interval in seconds between adapter health checks. (int)
parm: check_reset:If adapter fails health check, reset the adapter. a value of -1 forces the reset to adapters programmed to ignore it. (int)
parm: expose_physicals:Expose physical components of the arrays. -1=protect 0=off, 1=on (int)
parm: reset_devices:Force an adapter reset at initialization. (int)
parm: wwn:Select a WWN type for the arrays:
0 - Disable
1 - Array Meta Data Signature (default)
2 - Adapter Serial Number (int)
--- End Message ---
--- Begin Message ---
- To: Ben Hutchings <ben@decadent.org.uk>, 1075855-done@bugs.debian.org, 1075855-submitter@bugs.debian.org
- Cc: Michael Gordon <m.gordon.zelenoborsky@gmail.com>
- Subject: Re: Bug#1075855: Kernel panic caused by aacraid module prevents normal boot
- From: Salvatore Bonaccorso <carnil@debian.org>
- Date: Sat, 24 Aug 2024 08:55:20 +0200
- Message-id: <ZsmD2PPosJtgq8iM@eldamar.lan>
- In-reply-to: <01d1754a04ed173ba8cac7ab6097e00c20407053.camel@decadent.org.uk>
- References: <CAGUOxdeYacXLUGRdrcBtHkgP0ghL76t5_bWj5BQVEXhJ-J_1Wg@mail.gmail.com> <CAGUOxdeYacXLUGRdrcBtHkgP0ghL76t5_bWj5BQVEXhJ-J_1Wg@mail.gmail.com> <01d1754a04ed173ba8cac7ab6097e00c20407053.camel@decadent.org.uk>
Hi,
On Thu, Jul 11, 2024 at 02:07:43AM +0200, Ben Hutchings wrote:
> Control: tag -1 moreinfo
>
> You wrote:
> > I've encountered an issue booting up Debian Testing with the latest
> > kernel. The system reports something about kernel panic caused by
> > aacraid module, then hangs completely during fsck step. Information
> > about the installed system is in the attached .txt file. Note that
> > hardware is different, because the SSD with Debian Testing has been
> > extracted from the server for investigation.
> >
> > Using the kernel parameters "pci=nocrs single" I was able to boot into
> > emergency mode and see the error messages related to kernel panic.
> > Photos attached.
> >
> > Pulling the RAID controller out of the PCI-E slot restores normal boot.
> [...]
>
> There are (at least) 2 bugs here:
>
> 1. Something is preventing aacraid and ehci-hcd from allocating DMA
> buffers:
> "ehci-pci 0000:00:1a.0: init 0000:00:1a.0 fail, -12"
> "aacraid: unable to create mapping."
>
> 2. aacraid then double-frees a chunk of memory while handling the
> failure, causing the panic:
> "kernel BUG at mm/slub.c:448!"
>
> I'm attaching a patch which should fix bug #2, which may help to get
> more information about bug #1.
>
> In principle you should be able to test this by following the
> instructions at
> <https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#id-1.6.6.4>
> but currently the test-patches script has not been updated along with
> the package and will take a lot more time and space than it should.
>
> So instead I would suggest building a custom kernel based on the Debian
> configuration, following the instructions at
> <https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s-common-building>
> and applying this patch before you run "make clean".
>
> Let us know if you have any difficulty with this. If the system is
> able to boot with a patched kernel, please send the full kernel log.
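For anyone retesting, a saved kernel log can be checked for the three messages quoted above before it is sent. This is only a sketch: the function name and the labels are made up for illustration, and the patterns are shortened forms of the quoted messages (the device address and the mm/slub.c line number may differ on other systems and kernel versions).

```shell
#!/usr/bin/env bash
# Read a kernel log on stdin and report, for each message quoted in
# this thread, whether it occurs in the log.
scan_kernel_log() {
  local log label pattern
  log=$(cat)
  while IFS='|' read -r label pattern; do
    # -F: match the pattern as a fixed string, not a regex.
    if grep -qF -- "$pattern" <<<"$log"; then
      printf '%s: yes\n' "$label"
    else
      printf '%s: no\n' "$label"
    fi
  done <<'EOF'
ehci-pci init failure|fail, -12
aacraid mapping failure|aacraid: unable to create mapping.
slub BUG (double free)|kernel BUG at mm/slub.c
EOF
}

# Example with the messages quoted in this thread as input.
# Error -12 is ENOMEM, i.e. an allocation failure.
scan_kernel_log <<'EOF'
ehci-pci 0000:00:1a.0: init 0000:00:1a.0 fail, -12
aacraid: unable to create mapping.
kernel BUG at mm/slub.c:448!
EOF
```

On a booted system the input could come from `dmesg` (for example `dmesg | scan_kernel_log`). With a patched kernel, the third line would be expected to report "no" if the double free is fixed.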
As we have seen no follow-up on this, we are closing the bug report
(for now). If you can retest with the above, please reopen the bug
and remove the moreinfo tag.
Regards,
Salvatore
--- End Message ---