Re: [PPC] Boot problems after the pci-v6.18-changes
- To: Herve Codina <herve.codina@bootlin.com>
- Cc: Manivannan Sadhasivam <mani@kernel.org>, Lukas Wunner <lukas@wunner.de>, Christian Zigotzky <chzigotzky@xenosoft.de>, Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>, Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>, "linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>, mad skateman <madskateman@gmail.com>, "R.T.Dickinson" <rtd2@xtra.co.nz>, Christian Zigotzky <info@xenosoft.de>, linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, hypexed@yahoo.com.au, Darren Stevens <darren@stevens-zone.net>, "debian-powerpc@lists.debian.org" <debian-powerpc@lists.debian.org>, Thomas Petazzoni <thomas.petazzoni@bootlin.com>
- Subject: Re: [PPC] Boot problems after the pci-v6.18-changes
- From: Bjorn Helgaas <helgaas@kernel.org>
- Date: Wed, 15 Oct 2025 18:40:59 -0500
- Message-id: <20251015234059.GA961901@bhelgaas>
- In-reply-to: <20251015101304.3ec03e6b@bootlin.com>
On Wed, Oct 15, 2025 at 10:13:04AM +0200, Herve Codina wrote:
> ...
> I also observed issues with the commit f3ac2ff14834 ("PCI/ASPM: Enable all
> ClockPM and ASPM states for devicetree platforms")
>
> My system is an ARM board (Marvell Armada 3720 DDB)
> https://elixir.bootlin.com/linux/v6.17.1/source/arch/arm64/boot/dts/marvell/armada-3720-db.dts
>
> I use an LAN966x PCI board
> https://elixir.bootlin.com/linux/v6.17.1/source/drivers/misc/lan966x_pci.c
>
> Usually, when I did a ping using the PCI board, I have more or less the
> following timings:
> # ping 192.168.32.100
> PING 192.168.32.100 (192.168.32.100): 56 data bytes
> 64 bytes from 192.168.32.100: seq=0 ttl=64 time=3.328 ms
> 64 bytes from 192.168.32.100: seq=1 ttl=64 time=2.636 ms
> 64 bytes from 192.168.32.100: seq=2 ttl=64 time=2.928 ms
> 64 bytes from 192.168.32.100: seq=3 ttl=64 time=2.649 ms
>
> But with a vanilla v6.18-rc1 kernel, those timings become awful:
> # ping 192.168.32.100
> PING 192.168.32.100 (192.168.32.100): 56 data bytes
> 64 bytes from 192.168.32.100: seq=0 ttl=64 time=656.634 ms
> 64 bytes from 192.168.32.100: seq=1 ttl=64 time=551.812 ms
> 64 bytes from 192.168.32.100: seq=2 ttl=64 time=702.966 ms
> 64 bytes from 192.168.32.100: seq=3 ttl=64 time=725.904 ms
>
> Reverting commit f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and
> ASPM states for devicetree platforms") fixes my timing issues.
We expect *some* performance impact from enabling ASPM, but this seems
excessive. You should be able to control the ASPM settings for an
individual device via sysfs:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-bus-pci?id=v6.17-rc1#n431
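To make the sysfs suggestion concrete, here is a rough sketch (not tested
on the Armada board) of reading and toggling the per-device link
attributes. The attribute names are the ones listed in the sysfs-bus-pci
ABI document; the helper names and the idea of passing the device
directory as an argument are my own assumptions for illustration. An
attribute only shows up if the link supports that state, so missing
files are skipped.

```python
from pathlib import Path

# Attribute names as listed in Documentation/ABI/testing/sysfs-bus-pci
# under /sys/bus/pci/devices/<BDF>/link/.
ASPM_ATTRS = ("clkpm", "l0s_aspm", "l1_aspm", "l1_1_aspm", "l1_2_aspm",
              "l1_1_pcipm", "l1_2_pcipm")


def read_link_states(dev_dir):
    """Return {attribute: bool} for each ASPM attribute present in dev_dir/link."""
    link = Path(dev_dir) / "link"
    states = {}
    for name in ASPM_ATTRS:
        attr = link / name
        if attr.is_file():  # absent when the link does not support the state
            states[name] = attr.read_text().strip() == "1"
    return states


def set_link_state(dev_dir, name, enable):
    """Write 1 or 0 to one ASPM attribute (needs root on a real system)."""
    if name not in ASPM_ATTRS:
        raise ValueError(f"unknown ASPM attribute: {name}")
    (Path(dev_dir) / "link" / name).write_text("1" if enable else "0")
```

For example, set_link_state("/sys/bus/pci/devices/<BDF>", "l1_2_aspm",
False) would disable only ASPM L1.2 for that device, which is a cheap way
to check whether L1.2 alone accounts for the latency. <BDF> is a
placeholder for the device address.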
My guess is that L1.2 is enabled and the threshold values in the L1 PM
Substates control registers are bogus. I don't know how to fix those,
especially on a devicetree system. But it might be possible to fiddle
with them using setpci (while ASPM is disabled). Not for the faint of
heart.
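If someone wants to check whether the thresholds look sane before
touching anything, a small decoder for the L1 PM Substates Control 1
register may help. This is only a sketch: the bit layout follows the
PCIe spec and include/uapi/linux/pci_regs.h (enables in bits 3:0,
Common_Mode_Restore_Time in bits 15:8, LTR_L1.2_THRESHOLD value in bits
25:16 and scale in bits 31:29), while the function name and the sample
value are made up for illustration. The input is the raw 32-bit value
read from that register, for example with setpci.

```python
# Scale encodings 0..5 for LTR_L1.2_THRESHOLD_Scale, in nanoseconds.
# Encodings 6 and 7 are reserved.
SCALE_NS = (1, 32, 1024, 32768, 1048576, 33554432)


def decode_l1ss_ctl1(value):
    """Decode a raw L1 PM Substates Control 1 register value."""
    scale = (value >> 29) & 0x7        # bits 31:29
    th_value = (value >> 16) & 0x3ff   # bits 25:16
    threshold_ns = th_value * SCALE_NS[scale] if scale < len(SCALE_NS) else None
    return {
        "pcipm_l1_2": bool(value & 0x1),
        "pcipm_l1_1": bool(value & 0x2),
        "aspm_l1_2": bool(value & 0x4),
        "aspm_l1_1": bool(value & 0x8),
        "common_mode_restore_us": (value >> 8) & 0xff,  # bits 15:8
        "ltr_l1_2_threshold_ns": threshold_ns,          # None if scale is reserved
    }


# Hypothetical register value, not taken from any real device.
print(decode_l1ss_ctl1(0x4096000f))
```

With that made-up value, all four substates are enabled and the
threshold decodes to 150 * 1024 ns = 153600 ns. A threshold of 0, or a
reserved scale, alongside aspm_l1_2 enabled would be the kind of bogus
setting I have in mind.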
Bjorn