
Re: ci.rocm.debian.net: gfx1100, gfx1101 broken with (possibly) 6.12.11



Hi Cory,

On 2025-04-24 07:02, Cordell Bloor wrote:
> On 2025-04-22 10:43, Christian Kastner wrote:
> I don't have any gfx1101 hardware. The only RDNA 3 hardware I have in a
> CI machine is a Radeon PRO W7600 (gfx1102).

My mistake, indeed we talked about gfx1102 recently.

> I have, however, put in a request for a few Radeon PRO V710 (gfx1101)
> samples. I have plenty of PCIe slots, so I'm hoping that those will work
> well with PCIe pass-through. My guess will be that I will have some
> gfx1101 workers in June or July.

I'm afraid this is unlikely unless someone finds the root cause (e.g.
by bisecting between 6.1.129 and 6.1.133).

If you recall, we initially had trouble getting gfx110x going. Brian
DeRocher bisected the issue, which pointed to AGP and led to this
illuminating thread [1], where it was mentioned that memory mapping
for gfx110x needed a workaround for a hardware bug.

As attempting to boot a VM with 6.1.133 fails with

  qemu-system-x86_64: VFIO_MAP_DMA failed: Cannot allocate memory
  qemu-system-x86_64: vfio_dma_map(0x55f2234da190, 0xc0000, 0x20000, 0x7f376e600000) = -12 (Cannot allocate memory)
  qemu: hardware error: vfio: DMA mapping failed, unable to continue

this could very well be related [2].
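
For anyone reading the log above, here is a minimal sketch that pulls the
numbers out of the vfio_dma_map line and names the error code. The
argument order (container, iova, size, vaddr) is an assumption based on
how QEMU usually prints this message, and parse_vfio_dma_map is a
made-up helper name, not part of any tool.

```python
import errno
import re

# Log line copied verbatim from the failing boot above.
LINE = ("qemu-system-x86_64: vfio_dma_map(0x55f2234da190, 0xc0000, "
        "0x20000, 0x7f376e600000) = -12 (Cannot allocate memory)")

# Assumed argument order: container pointer, IOVA, size, host address.
PATTERN = re.compile(
    r"vfio_dma_map\((0x[0-9a-f]+), (0x[0-9a-f]+), "
    r"(0x[0-9a-f]+), (0x[0-9a-f]+)\) = (-?\d+)"
)


def parse_vfio_dma_map(line):
    """Return the fields of a vfio_dma_map log line, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    container, iova, size, vaddr = (int(g, 16) for g in match.groups()[:4])
    ret = int(match.group(5))
    return {
        "container": container,
        "iova": iova,
        "size": size,
        "vaddr": vaddr,
        "ret": ret,
        # Negative return values are negated errno codes.
        "errno_name": errno.errorcode.get(-ret, "unknown"),
    }


info = parse_vfio_dma_map(LINE)
print(hex(info["iova"]), info["size"] // 1024, "KiB", info["errno_name"])
```

Under that assumption, the failing request is a 128 KiB mapping at IOVA
0xc0000 that is rejected with ENOMEM.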

Perhaps the workaround is incomplete, but someone will have to put in
the work of bisecting to narrow this down further.
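
To illustrate what such a bisection does, here is a rough sketch of the
search over the point releases 6.1.129 to 6.1.133. It assumes that
6.1.129 is good, that 6.1.133 is bad, and that the breakage persists
once introduced. first_bad and the is_good callback are hypothetical
names; in practice is_good would mean installing that kernel on the
host and trying to boot the VM, and a finer search between two
releases would normally be done with git bisect on the stable tree.

```python
def first_bad(candidates, is_good):
    """Binary-search for the first candidate that is_good() rejects.

    Assumes candidates[0] is good, candidates[-1] is bad, and that
    every candidate after the first bad one is bad as well.
    """
    lo, hi = 0, len(candidates) - 1   # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(candidates[mid]):
            lo = mid
        else:
            hi = mid
    return candidates[hi]


# Point releases in the suspected range.
versions = [f"6.1.{n}" for n in range(129, 134)]

# Stand-in predicate for demonstration only: it pretends the regression
# landed in 6.1.131. The real culprit is not known.
def pretend_is_good(version):
    return int(version.split(".")[2]) < 131

print(first_bad(versions, pretend_is_good))
```

With five releases, this needs at most two test boots before the first
bad release is identified.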

Best,
Christian

[1]: https://gitlab.freedesktop.org/drm/amd/-/issues/3644

[2]: Technically, it could of course also be completely unrelated. But
     if this were some general pass-through issue, we'd see that on
     other architectures, too.

