
First experiments with ROCm tests in parallel QEMU VMs



Hi,

I performed the first experiments with parallel QEMU VMs today, and so
far, it's looking good.

Testbed was host ci-worker-ckk02, which already had a 6500 XT attached.
I temporarily took -ckk01 and -ckk02 out of the CI pool, moved the 6800
XT from -ckk01 to -ckk02, and added its PCI-ID to vfio-pci.
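After moving a card, one quick sanity check is to look at which kernel driver the device is bound to in sysfs. The following is only a sketch: the pci_driver helper and the SYSFS_ROOT override are invented here for illustration, and the 0000: domain prefix is an assumption about the host.

```shell
#!/bin/sh
# Print the kernel driver bound to a PCI device, or "none" if it is unbound.
# SYSFS_ROOT is a hypothetical override so the function can be tried
# without real hardware; on a real host it defaults to /sys.
pci_driver() {
    dev="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1"
    if [ -L "$dev/driver" ]; then
        # The "driver" entry is a symlink whose last component is the driver name.
        basename "$(readlink "$dev/driver")"
    else
        echo none
    fi
}

# Example (address with assumed 0000: domain prefix):
#   pci_driver 0000:0e:00.0
```

For a card that is ready for pass-through, this should report vfio-pci rather than amdgpu.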

First, I ran some autopkgtests in serial, alternating between cards.
This was just to make sure that the pass-through management did not get
confused by two cards on the host.

Then, I proceeded to run two tests in parallel, one per GPU, which also
went fine. Finally, I ran two parallel loops of tests, one per card, for
about an hour -- no issues there, either.

Testing was performed by manually invoking autopkgtest using the
autopkgtest-virt-qemu+rocm backend from the rocm-qemu-support [1]
package, e.g.:

  $ autopkgtest --user=root -B rocrand -- qemu+rocm \
        --ram-size 32678 --cpus 2 --gpu 0e:00.0 \
        /var/lib/debci/qemu+rocm/unstable-amd64.img

(Note the --gpu option)
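
The two parallel loops mentioned earlier could be scripted around this invocation roughly as follows. This is only a sketch, not the script that was actually used: the options and image path are copied from the command above, while the helper names, the RUNNER dry-run switch, the iteration count, and the second PCI address are made up for illustration.

```shell
#!/bin/sh
# Image path as used in the manual invocation above.
IMAGE=/var/lib/debci/qemu+rocm/unstable-amd64.img

# Print the autopkgtest command line for one package on one GPU.
gpu_test_cmd() {
    pkg=$1
    gpu=$2
    printf 'autopkgtest --user=root -B %s -- qemu+rocm --ram-size 32678 --cpus 2 --gpu %s %s\n' \
        "$pkg" "$gpu" "$IMAGE"
}

# Run the test for one package on one GPU a fixed number of times.
# Setting RUNNER=echo (hypothetical switch) only prints the commands.
run_loop() {
    pkg=$1
    gpu=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # Word splitting of the generated command line is intentional here.
        ${RUNNER:-} $(gpu_test_cmd "$pkg" "$gpu") || echo "FAIL: $pkg on $gpu" >&2
        i=$((i + 1))
    done
}

# One loop per card, both in the background; 0b:00.0 is a placeholder address.
#   run_loop rocrand 0e:00.0 10 &
#   run_loop rocrand 0b:00.0 10 &
#   wait
```

Each loop gets its own --gpu address, so the two VMs never compete for the same card.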

I ran the tests manually, rather than through debci, because I
discovered that when I added the Mem/CPU/GPU config options to debci, I
forgot to pass them on to the qemu+rocm backend. This will be fixed soon.

In the meantime, the 6800 XT has been returned to ci-worker-ckk01, and
both hosts have been added back to the worker pool.

Best,
Christian

[1] https://apt.rocm.debian.net/debian/pool/main/r/rocm-dev-tools/

