Hey folks,
I'd like to introduce myself and try to help out.
I've been using Debian for over 15 years on my home server.
Just bought some new hardware.
I'd like to keep Debian stable on my host, and pass the GPU into VMs as needed for data science projects.
As I get this working, I'd like to share my notes and test
results. Between Level1Techs, Proxmox, and Unraid, there's a lot
of confusing information out there, though this Proxmox page [1]
is pretty good. The Arch pages are pretty good too.
[1] https://pve.proxmox.com/wiki/PCI(e)_Passthrough
I've tried to follow the instructions here [2], but no success yet.
There are a couple of typos there I'd like to fix.
[2] https://salsa.debian.org/rocm-team/community/team-project/-/wikis/qemu-with-gpu-pass-through
When I identify the devices to pass through, and subsequently find other devices in the same IOMMU group, I find these (aside from the sound card):
Should these be passed through too?
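For anyone who wants to compare groupings on their own hardware, here is a minimal Python sketch of how the groups can be listed. It assumes the usual sysfs layout, /sys/kernel/iommu_groups/<group>/devices/<PCI address>. The function names iommu_groups and group_of are my own, not from any of the pages above.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses in it.

    Assumes the sysfs layout <root>/<group>/devices/<PCI address>.
    Returns an empty dict when the directory is absent (IOMMU off
    or not exposed by the kernel).
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices = group_dir / "devices"
        if group_dir.name.isdigit() and devices.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices.iterdir())
    return groups


def group_of(address, groups):
    """Return the group number containing a PCI address, or None."""
    for number, members in groups.items():
        if address in members:
            return number
    return None


if __name__ == "__main__":
    for number, members in sorted(iommu_groups().items()):
        print(f"group {number}: {' '.join(members)}")
```

Running it on the host prints one line per group, so it is easy to see which devices share a group with the GPU.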
My other question for now: my host is Debian Bookworm running
kernel 6.1.0-22-amd64. Is this too old?
In my guest OS, based on qemu-rocm-build, I was seeing that 2 firmware files were not being found.
So I copied the whole folder from here [3], not just the 2 files
above.
[3] https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/plain/amdgpu/
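In case it helps others check which files their guest is missing, here is a rough Python sketch that pulls firmware names out of kernel log text. It assumes the log contains lines worded like "Direct firmware load for <name> failed" or "firmware: failed to load <name>". The exact wording can differ between kernels, so treat the pattern as an assumption. The function name missing_firmware is my own.

```python
import re
import sys

# Assumed log wordings; adjust the pattern if your kernel logs differently.
_PATTERN = re.compile(
    r"(?:Direct firmware load for|firmware: failed to load) (\S+)"
)


def missing_firmware(log_text):
    """Return firmware names reported as failed, in first-seen order."""
    seen = []
    for match in _PATTERN.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    # Example: dmesg | python3 this_script.py
    for name in missing_firmware(sys.stdin.read()):
        print(name)
```

Piping the guest's dmesg output into it prints each missing file once, which can then be compared against the contents of the linux-firmware amdgpu folder.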
But this results in a bad crash.
[----------] 42 tests from rocrand_basic_tests/rocrand_basic_tests
[ RUN ] rocrand_basic_tests/rocrand_basic_tests.rocrand_create_destroy_generator_test/0
[ OK ] rocrand_basic_tests/rocrand_basic_tests.rocrand_create_destroy_generator_test/0 (0 ms)
[ RUN ] rocrand_basic_tests/rocrand_basic_tests.rocrand_create_destroy_generator_test/1
error: kvm run failed Bad address
RAX=0000000000003398 RBX=0000000000000674 RCX=00030001070de073 RDX=0000000000000673
RSI=ff623407f0003398 RDI=ff3ceee645b00000 RBP=ff3ceee64717d4e0 RSP=ff623407c066b718
R8 =0003000000000073 R9 =ff623407f0000000 R10=ff3ceee645b0faf8 R11=ff3ceee6463d04b8
R12=ff3ceee645b00000 R13=0003000000000073 R14=ff623407f0000000 R15=0000000000000674
RIP=ffffffffc1159a13 RFL=00000282 [--S----] CPL=0 II=0 A20=1 SMM=0 HLT=0
So, what's my next step and how can I help?
Brian
-- Brian DeRocher