Re: gfx1100: first successes with pass-through
On 2024-08-20 23:27, Christian Kastner wrote:
> I can't see anything obvious in the 6.7 changelog [1], but there are
> numerous memory management changes and the "Bad address" thing
> seems to be related to page faulting in some way, from what I found when
> searching the web for this message.
>
> I did not yet investigate further and don't know when I'll have the next
> chance to do that, but if anyone beats me to it, I assume that bisecting
> 6.6 to 6.7 will reveal the root cause. Who knows, it might be a small
> thing.
I investigated this further. The hang itself results from a KVM ioctl on
the host failing with EFAULT ("Bad address"). QEMU up to and including
9.0 did not handle this failure.
In QEMU 9.1, conditional handling for this was introduced [1]. I tested
with 9.1 (which entered unstable last week), but sadly this didn't fix it;
I can only assume that the condition it handles is not the one we are
running into.
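
To illustrate what "handling" a failed ioctl means in this context, here is
a minimal sketch. It is not QEMU's actual code: the enum, the function name
and the choice of which errno values to retry on are assumptions made up for
illustration. The point is only that EFAULT needs its own branch, and that a
branch guarded by one specific condition will still miss other EFAULT cases.

```c
#include <errno.h>

/* Hypothetical outcome categories; these names are not taken from QEMU. */
enum ioctl_outcome {
    OUTCOME_OK,     /* the ioctl succeeded */
    OUTCOME_RETRY,  /* interrupted or would block: call the ioctl again */
    OUTCOME_FAULT,  /* EFAULT ("Bad address"): needs a dedicated handler */
    OUTCOME_FATAL   /* any other error: give up */
};

/*
 * Classify the result of an ioctl-style call.
 * ret is the call's return value, err is errno captured right after it.
 */
enum ioctl_outcome classify_ioctl_result(int ret, int err)
{
    if (ret >= 0) {
        return OUTCOME_OK;
    }
    switch (err) {
    case EINTR:
    case EAGAIN:
        return OUTCOME_RETRY;
    case EFAULT:
        return OUTCOME_FAULT;
    default:
        return OUTCOME_FATAL;
    }
}
```

A caller would save errno immediately after the ioctl returns, pass both
values in, and act on the result. If nothing acts on OUTCOME_FAULT (or only
does so under some extra condition that doesn't hold), the error is never
resolved, which would be consistent with the hang we are seeing.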
As this KVM ioctl failure is on the host and 9.1 added code for handling
one failure case, I think it's possible that we might just have
discovered another case that needs to be handled, so I reported this
issue upstream [2] and asked for guidance in debugging this.
QEMU has an immense number of tracepoints, but the obvious ones didn't
produce anything interesting.
Best,
Christian
[1]: https://gitlab.com/qemu-project/qemu/-/commit/c15e5684071d93174e446be318f49d8d59b15d6d
[2]: https://gitlab.com/qemu-project/qemu/-/issues/2574