
Bug#990279: 9a89a721b41b (" drm/amdgpu: check alignment on CPU page for bo map") breaks amdgpu on ppc64 machines?



On Sun, 2021-10-10 at 14:46 +0100, Nathaniel Filardo wrote:
> It occurs to me, quite belatedly, that it may be worth asking the
> author, reviewers, and signers of the change in question their
> thoughts on this bug report:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=990279
> 
> In particular, on ppc64 systems, Linux typically is configured to use
> a 64KiB page (i.e., shift 16) rather than 4KiB (shift 12) page.  It
> looks, however, that AMDGPU_GPU_PAGE_SIZE is always 4096, and so
> something (perhaps in userspace, even, eek?) is requesting
> 4KiB-but-not-64KiB alignment of this buffer.

Christian told me the buffer should be aligned to a *CPU* page boundary,
or the page tables in the AMDGPU driver will be corrupted:

> the value of num_entries must always be a multiple of 
> AMDGPU_GPU_PAGES_IN_CPU_PAGE or otherwise we corrupt the page tables.

> You need to identify the root cause of this, most likely start or last
> are not a multiple of AMDGPU_GPU_PAGES_IN_CPU_PAGE.
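
To make the mismatch concrete, here is a rough sketch of the alignment
arithmetic (not the actual driver code).  The helper names are
hypothetical, and GPU_PAGE_SIZE is only assumed to mirror
AMDGPU_GPU_PAGE_SIZE (4096); the CPU page size is passed in as a
parameter so both the 4KiB and 64KiB cases can be compared:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mirrors AMDGPU_GPU_PAGE_SIZE, which is always 4096. */
#define GPU_PAGE_SIZE 4096ULL

/*
 * Hypothetical helper: number of GPU pages covered by one CPU page,
 * i.e. the quantity the quoted text calls AMDGPU_GPU_PAGES_IN_CPU_PAGE.
 */
static uint64_t gpu_pages_in_cpu_page(uint64_t cpu_page_size)
{
    return cpu_page_size / GPU_PAGE_SIZE;
}

/*
 * Hypothetical helper: true if addr, offset and size are all multiples
 * of page_size (which must be a power of two) and size is non-zero.
 */
static bool map_args_aligned(uint64_t addr, uint64_t offset,
                             uint64_t size, uint64_t page_size)
{
    uint64_t mask = page_size - 1;

    return size != 0 && ((addr | offset | size) & mask) == 0;
}
```

With a 64KiB CPU page, gpu_pages_in_cpu_page() gives 16, and a request
at address 0x1000 with size 0x1000 passes the 4KiB check but fails the
64KiB one, which is the kind of request described in the bug report.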

IMO f4d3da72a76a9ce5f57bba64788931686a9dc333 "drm/amdgpu: Set a suitable
dev_info.gart_page_size" should be backported along with this.  It makes
the kernel report the CPU page size to libdrm and Mesa, which corrects
the userspace behavior.  I'm not sure why only one of the two was
backported.
-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University

