
Re: ci.rocm: invalid device function



Hi Samuel,

On 2025-07-22 08:52, Samuel Thibault wrote:
> [starpu][7ed4aa7b79e4][starpu_hip_report_error] Error: oops in hip_codelet (../../examples/basic_examples/block_hip.hip:46)... 98: invalid device function
>
> Usually this error appears when the code is compiled for a too recent
> architecture and run on an older one. [....]
>
> /usr/bin/hipcc ../../examples/basic_examples/block_hip.hip -c -o basic_examples/block_hip.o -D__HIP_PLATFORM_HCC__= -D__HIP_PLATFORM_AMD__= -I/usr/include -I/usr/lib/llvm-17/lib/clang/17 -L/usr/lib -DSTARPU_HIP_PLATFORM_AMD -g -I../include -I../../include/ -I../src -I../../src/
>
> Do you have an idea of what could be wrong?

I believe the problem is that you have not specified your target GPU architectures, so the build is probably targeting the default of gfx906. My gfx906 test machine was sleeping, so I booted it up. You can see that the test passes there [1]. You will need to add `--offload-arch <gfx1> --offload-arch <gfx2> ...` to the hipcc command line.

The values of `<gfx1> <gfx2> ...` can be provided by rocm-target-arch from pkg-rocm-tools (0.9.0~exp2).
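
As a rough sketch of how the flags could be assembled, the snippet below turns a list of gfx IDs into repeated `--offload-arch` options. The helper name `offload_arch_flags` and the example IDs are hypothetical, and it assumes the list of architectures arrives as whitespace-separated words; the actual output format of rocm-target-arch should be checked before wiring it in.

```shell
# Hypothetical helper: turn a list of gfx IDs into repeated --offload-arch flags.
offload_arch_flags() {
    flags=""
    for arch in "$@"; do
        # Append one flag per architecture, separated by single spaces.
        flags="${flags:+$flags }--offload-arch $arch"
    done
    printf '%s\n' "$flags"
}

# Assumption: rocm-target-arch prints a whitespace-separated list of gfx IDs.
# If so, the list could be obtained with:
#   ARCHS=$(rocm-target-arch)
ARCHS="gfx906 gfx1030"   # illustrative values only

# Word splitting of $ARCHS is intended here, so it is left unquoted.
OFFLOAD_FLAGS=$(offload_arch_flags $ARCHS)
echo "$OFFLOAD_FLAGS"
```

The resulting `$OFFLOAD_FLAGS` would then be appended to the existing hipcc invocation, for example through whatever compiler-flags variable the build system already passes to hipcc.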

Sincerely,
Cory Bloor

[1]: https://ci.rocm.debian.net/packages/s/starpu/unstable/amd64+gfx906/80699/

