Re: ci.rocm: invalid device function
Samuel Thibault, on Tue 22 Jul 2025 16:52:13 +0200, wrote:
> I have tried ci.rocm for the starpu package, and it fails; I don't
> understand why:
>
> https://ci.rocm.debian.net/packages/s/starpu/unstable/amd64+gfx1102/80714/
>
> [starpu][7ed4aa7b79e4][starpu_hip_report_error] Error: oops in hip_codelet (../../examples/basic_examples/block_hip.hip:46)... 98: invalid device function
>
> Usually this error appears when the code is compiled for a more recent
> architecture and then run on an older one. In this case, it was built
> like this:
>
> https://buildd.debian.org/status/fetch.php?pkg=starpu&arch=amd64&ver=1.4.7%2Bdfsg-2%2Bexp2&stamp=1753133065&raw=0
>
> /usr/bin/hipcc ../../examples/basic_examples/block_hip.hip -c -o basic_examples/block_hip.o -D__HIP_PLATFORM_HCC__= -D__HIP_PLATFORM_AMD__= -I/usr/include -I/usr/lib/llvm-17/lib/clang/17 -L/usr/lib -DSTARPU_HIP_PLATFORM_AMD -g -I../include -I../../include/ -I../src -I../../src/
>
> Do you have an idea of what could be wrong?
And the code is quite trivial:
https://salsa.debian.org/debian/starpu/-/blob/master/examples/basic_examples/block_hip.hip?ref_type=heads

static __global__ void hip_block(float *block, int nx, int ny, int nz, unsigned ldy, unsigned ldz, float multiplier)
{
    int i, j, k;
    for(k=0; k<nz ; k++)
    {
        for(j=0; j<ny ; j++)
        {
            for(i=0; i<nx ; i++)
                block[(k*ldz)+(j*ldy)+i] *= multiplier;
        }
    }
}

extern "C" void hip_codelet(void *descr[], void *_args)
{
    float *block = (float *)STARPU_BLOCK_GET_PTR(descr[0]);
    int nx = STARPU_BLOCK_GET_NX(descr[0]);
    int ny = STARPU_BLOCK_GET_NY(descr[0]);
    int nz = STARPU_BLOCK_GET_NZ(descr[0]);
    unsigned ldy = STARPU_BLOCK_GET_LDY(descr[0]);
    unsigned ldz = STARPU_BLOCK_GET_LDZ(descr[0]);
    float *multiplier = (float *)_args;

    hipLaunchKernelGGL(hip_block, 1, 1, 0, 0,
                       block, nx, ny, nz,
                       ldy, ldz, *multiplier);

    hipError_t status = hipGetLastError();
    if (status != hipSuccess) STARPU_HIP_REPORT_ERROR(status);
}

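
One thing that stands out in the hipcc line quoted above is that it has no
--offload-arch option, so the GPU target is whatever hipcc selects by default
on the build machine (I have not verified what that default is for the Debian
package), whereas the failing job, judging by its URL, runs on gfx1102. Below
is a minimal sketch, in plain C++, for scanning a logged compiler command line
for explicitly requested targets. The helper names offload_archs and
targets_arch are hypothetical and belong to neither StarPU nor HIP; the sketch
only inspects the text of the command and says nothing about what hipcc
actually embedded in the object file.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect every GPU target named by an
// --offload-arch=... option in a whitespace-separated command line.
// Comma-separated values are split defensively into separate entries.
std::vector<std::string> offload_archs(const std::string& cmdline) {
    const std::string flag = "--offload-arch=";
    std::vector<std::string> archs;
    std::istringstream words(cmdline);
    std::string word;
    while (words >> word) {
        // Skip every word that does not start with the flag prefix.
        if (word.compare(0, flag.size(), flag) != 0) continue;
        std::istringstream list(word.substr(flag.size()));
        std::string arch;
        while (std::getline(list, arch, ','))
            if (!arch.empty()) archs.push_back(arch);
    }
    return archs;
}

// Hypothetical helper: true only when the command line explicitly
// requests the given target (for example "gfx1102").
bool targets_arch(const std::string& cmdline, const std::string& arch) {
    const std::vector<std::string> archs = offload_archs(cmdline);
    return std::find(archs.begin(), archs.end(), arch) != archs.end();
}
```

Applied to the hipcc line from the build log, offload_archs returns an empty
list, which would be consistent with the object not having been built
specifically for gfx1102. An empty list does not prove a mismatch, since the
default target could still happen to match.
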
Samuel