
Debian HPC with GPUs


I am using Debian for HPC work with AMD Radeon GPUs.

The system setup is as follows:

- Debian Testing distribution
- firmware-amd-graphics package
- AMD GPU proprietary driver
- Clang and LLVM packages

When I use the GPUs to do computation, I get random errors like the following:

amdgpu_job_timedout .... sdma0 ring ...
amdgpu_job_timedout .... sdma1 ring ...
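To see whether the timeouts favour one ring, the kernel log can be tallied per ring. The following is only a minimal sketch: it assumes each relevant line contains the literal string amdgpu_job_timedout and a ring name of the form sdmaN somewhere on the same line, as in the excerpts above. The function name count_timeouts is made up for illustration.

```python
import re
import sys
from collections import Counter

# Assumption: ring names look like sdma0, sdma1, ... as in the excerpts above.
RING_RE = re.compile(r"\b(sdma\d+)\b")


def count_timeouts(log_text):
    """Count amdgpu_job_timedout lines per ring name found on the same line."""
    counts = Counter()
    for line in log_text.splitlines():
        if "amdgpu_job_timedout" not in line:
            continue
        match = RING_RE.search(line)
        # Lines without a recognisable ring name are grouped under "unknown".
        counts[match.group(1) if match else "unknown"] += 1
    return counts


if __name__ == "__main__":
    # Reads log text from standard input and prints one count per ring.
    for ring, n in sorted(count_timeouts(sys.stdin.read()).items()):
        print(f"{ring}: {n}")
```

Saved under a name of your choice, it can be fed the kernel log on standard input, for example from the output of dmesg.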

I have set the following parameters in amdgpu.conf:

pcie_gen2=0 audio=0 exp_hw_support=1
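One way to confirm that these values actually reached the loaded module is to read them back from sysfs. This is a hedged sketch: it assumes the amdgpu module is loaded and exposes its parameters under /sys/module/amdgpu/parameters, and that each value reads back as the same decimal string that was set. The helper name check_params is hypothetical.

```python
from pathlib import Path

# Values taken from the amdgpu.conf line above.
EXPECTED = {"pcie_gen2": "0", "audio": "0", "exp_hw_support": "1"}


def check_params(expected, base="/sys/module/amdgpu/parameters"):
    """Return {name: (wanted, actual, matches)} for each expected parameter."""
    results = {}
    for name, want in expected.items():
        try:
            actual = (Path(base) / name).read_text().strip()
        except OSError:
            # Parameter file missing or unreadable (e.g. module not loaded).
            actual = None
        results[name] = (want, actual, actual == want)
    return results


if __name__ == "__main__":
    for name, (want, actual, ok) in check_params(EXPECTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: wanted {want}, found {actual} [{status}]")
```

A mismatch would suggest that the configuration file is not being picked up when the module loads, which is worth ruling out before looking further.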

I still get random errors, although the hardware is in good shape.

The error occurs on two separate hardware systems.

Thanks for any possible help.

Valerio Bellizzomi
