
Bug#976684: regression: leela-zero fails to start with current Mesa, but previously worked



On 7.12.2020 0.31, Ximin Luo wrote:
Package: mesa-opencl-icd
Version: 20.2.3-1
Severity: normal
Control: affects -1 leela-zero

Dear Maintainer,

The current version of mesa-opencl-icd does not work with Leela-Zero; clinfo is fine:

$ CLOVER_DEBUG=llvm,native CLOVER_DEBUG_FILE=dump-file OCL_ICD_VENDORS=libMesaOpenCL.so.1 clinfo -l
Platform #0: Clover
  `-- Device #0: Radeon RX 580 Series (POLARIS10, DRM 3.39.0, 5.9.0-4-amd64, LLVM 11.0.0)

$ CLOVER_DEBUG=llvm,native CLOVER_DEBUG_FILE=dump-file OCL_ICD_VENDORS=libMesaOpenCL.so.1 leelaz
[..]
Selected device: Radeon RX 580 Series (POLARIS10, DRM 3.39.0, 5.9.0-4-amd64, LLVM 11.0.0)
with OpenCL 1.1 capability.
Half precision compute support: No.
Tensor Core support: No.

Started OpenCL SGEMM tuner.
Will try 290 valid configurations.
Failed to compile: 290 kernels.
Failed to find a working configuration.
Check your OpenCL drivers.
Minimum error: 100.000000. Error bound: 0.000100

Started OpenCL SGEMM tuner.
Will try 290 valid configurations.
Failed to compile: 290 kernels.
Failed to find a working configuration.
Check your OpenCL drivers.
Minimum error: 100.000000. Error bound: 0.100000
Both single precision and half precision failed to run.
terminate called after throwing an instance of 'std::runtime_error'
   what():  Failed to initialize net.
Aborted
exit code 134

I think Mesa is the problem and not libclc: I downgraded libclc to a previously working
version (0.2.0+git20180917-3) while keeping current Mesa (20.2.3-1), and this bug was still
present. I did not try downgrading Mesa (or upgrading it to the version in experimental),
as that is a much more invasive change to my system; I hope you can appreciate that.

Despite setting CLOVER_DEBUG=llvm,native CLOVER_DEBUG_FILE=dump-file, the above leelaz
command produced no dump files; maybe you can deduce something from that.
Let me know if I can run anything else to provide further information.

For the time being I am using the proprietary AMD OpenCL lib as described in #976295.

Ximin

Which version of Mesa worked previously? Regressions should be filed upstream too.
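
If the last working version is not known offhand, the dpkg log usually records it. The following is only a rough sketch, assuming the upgrade is still present in an uncompressed log file; the helper name previous_versions is made up for illustration, and it relies on the usual "date time upgrade package:arch old-version new-version" layout of dpkg.log lines.

```shell
# Print "date time old-version -> new-version" for each recorded upgrade
# of the given package.
# Arguments: package name, then one or more plain-text dpkg log files.
# (previous_versions is a hypothetical helper, not part of dpkg.)
previous_versions() {
    pkg="$1"
    shift
    # Field 3 is the action, field 4 is "package:arch",
    # fields 5 and 6 are the old and new versions.
    awk -v pkg="$pkg" '
        $3 == "upgrade" && index($4, pkg ":") == 1 {
            print $1, $2, $5, "->", $6
        }
    ' "$@"
}

# Possible invocation on a Debian system:
# previous_versions mesa-opencl-icd /var/log/dpkg.log
```

Rotated logs (dpkg.log.1, dpkg.log.2.gz, ...) would have to be passed in as well, decompressed first where needed. The old-version column of the most recent line is the candidate to name in an upstream report.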



--
t

