Re: Limitation on pkg Pyfai ERROR
Hi Clement,
I am the upstream author of pyFAI, and this bug probably has little to do with Debian packaging.
Indeed, I don't test pyFAI on AMD hardware regularly.
Can you run the tests in a more verbose way to know specifically which test is failing?
Maybe we should follow up on this discussion in a pyFAI issue:
https://github.com/silx-kit/pyFAI/issues/2584
Could it be that it is `pyFAI-benchmark -h` which fails?
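To check this outside the test suite, something along these lines might help. It is only a sketch: the helper name `run_and_report` is made up for illustration, and it simply runs a command and hands back the exit status together with stdout and stderr, which the unit test does not show when it only asserts on the return code.

```python
import subprocess


def run_and_report(argv, timeout=120):
    """Run a command and return (returncode, stdout, stderr) as text.

    Hypothetical helper, not part of pyFAI: it only mirrors the idea of
    the failing test, which checks that the exit status is 0.
    """
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return proc.returncode, proc.stdout, proc.stderr


# Example call, assuming the pyFAI-benchmark script is on the PATH:
#     code, out, err = run_and_report(["pyFAI-benchmark", "-h"])
#     print("exit status:", code)
#     print(err)
```

If the exit status is 1 here as well, the stderr text should say why the help screen cannot be displayed on this machine.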
I also find it surprising that your card advertises WG=1024 but that there is a limitation to WG=256 in the output... but maybe this is unrelated.
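To compare with what rocminfo prints, one could ask the OpenCL runtime directly which work-group limit it reports for each device. The snippet below is a minimal sketch, assuming pyopencl is installed; the function name `opencl_workgroup_limits` is made up for illustration, and it only reads the device-level limit (individual kernels may report a lower one).

```python
def opencl_workgroup_limits():
    """Return a list of (platform name, device name, max work-group size).

    Illustrative helper, not part of pyFAI. Returns an empty list when
    pyopencl is missing or no OpenCL platform can be queried.
    """
    try:
        import pyopencl as cl
    except ImportError:
        return []
    limits = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                # Device-level limit as seen through OpenCL
                limits.append((platform.name, device.name, device.max_work_group_size))
    except cl.Error:
        # No usable platform or device: return whatever was collected
        return limits
    return limits


for platform_name, device_name, max_wg in opencl_workgroup_limits():
    print(f"{platform_name} / {device_name}: max work-group size = {max_wg}")
```

If the value printed for gfx1034 is 256 rather than 1024, the difference is between what the HSA layer and the OpenCL layer advertise, not something specific to pyFAI.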
Cheers,
Jerome
On Mon, 28 Jul 2025 13:58:08 +0000
LONGEAC Clement <clement.longeac@synchrotron-soleil.fr> wrote:
> Hello,
>
> I work on the pyFAI package as an intern at Synchrotron SOLEIL; my supervisors are
> Frederic-Emmanuel PICCA and Emmanuel FARHI. I implemented ROCm and PoCL autopkgtests,
> using OpenCL, for the amd64 and arm64 architectures, and I run them locally.
> The aim is to get an overview of the code's compatibility with the various AMD
> graphics cards available for CI: ROCm for GPU and PoCL for CPU.
>
> But I have several problems with it: it takes a very long time to build, so some tests are marked as timed out whatever I do.
> I set the time limit to 42 200 seconds. In the ROCm part, I get the error "Maximum valid workgroup size 256 on device <pyopencl.Device 'gfx1034' on 'AMD Accelerated Parallel Processing' at 0xe90bf90> 0.0 1.871411379818157e-05 "
>
> I don't know how to solve that or where it comes from... I did a lot of research without finding a solution.
> It seems to be hardware-related: to solve it we would need an AMD GPU marketed as PRO, not a gaming graphics card.
>
> Our config :
> *******
> Agent 2
> *******
> Name: gfx1034
> Uuid: GPU-XX
> Marketing Name: AMD Radeon RX 6400
> Vendor Name: AMD
> Feature: KERNEL_DISPATCH
> Profile: BASE_PROFILE
> Float Round Mode: NEAR
> Max Queue Number: 128(0x80)
> Queue Min Size: 64(0x40)
> Queue Max Size: 131072(0x20000)
> Queue Type: MULTI
> Node: 1
> Device Type: GPU
> Cache Info:
> L1: 16(0x10) KB
> L2: 1024(0x400) KB
> L3: 16384(0x4000) KB
> Chip ID: 29759(0x743f)
> ASIC Revision: 0(0x0)
> Cacheline Size: 128(0x80)
> Max Clock Freq. (MHz): 2320
> BDFID: 20224
> Internal Node ID: 1
> Compute Unit: 12
> SIMDs per CU: 2
> Shader Engines: 1
> Shader Arrs. per Eng.: 2
> WatchPts on Addr. Ranges:4
> Coherent Host Access: FALSE
> Features: KERNEL_DISPATCH
> Fast F16 Operation: TRUE
> Wavefront Size: 32(0x20)
> Workgroup Max Size: 1024(0x400)
> Workgroup Max Size per Dimension:
> x 1024(0x400)
> y 1024(0x400)
> z 1024(0x400)
> Max Waves Per CU: 32(0x20)
> Max Work-item Per CU: 1024(0x400)
> Grid Max Size: 4294967295(0xffffffff)
> Grid Max Size per Dimension:
> x 4294967295(0xffffffff)
> y 4294967295(0xffffffff)
> z 4294967295(0xffffffff)
> Max fbarriers/Workgrp: 32
> Packet Processor uCode:: 129
> SDMA engine uCode:: 34
> IOMMU Support:: None
> Pool Info:
> Pool 1
> Segment: GLOBAL; FLAGS: COARSE GRAINED
> Size: 4177920(0x3fc000) KB
> Allocatable: TRUE
> Alloc Granule: 4KB
> Alloc Recommended Granule:2048KB
> Alloc Alignment: 4KB
> Accessible by all: FALSE
> Pool 2
> Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
> Size: 4177920(0x3fc000) KB
> Allocatable: TRUE
> Alloc Granule: 4KB
> Alloc Recommended Granule:2048KB
> Alloc Alignment: 4KB
> Accessible by all: FALSE
> Pool 3
> Segment: GROUP
> Size: 64(0x40) KB
> Allocatable: FALSE
> Alloc Granule: 0KB
> Alloc Recommended Granule:0KB
> Alloc Alignment: 0KB
> Accessible by all: FALSE
> ISA Info:
> ISA 1
> Name: amdgcn-amd-amdhsa--gfx1034
> Machine Models: HSA_MACHINE_MODEL_LARGE
> Profiles: HSA_PROFILE_BASE
> Default Rounding Mode: NEAR
> Default Rounding Mode: NEAR
> Fast f16: TRUE
> Workgroup Max Size: 1024(0x400)
> Workgroup Max Size per Dimension:
> x 1024(0x400)
> y 1024(0x400)
> z 1024(0x400)
> Grid Max Size: 4294967295(0xffffffff)
> Grid Max Size per Dimension:
> x 4294967295(0xffffffff)
> y 4294967295(0xffffffff)
> z 4294967295(0xffffffff)
> FBarrier Max Size: 32
>
>
> I added the ROCm and PoCL tools in debian/tests/control:
>
> # tests that must pass
>
> Test-Command: no-opencl
> Architecture: !amd64 !arm64 !armel !armhf !i386
> Depends:
> bitshuffle,
> python3-all,
> python3-pyfai,
> python3-tk,
> xauth,
> xvfb,
> python3-pyqt5.qtopengl,
> python3-pyqt5,
> libgl1-mesa-glx,
> Features: test-name=no-opencl
> Restrictions: allow-stderr, skip-not-installable
>
>
> Test-Command: rocm-test-launcher debian/tests/opencl
> Architecture: amd64 arm64 armel armhf i386
> Depends:
> bitshuffle,
> clinfo,
> rocminfo,
> libnuma1,
> ocl-icd-libopencl1,
> rocm-opencl-icd,
> pkg-rocm-tools,
> python3-all,
> python3-pyfai,
> python3-tk,
> xauth,
> xvfb,
> libclang-common-17-dev,
> hipcc,
> rocm-device-libs-17,
> Features: test-name=opencl-rocm
> Restrictions: allow-stderr, skip-not-installable, skippable
>
> Test-Command: debian/tests/opencl
> Architecture: amd64 arm64 armel armhf i386
> Depends:
> bitshuffle,
> pocl-opencl-icd,
> clinfo,
> python3-all,
> python3-pyfai,
> python3-tk,
> xauth,
> xvfb,
> libclang-common-17-dev,
> Features: test-name=opencl-pocl
> Restrictions: allow-stderr, skip-not-installable
>
>
> Test-Command: xvfb-run -s "-screen 0 1024x768x24 -ac +extension GLX +render -noreset" sh debian/tests/gui
> Depends:
> debhelper,
> mesa-utils,
> @,
> xauth,
> xvfb,
> Restrictions: allow-stderr
>
> And the file debian/tests/opencl:
>
> #!/bin/sh -e
>
> # Check that OpenCL isn't totally broken (note that it isn't totally working either)
> # Uses device 0 platform 0, i.e. to use a real GPU manually install its opencl-icd before running this
> # Mark the test as flaky, the important part is the CPU computation.
>
> export PYFAI_OPENCL=True
> export PYOPENCL_COMPILER_OUTPUT=1
>
> # skip test
> # TestAzimHalfFrelon.test_medfilt1d
>
> cp bootstrap.py run_tests.py pyproject.toml version.py README.rst "$AUTOPKGTEST_TMP"
>
> for py in $(py3versions -s 2>/dev/null)
> do cd "$AUTOPKGTEST_TMP"
> echo "Testing with $py:"
> xvfb-run -a --server-args="-screen 0 1024x768x24" $py run_tests.py -v -m --low-mem --installed
> done
>
> The error log for the ROCm part:
>
> When the autopkgtest for ROCm is launched, I get this error at the end. Where does it come from?
>
> INFO:memProf:Time: 60.074s RAM: 0.000 Mb pyFAI.test.test_containers.TestContainer.test_rebin1d
> ======================================================================
> FAIL: testPyfaiBenchmark (pyFAI.test.test_scripts.TestScriptsHelp.testPyfaiBenchmark)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/lib/python3/dist-packages/pyFAI/test/test_scripts.py", line 105, in testPyfaiBenchmark
>     self.executeAppHelp("pyFAI-benchmark", "pyFAI.app.benchmark")
>     ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/lib/python3/dist-packages/pyFAI/test/test_scripts.py", line 86, in executeAppHelp
>     self.executeCommandLine(command_line, env)
>     ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
>   File "/usr/lib/python3/dist-packages/pyFAI/test/test_scripts.py", line 79, in executeCommandLine
>     self.assertEqual(p.returncode, 0)
>     ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
> AssertionError: 1 != 0
> ----------------------------------------------------------------------
> Ran 453 tests in 5584.067s
> FAILED (failures=1, skipped=95)
> Maximum valid workgroup size 256 on device <pyopencl.Device 'gfx1034' on 'AMD Accelerated Parallel Processing' at 0xe90bf90> 0.0 1.871411379818157e-05
> autopkgtest [18:23:38]: test opencl-rocm: -----------------------]
> autopkgtest [18:23:38]: test opencl-rocm: - - - - - - - - - - results - - - - - - - - - -
> opencl-rocm FAIL non-zero exit status 1
>
> Thank you very much
> Clément LONGEAC
>
>
--
Jérôme Kieffer
tel +33 476 882 445