
Re: Limitation on pkg Pyfai ERROR



Hi Clement,

I am the upstream author of pyFAI, and this bug probably has little to do with Debian packaging.
Indeed, I don't test pyFAI on AMD hardware regularly.

Can you run the tests in a more verbose way to know specifically which test is failing?
Maybe we should follow up this discussion in a pyFAI issue:
https://github.com/silx-kit/pyFAI/issues/2584
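
To narrow things down, a single test can be loaded by its dotted name and run with maximum verbosity. This is only a minimal sketch using the standard `unittest` module; the helper name `run_single_test` is made up for illustration, and the test identifier in the note after the block is the one reported as failing in the log quoted below.

```python
import unittest


def run_single_test(test_id, verbosity=2):
    """Load one test by its dotted name and run it verbosely.

    `test_id` is e.g. "package.module.TestClass.test_method".
    Returns the unittest result object so the caller can inspect it.
    """
    # If the name cannot be imported, unittest returns a synthetic test
    # that reports the import error when run, instead of raising here.
    suite = unittest.defaultTestLoader.loadTestsFromName(test_id)
    runner = unittest.TextTestRunner(verbosity=verbosity)
    return runner.run(suite)
```

Calling `run_single_test("pyFAI.test.test_scripts.TestScriptsHelp.testPyfaiBenchmark")` on a machine where pyFAI is installed should run only that test, which avoids waiting for the complete suite.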

Could it be that it is `pyFAI-benchmark -h` which fails?
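
One way to check this outside the test-suite is to run the command by hand and look at its exit status and standard error, since the failing assertion in the log below only reports `1 != 0`. The following is a rough sketch using `subprocess`; the helper names `run_and_report` and `check_help` are hypothetical, and it assumes `pyFAI-benchmark` is on the `PATH`.

```python
import subprocess
import sys


def run_and_report(argv, timeout=300):
    """Run a command and return (returncode, stdout, stderr) as text.

    Raises FileNotFoundError if the executable is not found.
    """
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return proc.returncode, proc.stdout, proc.stderr


def check_help(command="pyFAI-benchmark"):
    """Run `<command> -h` and print its exit status and stderr."""
    code, _out, err = run_and_report([command, "-h"])
    print(f"{command} -h exited with status {code}", file=sys.stderr)
    if err:
        # The traceback or error message, if any, usually ends up here.
        print(err, file=sys.stderr)
    return code
```

If `check_help()` returns a non-zero status, the text printed from stderr should show what goes wrong while the help message is being produced.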

I also find it surprising that your card advertises WG=1024 but that the output mentions a limitation to WG=256... but maybe this is unrelated.
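
To compare the two numbers, it may help to ask the OpenCL runtime directly what it reports, since this can differ from what `rocminfo` prints. Below is a small sketch with pyopencl; the function name `opencl_workgroup_limits` is invented for this example, and it simply returns an empty list when pyopencl or an OpenCL platform is not available.

```python
def opencl_workgroup_limits():
    """Return a list of (platform name, device name, max work-group size).

    The list is empty if pyopencl is not installed or no platform is found.
    """
    try:
        import pyopencl
    except ImportError:
        return []
    limits = []
    try:
        for platform in pyopencl.get_platforms():
            for device in platform.get_devices():
                # Device-level limit as seen by the OpenCL runtime
                # (CL_DEVICE_MAX_WORK_GROUP_SIZE).
                limits.append(
                    (platform.name, device.name, device.max_work_group_size)
                )
    except pyopencl.Error:
        # No usable platform or device: report what was collected so far.
        pass
    return limits
```

Printing the result of `opencl_workgroup_limits()` inside the ROCm test environment would show whether the runtime itself reports 256 or 1024 for the gfx1034 device. Note that an individual kernel can have a lower limit than the device; pyopencl exposes it through `Kernel.get_work_group_info` with `pyopencl.kernel_work_group_info.WORK_GROUP_SIZE`.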

Cheers,

Jerome

On Mon, 28 Jul 2025 13:58:08 +0000
LONGEAC Clement <clement.longeac@synchrotron-soleil.fr> wrote:

> Hello,
> 
> I work on the pyFAI package as an intern at Synchrotron-Soleil; my supervisors are
> Frederic-Emmanuel PICCA and Emmanuel FARHI. I implemented ROCm and PoCL autopkgtests,
> based on OpenCL, for the amd64 and arm64 architectures, and I run them locally.
> The aim is to get an overview of the code's compatibility with the various AMD
> graphics cards available for CI, using ROCm for the GPU and PoCL for the CPU.
> 
> However, I have several problems with it: it takes a very long time to build, so some tests are marked as timed out whatever I do.
> I set the time limit to 42 200 seconds. In the ROCm part, I get the error "Maximum valid workgroup size 256 on device <pyopencl.Device 'gfx1034' on 'AMD Accelerated Parallel Processing' at 0xe90bf90> 0.0 1.871411379818157e-05".
> 
> I don't know how to solve this or where it comes from... I did a lot of research without finding a solution.
> It seems to be hardware-related: to solve it, we would need an AMD GPU marketed as PRO, not a gaming graphics card.
> 
>  Our config :
>  *******
> Agent 2
> *******
>   Name:                    gfx1034
>   Uuid:                    GPU-XX
>   Marketing Name:          AMD Radeon RX 6400
>   Vendor Name:             AMD
>   Feature:                 KERNEL_DISPATCH
>   Profile:                 BASE_PROFILE
>   Float Round Mode:        NEAR
>   Max Queue Number:        128(0x80)
>   Queue Min Size:          64(0x40)
>   Queue Max Size:          131072(0x20000)
>   Queue Type:              MULTI
>   Node:                    1
>   Device Type:             GPU
>   Cache Info:
>     L1:                      16(0x10) KB
>     L2:                      1024(0x400) KB
>     L3:                      16384(0x4000) KB
>   Chip ID:                 29759(0x743f)
>   ASIC Revision:           0(0x0)
>   Cacheline Size:          128(0x80)
>   Max Clock Freq. (MHz):   2320
>   BDFID:                   20224
>   Internal Node ID:        1
>   Compute Unit:            12
>   SIMDs per CU:            2
>   Shader Engines:          1
>   Shader Arrs. per Eng.:   2
>   WatchPts on Addr. Ranges:4
>   Coherent Host Access:    FALSE
>   Features:                KERNEL_DISPATCH
>   Fast F16 Operation:      TRUE
>   Wavefront Size:          32(0x20)
>   Workgroup Max Size:      1024(0x400)
>   Workgroup Max Size per Dimension:
>     x                        1024(0x400)
>     y                        1024(0x400)
>     z                        1024(0x400)
>   Max Waves Per CU:        32(0x20)
>   Max Work-item Per CU:    1024(0x400)
>   Grid Max Size:           4294967295(0xffffffff)
>   Grid Max Size per Dimension:
>     x                        4294967295(0xffffffff)
>     y                        4294967295(0xffffffff)
>     z                        4294967295(0xffffffff)
>   Max fbarriers/Workgrp:   32
>   Packet Processor uCode:: 129
>   SDMA engine uCode::      34
>   IOMMU Support::          None
>   Pool Info:
>     Pool 1
>       Segment:                 GLOBAL; FLAGS: COARSE GRAINED
>       Size:                    4177920(0x3fc000) KB
>       Allocatable:             TRUE
>       Alloc Granule:           4KB
>       Alloc Recommended Granule:2048KB
>       Alloc Alignment:         4KB
>       Accessible by all:       FALSE
>     Pool 2
>       Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
>       Size:                    4177920(0x3fc000) KB
>       Allocatable:             TRUE
>       Alloc Granule:           4KB
>       Alloc Recommended Granule:2048KB
>       Alloc Alignment:         4KB
>       Accessible by all:       FALSE
>     Pool 3
>       Segment:                 GROUP
>       Size:                    64(0x40) KB
>       Allocatable:             FALSE
>       Alloc Granule:           0KB
>       Alloc Recommended Granule:0KB
>       Alloc Alignment:         0KB
>       Accessible by all:       FALSE
>   ISA Info:
>     ISA 1
>       Name:                    amdgcn-amd-amdhsa--gfx1034
>       Machine Models:          HSA_MACHINE_MODEL_LARGE
>       Profiles:                HSA_PROFILE_BASE
>       Default Rounding Mode:   NEAR
>       Default Rounding Mode:   NEAR
>       Fast f16:                TRUE
>       Workgroup Max Size:      1024(0x400)
>       Workgroup Max Size per Dimension:
>         x                        1024(0x400)
>         y                        1024(0x400)
>         z                        1024(0x400)
>       Grid Max Size:           4294967295(0xffffffff)
>       Grid Max Size per Dimension:
>         x                        4294967295(0xffffffff)
>         y                        4294967295(0xffffffff)
>         z                        4294967295(0xffffffff)
>       FBarrier Max Size:       32
> 
> 
> I added Rocm and Pocl tools in debian/tests/control :
> 
> # tests that must pass
> 
> Test-Command: no-opencl
> Architecture: !amd64 !arm64 !armel !armhf !i386
> Depends:
>  bitshuffle,
>  python3-all,
>  python3-pyfai,
>  python3-tk,
>  xauth,
>  xvfb,
>  python3-pyqt5.qtopengl,
>  python3-pyqt5,
>  libgl1-mesa-glx,
> Features: test-name=no-opencl
> Restrictions: allow-stderr, skip-not-installable
> 
> 
> Test-Command: rocm-test-launcher debian/tests/opencl
> Architecture: amd64 arm64 armel armhf i386
> Depends:
>  bitshuffle,
>  clinfo,
>  rocminfo,
>  libnuma1,
>  ocl-icd-libopencl1,
>  rocm-opencl-icd,
>  pkg-rocm-tools,
>  python3-all,
>  python3-pyfai,
>  python3-tk,
>  xauth,
>  xvfb,
>  libclang-common-17-dev,
>  hipcc,
>  rocm-device-libs-17,
> Features: test-name=opencl-rocm
> Restrictions: allow-stderr, skip-not-installable, skippable
> 
> Test-Command: debian/tests/opencl
> Architecture: amd64 arm64 armel armhf i386
> Depends:
>  bitshuffle,
>  pocl-opencl-icd,
>  clinfo,
>  python3-all,
>  python3-pyfai,
>  python3-tk,
>  xauth,
>  xvfb,
>  libclang-common-17-dev,
> Features: test-name=opencl-pocl
> Restrictions: allow-stderr, skip-not-installable
> 
> 
> Test-Command: xvfb-run -s "-screen 0 1024x768x24 -ac +extension GLX +render -noreset" sh debian/tests/gui
> Depends:
>  debhelper,
>  mesa-utils,
>  @,
>  xauth,
>  xvfb,
> Restrictions: allow-stderr
> 
> And the file : debian/tests/opencl :
> 
> #!/bin/sh -e
> 
> # Check that OpenCL isn't totally broken (note that it isn't totally working either)
> # Uses device 0 platform 0, i.e. to use a real GPU manually install its opencl-icd before running this
> # Mark the test as flaky; the important part is the CPU computation.
> 
> export PYFAI_OPENCL=True
> export PYOPENCL_COMPILER_OUTPUT=1
> 
> # skip test
> # TestAzimHalfFrelon.test_medfilt1d
> 
> cp bootstrap.py run_tests.py pyproject.toml version.py README.rst "$AUTOPKGTEST_TMP"
> 
> for py in $(py3versions -s 2>/dev/null)
> do cd "$AUTOPKGTEST_TMP"
>    echo "Testing with $py:"
>    xvfb-run -a --server-args="-screen 0 1024x768x24" $py run_tests.py -v -m --low-mem --installed
> done
> 
> The error log for ROCm part:
> 
> When the autopkgtest for rocm is launched, I get this error at the end. Where does this come from?
> 
> INFO:memProf:Time: 60.074s RAM: 0.000 Mb pyFAI.test.test_containers.TestContainer.test_rebin1d
> ======================================================================
> FAIL: testPyfaiBenchmark (pyFAI.test.test_scripts.TestScriptsHelp.testPyfaiBenchmark)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/lib/python3/dist-packages/pyFAI/test/test_scripts.py", line 105, in testPyfaiBenchmark
>     self.executeAppHelp("pyFAI-benchmark", "pyFAI.app.benchmark")
>     ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/lib/python3/dist-packages/pyFAI/test/test_scripts.py", line 86, in executeAppHelp
>     self.executeCommandLine(command_line, env)
>     ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
>   File "/usr/lib/python3/dist-packages/pyFAI/test/test_scripts.py", line 79, in executeCommandLine
>     self.assertEqual(p.returncode, 0)
>     ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
> AssertionError: 1 != 0
> ----------------------------------------------------------------------
> Ran 453 tests in 5584.067s
> FAILED (failures=1, skipped=95)
> Maximum valid workgroup size 256 on device <pyopencl.Device 'gfx1034' on 'AMD Accelerated Parallel Processing' at 0xe90bf90> 0.0 1.871411379818157e-05
> autopkgtest [18:23:38]: test opencl-rocm: -----------------------]
> autopkgtest [18:23:38]: test opencl-rocm: - - - - - - - - - - results - - - - - - - - - -
> opencl-rocm FAIL non-zero exit status 1
> 
> Thank you very much
> Clément LONGEAC
> 
> 


-- 
Jérôme Kieffer
tel +33 476 882 445

