Hi Cory,

Cordell Bloor, on 2022-06-14:
> On 2022-06-14 14:41, Étienne Mollier wrote:
> > I'm at a point where I manage to build rocrand packages, but I
> > still struggle against googletest at the moment (main issue is
> > to make the build procedure recognize the locally installed one
> > in /usr/src/googletest, otherwise my build chokes on attempting
> > Internet access).  I thought I was missing GTEST_ROOT setting,
> > but this is not sufficient; I will continue tomorrow.
>
> You don't need to set any flags or variables. You've just installed the
> wrong package. The one you want is libgtest-dev.

Ah, silly me, thanks for the tip, this triggered the build of
the test suite and I could move forward.  :)

> I would also recommend adding --no-parallel to the dh_auto_test arguments.
> It's not really that important for rocrand, but I would just do that by
> default for any library that uses the GPU. It might take a little bit
> longer, but it will result in more reliable tests.

Done on rocm-hipamd and rocrand to make sure several tests are
not colliding at once on a single GPU.  Overall this should make
things more stable, I agree.

After triggering the test suite of rocrand, I see most tests
failing with error messages like the following, e.g. test 23:

	23: Test command: /<<PKGBUILDDIR>>/obj-x86_64-linux-gnu/test/test_rocrand_xorwow_prng
	23: Test timeout computed to be: 10000000
	23: Running main() from ./googletest/src/gtest_main.cc
	23: [==========] Running 8 tests from 1 test suite.
	23: [----------] Global test environment set-up.
	23: [----------] 8 tests from rocrand_xorwow_prng_tests
	23: [ RUN      ] rocrand_xorwow_prng_tests.init_test
	23: LoadLib(libhsa-amd-aqlprofile64.so) failed: libhsa-amd-aqlprofile64.so: cannot open shared object file: No such file or directory
	23: "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!
	"
	23/29 Test #23: test_rocrand_xorwow_prng ............Subprocess aborted***Exception: 0.09 sec

From past discussion, I understood that the failure to load
libhsa-amd-aqlprofile64.so should be benign (or should otherwise
be skipped), so I believe the issue mainly results from the
hipErrorNoBinaryForGpu.  Full list of failing tests at this time:

	 4 - test_rocrand_basic (Subprocess aborted)
	 5 - test_rocrand_cpp_wrapper (Subprocess aborted)
	 6 - test_rocrand_generate (Subprocess aborted)
	 7 - test_rocrand_generate_log_normal (Subprocess aborted)
	 8 - test_rocrand_generate_normal (Subprocess aborted)
	 9 - test_rocrand_generate_poisson (Subprocess aborted)
	10 - test_rocrand_generate_uniform (Subprocess aborted)
	11 - test_rocrand_generator_type (Subprocess aborted)
	12 - test_rocrand_kernel_mrg32k3a (Subprocess aborted)
	13 - test_rocrand_kernel_mtgp32 (Subprocess aborted)
	14 - test_rocrand_kernel_philox4x32_10 (Subprocess aborted)
	15 - test_rocrand_kernel_sobol32 (Subprocess aborted)
	16 - test_rocrand_kernel_sobol64 (Subprocess aborted)
	17 - test_rocrand_kernel_xorwow (Subprocess aborted)
	18 - test_rocrand_mrg32k3a_prng (Subprocess aborted)
	19 - test_rocrand_mtgp32_prng (Subprocess aborted)
	20 - test_rocrand_philox_prng (Subprocess aborted)
	21 - test_rocrand_sobol32_qrng (Subprocess aborted)
	22 - test_rocrand_sobol64_qrng (Subprocess aborted)
	23 - test_rocrand_xorwow_prng (Subprocess aborted)
	25 - test_hiprand_api (Subprocess aborted)
	26 - test_hiprand_cpp_wrapper (Subprocess aborted)
	27 - test_hiprand_kernel (Subprocess aborted)

I tried various things in rocrand and rocm-hipamd to attempt to
enable the gfx803 architecture, as I was under the impression
that the existing packaging was mainly targeting gfx906, but
subsequent build attempts failed with:

	hipcc-cmd: /usr/bin/clang++-14 -std=c++11 -isystem "/usr/lib/llvm-14/lib/clang/14.0.5/include/.."
	-Xclang -fallow-half-arguments-and-returns -D__HIP_HCC_COMPAT_MODE__=1 -isystem /usr/hsa/include --offload-arch='gfx803:xnack-' -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false --hip-device-lib-path="/usr/lib/x86_64-linux-gnu/amdgcn/bitcode" -fhip-new-launch-api '--hip-version=5.0.0' -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -specs=/usr/share/dpkg/no-pie-compile.specs -Wdate-time -o "CMakeFiles/cmTC_db3da.dir/testCXXCompiler.cxx.o" -c -x hip /<<PKGBUILDDIR>>/obj-x86_64-linux-gnu/CMakeFiles/CMakeTmp/testCXXCompiler.cxx
	clang: error: invalid target ID 'gfx803:xnack-'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')

which sounds rather odd, since the value passed to the
--offload-arch argument looks like it matches the format
described in the error message.  But I believe that is a dead
end.

I also tried adjusting rocr-runtime, but rolled back my changes
regarding aqlprofile as they had no effect.  I suppose I must
have missed something somewhere, but that's my status for the
moment, in case you have an idea.

Have a nice day,  :)
-- 
Étienne Mollier <emollier@emlwks999.eu>
Fingerprint:  8f91 b227 c7d6 f2b1 948c  8236 793c f67e 8f0d 11da
Sent from /dev/pts/3, please excuse my verbosity.
On air: Orion Dust - CXXVI