Re: Question about how to handle HIP vs hipamd



Hi Cory,

> On May 18, 2022 3:06:56 a.m. EDT, Cordell Bloor <cgmb-deb@slerp.xyz> wrote:
> > I ran the tests on my Radeon VII workstation with Debian in docker on an
> > Ubuntu 20.04 host and the AMDGPU kernel module. My list of failures is much
> > shorter. In the next few weeks, I can set up a native install of Debian and
> > test more thoroughly.

Thanks for running these tests already.  I believe it is useful
to get an idea of the differences between the amdgpu driver
shipped in AMD's amdgpu-dkms (IIUC) and the one provided by the
Linux kernel in Debian.  As with llvm, it would be preferable to
avoid duplicating the driver if Linux already provides it.

> > 98% tests passed, 10 tests failed out of 408
> > 
> > Total Test time (real) = 1501.40 sec
> > 
> > The following tests did not run:
> >      96 - directed_tests/g++/hipMalloc_cxx_amd.tst (Skipped)
> > 
> > The following tests FAILED:
> >      99 - directed_tests/hiprtc/hiprtcGetLoweredName.tst (SEGFAULT)
> >     100 - directed_tests/hiprtc/saxpy.tst (SEGFAULT)
> >     101 - directed_tests/ipc/hipMultiProcIpcEvent.tst (Timeout)
> >     102 - directed_tests/ipc/hipMultiProcIpcMem.tst (Timeout)
> >     196 - directed_tests/runtimeApi/memory/hipHostRegister.tst (Subprocess aborted)
> >     213 - directed_tests/runtimeApi/memory/hipMemcpy-dev-offsets.tst (Subprocess aborted)
> >     214 - directed_tests/runtimeApi/memory/hipMemcpy-host-offsets.tst (Subprocess aborted)
> >     290 - directed_tests/runtimeApi/module/hipExtLaunchKernelGGL_KernelExeTime.tst (Subprocess aborted)
> >     294 - directed_tests/runtimeApi/module/hipExtModuleLaunchKernel_KernelExecutionTime.tst (Subprocess aborted)
> >     391 - directed_tests/runtimeApi/stream/hipStreamCreateWithPriority.tst (Failed)
> > 
> > With that being said, I'm most interested in whether rocm-hipamd can build
> > the math libraries and run their test suites. Passing those would be proof
> > to me that all the basic HIP features are working sufficiently to do useful
> > work.

Ack, I suppose the rocm-hipamd packaging needs to be made
suitable for inclusion in the archive soon, so as to ease the
testing of the subsequent math libraries.  I think I'm half-way
through the copyright review.  On the test suite side, I'll keep
building it, but running it does not seem too relevant on
CPU-only machines, and when a GPU is available, it may break the
testbed.
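
To compare failure lists between setups (amdgpu-dkms versus the
Debian kernel driver, for instance), the ctest summary quoted
above can be grouped by outcome.  The following is only a rough
sketch: the function name group_failures, the regular expression
and the shortened SAMPLE text are illustrative assumptions, not
part of ROCm or of the packaging.

```python
import re
from collections import defaultdict

# Matches ctest summary lines such as:
#     99 - directed_tests/hiprtc/saxpy.tst (SEGFAULT)
# Leading mail quote markers ("> > ") are tolerated.
LINE = re.compile(r"^[>\s]*(\d+)\s+-\s+(\S+)\s+\((.+)\)\s*$")


def group_failures(summary):
    """Group ctest summary lines by the outcome given in parentheses."""
    groups = defaultdict(list)
    for line in summary.splitlines():
        match = LINE.match(line)
        if match:
            number, name, outcome = match.groups()
            groups[outcome].append((int(number), name))
    return dict(groups)


# Shortened excerpt of the list quoted above (assumption: same layout
# as the full ctest output).
SAMPLE = """\
     99 - directed_tests/hiprtc/hiprtcGetLoweredName.tst (SEGFAULT)
    100 - directed_tests/hiprtc/saxpy.tst (SEGFAULT)
    101 - directed_tests/ipc/hipMultiProcIpcEvent.tst (Timeout)
    196 - directed_tests/runtimeApi/memory/hipHostRegister.tst (Subprocess aborted)
    391 - directed_tests/runtimeApi/stream/hipStreamCreateWithPriority.tst (Failed)
"""

if __name__ == "__main__":
    for outcome, tests in group_failures(SAMPLE).items():
        print(outcome, [number for number, _ in tests])
```

Feeding it the summaries from two different hosts would make it
easier to see which outcomes are specific to one driver.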

> > On 2022-05-17 14:33, Étienne Mollier wrote:
> > > The two segmentation faults in tests #99
> > > and #100 might be of concern, I caught the following around
> > > these, so maybe a library to get packaged in Debian too:
> > > 
> > > 	99: LoadLib(libhsa-amd-aqlprofile64.so) failed: libhsa-amd-aqlprofile64.so: cannot open shared object file: No such file or directory
> > 
> > That library adds HSA extensions for performance profiling. Unfortunately,
> > it is proprietary. The last thing I heard about it was from John Bridgman
> > [1].

Jeremy Newton, on 2022-05-18:
> Yes aqlprofile is nonfree. They're working on open sourcing it, with no set timeline.
> 
> Last I heard (a few months ago), there's a few headers that need relicensing, but there's some sort of patent or copyright concern that they're trying to work through.
> 
> I suggest skipping or ignoring any tests around aqlprofile for now.

Thank you both for the details.  If this component is optional
and also proprietary, then I'll keep that item low in my
priorities for the moment.

> > [1]: https://www.phoronix.com/forums/forum/linux-graphics-x-org-drivers/open-source-amd-linux/1003883-gpuvm-discrete-gpu-code-for-amdkfd-radeon-compute-could-be-ready-for-linux-4-17?p=1004038#post1004038
> > [2]: https://bugs.gentoo.org/716948
> > [3]: https://gitweb.gentoo.org/repo/gentoo.git/tree/dev-libs/rocr-runtime/files/rocr-runtime-4.3.0_no-sqlprofiler.patch
> > [4]: https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/hip/package.py

Note that I will be in Hamburg in the upcoming week, and may
take longer to answer if I'm not focusing on ROCm; I don't expect
to have access to AMD GPU hardware during that time anyway.

Have a nice day,  :)
-- 
Étienne Mollier <emollier@emlwks999.eu>
Fingerprint:  8f91 b227 c7d6 f2b1 948c  8236 793c f67e 8f0d 11da
Sent from /dev/tty1, please excuse my verbosity.
On air: Piotr "JazzCat" Pacyna - Metropolice
