
Re: HIP on the NVIDIA platform



Hi Cordell,

On Thu, 2022-10-27 at 16:10 -0600, Cordell Bloor wrote:
> 
> Thanks for your perspective. That was very informative.
> 
> I did a brief review of the source and I believe it's likely that
> they 
> are ABI compatible. However, ABI compatibility between the AMD and 
> NVIDIA platform variants is not guaranteed by the upstream project.

I think there is a typo somewhere. Did you mean that they are
compatible in terms of API? Since they use different underlying
structures, they end up with different ABIs after compilation.
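
To illustrate what I mean, here is a minimal sketch in C. All type and
function names are made up for illustration and are not taken from HIP
or CUDA. The same source-level API (portable_props_t and
portable_get_device_id) compiles on both hypothetical platforms, but the
typedef resolves to structs with different layouts, so the compiled
binaries are not interchangeable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vendor A type: the id comes first. */
struct vendor_a_props {
    int  device_id;
    char name[64];
};

/* Hypothetical vendor B type: same fields, different order and size. */
struct vendor_b_props {
    char name[256];
    int  device_id;
};

/* The portable API name is only a typedef for the native type. */
#ifdef USE_VENDOR_B
typedef struct vendor_b_props portable_props_t;
#else
typedef struct vendor_a_props portable_props_t;
#endif

/* Identical source on both platforms: this is the API. */
int portable_get_device_id(const portable_props_t *p)
{
    return p->device_id;
}

/* The field sits at a different offset: this is the ABI difference. */
static_assert(offsetof(struct vendor_a_props, device_id) !=
              offsetof(struct vendor_b_props, device_id),
              "the two hypothetical layouts are expected to differ");

/* Returns 1 if size or field offset differs between the variants. */
int layouts_differ(void)
{
    return sizeof(struct vendor_a_props) != sizeof(struct vendor_b_props)
        || offsetof(struct vendor_a_props, device_id)
           != offsetof(struct vendor_b_props, device_id);
}
```

A caller built against one variant would read device_id from the wrong
offset if it were handed a library built against the other one, even
though neither side needed a source change.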

> The problem is that on the NVIDIA platform, the HIP types used in the
> API are typedefs for NVIDIA types. If ABI compatibility were
> guaranteed, 
> then the other platform variants would be constrained to follow
> NVIDIA's 
> ABI and any changes they make to it. That could be difficult with two
> platform variants, but it might become impossible if there were three
> or 
> more. So, it's not clear to me if ABI compatibility can realistically
> be 
> guaranteed between platforms while using native CUDA types on NVIDIA.
> 
> One alternative being considered upstream is providing a header-only 
> implementation for the NVIDIA platform. I was initially concerned
> about 
> that approach because it would preclude the use of those libraries 
> through FFI on the NVIDIA platform. Nevertheless, I'm starting to
> come 
> around to the idea.

This reminds me of one of Intel's libraries, MKL. It supports
various environment variables, including MKL_THREADING_LAYER, which
can be used to switch the threading layer among GOMP (GNU's OpenMP),
IOMP (Intel/LLVM's OpenMP), and TBB (Intel's oneTBB). All of them seem
to be called through FFI.
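
Roughly, the pattern looks like the sketch below. This is not MKL's
actual code: the environment variable DEMO_BACKEND, the backend names,
and the functions are all hypothetical, and a real library would
presumably load each backend at run time (for example with dlopen)
instead of using a static table. It only shows the idea of selecting an
implementation behind a fixed set of function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One entry per backend; every name here is hypothetical. */
struct backend {
    const char *name;
    int (*sum)(const int *values, int count);
};

/* Stand-in for a "free" backend implementation. */
static int sum_forward(const int *values, int count)
{
    assert(values != NULL || count == 0);
    int total = 0;
    for (int i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Stand-in for a "nonfree" backend: same result, different code path. */
static int sum_backward(const int *values, int count)
{
    assert(values != NULL || count == 0);
    int total = 0;
    for (int i = count - 1; i >= 0; i--)
        total += values[i];
    return total;
}

static const struct backend backends[] = {
    { "free",    sum_forward  },
    { "nonfree", sum_backward },
};

/* Pick a backend by name; fall back to the first (free) one. */
const struct backend *select_backend(const char *requested)
{
    size_t n = sizeof backends / sizeof backends[0];
    if (requested != NULL) {
        for (size_t i = 0; i < n; i++) {
            if (strcmp(requested, backends[i].name) == 0)
                return &backends[i];
        }
    }
    return &backends[0];
}

/* Read the (hypothetical) DEMO_BACKEND variable, as MKL does with its own. */
const struct backend *select_backend_from_env(void)
{
    return select_backend(getenv("DEMO_BACKEND"));
}
```

The caller only ever sees struct backend, so nothing in the calling
package depends on which implementation was picked, and an unset or
unknown value falls back to the first entry in the table.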

A potential advantage of this approach: suppose we have two
acceleration backends, one non-free and the other free. Then the
package itself can enter the main section, because
(1) it does not rely on non-free software to be functional;
(2) it is not linked against a non-free library.

But indeed, writing a library in this manner is difficult.

> It's a difficult problem. :(
> 
> Sincerely,
> Cory Bloor
> 


