
Re: status of hipcc? and pytorch-rocm



On Sat, 2023-03-04 at 09:00 -0700, Cordell Bloor wrote:
> 
> 
> Yes, hipcc is usable now. If you set CXX=hipcc, your HIP code should 
> build fine.
> 

Awesome!

> > And are there any standard rocm/hip libraries still missing from the archive?
> 
> Yes. The remaining libraries needed for PyTorch are:
> 
> 1. hipfft
> Upstream: https://github.com/ROCmSoftwarePlatform/hipFFT
> Salsa: https://salsa.debian.org/rocm-team/hipfft
> Status: Waiting for rocfft to pass through NEW.
> 
> 2. rocblas
> Upstream: https://github.com/ROCmSoftwarePlatform/rocBLAS
> Salsa: https://salsa.debian.org/rocm-team/rocblas
> Status: Packaging in progress. Significant work remains. It will require 
> patches to conform to Debian guidelines.
> 
> 3. rccl
> Upstream: https://github.com/ROCmSoftwarePlatform/rccl
> Salsa: https://salsa.debian.org/rocm-team/rccl
> Status: Packaging in progress. Needs a d/copyright file and then it can 
> be submitted to NEW.
> 
> 4. miopen
> Upstream: https://github.com/ROCmSoftwarePlatform/MIOpen
> Status: Not started. Blocked by rocblas. I know almost nothing about 
> this library, and I expect it will be quite complex.

The CUDA counterpart of MIOpen is cuDNN (packaged in Debian as
nvidia-cudnn, which I did with a downloader script). You can think of
it as a library of computational kernels designed specifically for
neural networks, a kind of highly specialized BLAS. The kernels
provided by cuBLAS/rocBLAS, by contrast, are useful for a much wider
range of applications.
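
To make the BLAS comparison a bit more concrete, here is a minimal
sketch in plain Python. It is not how MIOpen or cuDNN are implemented;
the function names are made up for illustration, and real libraries
pick among many tuned algorithms. It only shows one well-known
approach (im2col), where a neural-network convolution is lowered to an
ordinary matrix multiply of the kind a BLAS provides:

```python
def matmul(a, b):
    """Plain matrix multiply: the kind of general kernel a BLAS provides."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def conv2d_direct(image, kernel):
    """Direct 2D convolution (cross-correlation, as neural networks use it)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(ow)]
            for i in range(oh)]


def conv2d_via_matmul(image, kernel):
    """The same convolution, lowered to a single matrix multiply (im2col)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    # Each row of `patches` is one flattened kh x kw window of the image.
    patches = [[image[i + di][j + dj] for di in range(kh) for dj in range(kw)]
               for i in range(oh) for j in range(ow)]
    # The kernel is flattened into a single column vector.
    column = [[kernel[di][dj]] for di in range(kh) for dj in range(kw)]
    flat = matmul(patches, column)  # shape: (oh * ow) x 1
    # Reshape the flat result back into an oh x ow output.
    return [[flat[i * ow + j][0] for j in range(ow)] for i in range(oh)]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, -1]]
print(conv2d_direct(image, kernel))
print(conv2d_via_matmul(image, kernel))
```

Both calls give the same result. The specialized library's job is to
do this sort of thing (and the other layer types) much faster than a
naive lowering would.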

> 5. roctracer
> Upstream: https://github.com/ROCm-Developer-Tools/roctracer
> Salsa: https://salsa.debian.org/rocm-team/roctracer
> Status: Not started. Not blocked by anything. I know almost nothing 
> about this library.

Looks like a library for performance profiling.

> 
> For more details, see "The Road to PyTorch and Tensorflow on ROCm" [1].

This is a very detailed reference, thanks!
BTW, pytorch is marked as 4 stars for packaging difficulty, which is the same
as tensorflow.

However, in the context of Debian, tensorflow is harder to package than
pytorch, because bazel is the only build system tensorflow supports.
Tensorflow should be at least 5 stars.

> > If things are almost ready, I think I can try to compile the pytorch-rocm
> > variant soon. It will be a good reverse dependency for testing its functionality.
> 
> My estimate would be that it will take several months for all 
> dependencies to be fulfilled. For the moment, I've shifted my focus to 
> filing and fixing bugs in the packages that are going into Bookworm.

OK. I'll continue sorting out the remaining issues from the pytorch side.

> Sincerely,
> Cory Bloor
> 
> [1]: https://lists.debian.org/debian-ai/2022/09/msg00029.html
> 


