
Re: ROCm Enabled gloo



The gloo-rocm packaging is only a template and does not work yet.
You can build from source, but the build will fall back to the CPU-only version.
The CMake configuration issues below would be a good place to start.

-------------------------------------------------------------------------

CMake Warning at cmake/Hip.cmake:22 (find_package):
  Could not find a configuration file for package "HIP" that is compatible
  with requested version "1.0".
 
  The following configuration files were considered but not accepted:
 
    /usr/lib/aarch64-linux-gnu/cmake/hip/hip-config.cmake, version: 5.7.0
    /lib/aarch64-linux-gnu/cmake/hip/hip-config.cmake, version: 5.7.0
 
Call Stack (most recent call first):
  cmake/Dependencies.cmake:140 (include)
  CMakeLists.txt:111 (include)
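
The warning above shows that the package was requested as version "1.0", while the only installed configuration files report 5.7.0 and are rejected. This is presumably because the HIP version file only accepts a request with the same major version. A minimal sketch of one possible workaround follows. It assumes the call at cmake/Hip.cmake:22 has the form find_package(HIP 1.0), which is inferred from the warning text and not verified against the file. It only addresses this warning; the rest of the build logic may need further changes.

```cmake
# Assumption: cmake/Hip.cmake:22 currently reads `find_package(HIP 1.0)`.
# Dropping the version argument lets CMake accept whichever HIP
# configuration file is installed (5.7.0 in the log above).
find_package(HIP)

if(HIP_FOUND)
  # In config mode CMake sets <PackageName>_VERSION from the version file.
  message(STATUS "Found HIP ${HIP_VERSION}")
else()
  message(WARNING "HIP not found; gloo would be built without ROCm support")
endif()
```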


On Sat, 2025-06-28 at 20:08 +0000, Utkarsh Raj wrote:
> Hi Spaarsh,
> 
> On Sunday, June 29th, 2025 at 12:57 AM, Spaarsh Thakkar <spaarshthakkar11010@gmail.com> wrote:
> > I spent a little time over the package and have made some progress. The package already has rules[3] and control files for building the package with ROCm (they also have the same for CUDA) but they had been kept in a
> > separate file named control.rocm[4]. I moved the package names from the control.rocm file to the main control file and the package now builds with libgloo-rocm* binaries. They already have the corresponding
> > *.install[5][6] files in place too. I have not had the opportunity to test this new package though.
> 
> I believe "d/control.rocm" and "d/control" are intended to remain separate. Both files list packages that have libraries with identical names and merging them may cause unintended overwrites during the build process. I
> learned this while building for Kokkos. There must be a script named "rocmbuild.sh", which is to be executed before building for ROCm. This script replaces the contents of the main control, copyright files with those of
> control.rocm, copyright.rocm and so on. Hope this helps!
> 
> Sincerely,
> Utkarsh Raj
> On Sunday, June 29th, 2025 at 12:57 AM, Spaarsh Thakkar <spaarshthakkar11010@gmail.com> wrote:
> >
> > Greetings to the community!
> > 
> > As part of my GSoC'25[1] work under the mentorship of Cordell Bloor (cc'd), I plan to enable ROCm for gloo[2]. I would like to know if anyone else is also working on the same. If that is the case, then I hope that the
> > following information is useful.
> > 
> > I spent a little time over the package and have made some progress. The package already has rules[3] and control files for building the package with ROCm (they also have the same for CUDA) but they had been kept in a
> > separate file named control.rocm[4]. I moved the package names from the control.rocm file to the main control file and the package now builds with libgloo-rocm* binaries. They already have the corresponding
> > *.install[5][6] files in place too. I have not had the opportunity to test this new package though.
> > 
> > It must be noted that the two dependencies that ROCm enabled gloo needs are hipcc and librccl-dev[7] [8]. The latter is only on the unstable and trixie (testing) branches[9] right now (which explains why this wasn't done
> > earlier despite having the necessary rules and control files in place).
> > 
> > I have already made the changes and pushed them to my gloo fork[10] but I haven't made an MR yet.
> > 
> > Regards,
> > Spaarsh Thakkar
> > 
> > [1]: https://lists.debian.org/debian-ai/2025/05/msg00042.html
> > [2]: https://salsa.debian.org/deeplearning-team/gloo
> > [3]: https://salsa.debian.org/deeplearning-team/gloo/-/blob/master/debian/rules?ref_type=heads#L33-49
> > [4]: https://salsa.debian.org/deeplearning-team/gloo/-/blob/master/debian/control.rocm
> > [5]: https://salsa.debian.org/deeplearning-team/gloo/-/blob/master/debian/libgloo-rocm-0.install
> > [6]: https://salsa.debian.org/deeplearning-team/gloo/-/blob/master/debian/libgloo-rocm-dev.install
> > [7]: https://packages.debian.org/sid/librccl-dev
> > [8]: https://packages.debian.org/trixie/librccl-dev
> > [9]: https://packages.debian.org/search?keywords=librccl-dev
> > [10]: https://salsa.debian.org/Spaarsh/gloo
> >  
> 
>  

