
Bug#1120350: trixie-pu: package rocm-hipamd/5.7.1-6+deb13u1



Package: release.debian.org
Severity: normal
Tags: trixie
X-Debbugs-Cc: rocm-hipamd@packages.debian.org,cgmb@slerp.xyz
Control: affects -1 + src:rocm-hipamd
User: release.debian.org@packages.debian.org
Usertags: pu

[ Reason ]

This update fixes #1116585, which can cause build failures
(multiple-definition errors at link time) in software that uses bfloat16
functionality on AMD GPUs. Such use is quite common in AI software, for
example.

The issue was discovered when newer versions of the reverse dependency
src:ggml, which make use of this functionality, failed to build from
source (FTBFS).

[ Impact ]

Current versions of ggml cannot be built on trixie.

However, that is just one specific example that we discovered. This
could affect any other software making use of bfloat16.

[ Tests ]

rocm-hipamd itself does not ship tests, but I have tested this fix first
with llama.cpp and ggml in unstable, and then with HEAD versions of
llama.cpp and ggml on trixie.

The compile-time issue is resolved, and I did not encounter any adverse
effects at run-time.

[ Risks ]

This change is minimal. It only adds a patch that marks six functions
in amd_hip_bf16.h as inline.
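
For illustration, here is a minimal sketch of why the inline keyword
matters for functions defined in a header. It is plain C++ rather than
HIP, and the names bf16, float_to_bf16 and bf16_to_float are hypothetical
stand-ins, not the real API. The conversion simply truncates the low
16 bits, which is a simplification and not necessarily what the HIP
header does.

```cpp
// Contents of a hypothetical header, e.g. "bf16_sketch.hpp".
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for __hip_bfloat16: the upper 16 bits of a float.
struct bf16 {
    std::uint16_t bits;
};

// If `inline` were omitted and this header were included from two .cpp
// files, each object file would export its own definition of this
// function, and the link step would typically fail with a
// multiple-definition error. `inline` permits identical definitions in
// several translation units.
inline bf16 float_to_bf16(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);  // read the float's bit pattern
    // Simplification for this sketch: truncate instead of rounding.
    return bf16{static_cast<std::uint16_t>(u >> 16)};
}

inline float bf16_to_float(bf16 h) {
    // Place the stored bits back into the upper half of a 32-bit word.
    std::uint32_t u = static_cast<std::uint32_t>(h.bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}
```

Values that fit exactly in bfloat16, such as 1.0f or -2.5f, survive a
round trip through these two helpers unchanged.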

Upstream has improved upon this more extensively in newer versions, but
those changes would be too significant for a stable update.

[ Checklist ]
  [x] *all* changes are documented in the d/changelog
  [x] I reviewed all changes and I approve them
  [x] attach debdiff against the package in (old)stable
  [x] the issue is verified as fixed in unstable

[ Changes ]

1. Add a patch fixing #1116585, which caused build errors in other
   software.
2. Fix minor documentation bugs in the hipcc and roc-obj-ls manpages.

[ Other info ]

I've attached the full debdiff, but also the patch in question, to
highlight the only non-documentation change.

Best,
Christian
From: Cordell Bloor <cgmb@debian.org>
Date: Sat, 4 Oct 2025 00:28:49 -0600
Subject: inline bf16 functions

Resolves multiple-definition errors observed when building
libraries and applications:

  ggml: https://github.com/ggml-org/llama.cpp/pull/15296
  mscclpp: https://github.com/microsoft/mscclpp/issues/349

Bug-Debian: https://bugs.debian.org/1116585
---
 hipamd/include/hip/amd_detail/amd_hip_bf16.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hipamd/include/hip/amd_detail/amd_hip_bf16.h b/hipamd/include/hip/amd_detail/amd_hip_bf16.h
index 50b8a18..fd526cf 100644
--- a/hipamd/include/hip/amd_detail/amd_hip_bf16.h
+++ b/hipamd/include/hip/amd_detail/amd_hip_bf16.h
@@ -134,7 +134,7 @@ __HOST_DEVICE__ inline float __bfloat162float(__hip_bfloat16 a) {
  * \ingroup HIP_INTRINSIC_BFLOAT16_CONV
  * \brief Converts float to bfloat16
  */
-__HOST_DEVICE__ __hip_bfloat16 __float2bfloat16(float f) {
+__HOST_DEVICE__ inline __hip_bfloat16 __float2bfloat16(float f) {
   __hip_bfloat16 ret;
   union {
     float fp32;
@@ -178,7 +178,7 @@ __HOST_DEVICE__ __hip_bfloat16 __float2bfloat16(float f) {
  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
  * \brief Converts and moves bfloat162 to float2
  */
-__HOST_DEVICE__ float2 __bfloat1622float2(const __hip_bfloat162 a) {
+__HOST_DEVICE__ inline float2 __bfloat1622float2(const __hip_bfloat162 a) {
   return float2{__bfloat162float(a.x), __bfloat162float(a.y)};
 }
 
@@ -206,7 +206,7 @@ __device__ unsigned short int __bfloat16_as_ushort(const __hip_bfloat16 h) { ret
  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
  * \brief Convert double to __hip_bfloat16
  */
-__HOST_DEVICE__ __hip_bfloat16 __double2bfloat16(const double a) {
+__HOST_DEVICE__ inline __hip_bfloat16 __double2bfloat16(const double a) {
   return __float2bfloat16((float)a);
 }
 
@@ -214,7 +214,7 @@ __HOST_DEVICE__ __hip_bfloat16 __double2bfloat16(const double a) {
  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
  * \brief Convert float2 to __hip_bfloat162
  */
-__HOST_DEVICE__ __hip_bfloat162 __float22bfloat162_rn(const float2 a) {
+__HOST_DEVICE__ inline __hip_bfloat162 __float22bfloat162_rn(const float2 a) {
   return __hip_bfloat162{__float2bfloat16(a.x), __float2bfloat16(a.y)};
 }
 
@@ -244,7 +244,7 @@ __device__ __hip_bfloat162 __high2bfloat162(const __hip_bfloat162 a) {
  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
  * \brief Converts high 16 bits of __hip_bfloat162 to float and returns the result
  */
-__HOST_DEVICE__ float __high2float(const __hip_bfloat162 a) { return __bfloat162float(a.y); }
+__HOST_DEVICE__ inline float __high2float(const __hip_bfloat162 a) { return __bfloat162float(a.y); }
 
 /**
  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
@@ -272,7 +272,7 @@ __device__ __hip_bfloat162 __low2bfloat162(const __hip_bfloat162 a) {
  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
  * \brief Converts low 16 bits of __hip_bfloat162 to float and returns the result
  */
-__HOST_DEVICE__ float __low2float(const __hip_bfloat162 a) { return __bfloat162float(a.x); }
+__HOST_DEVICE__ inline float __low2float(const __hip_bfloat162 a) { return __bfloat162float(a.x); }
 
 /**
  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
diff -Nru rocm-hipamd-5.7.1/debian/changelog rocm-hipamd-5.7.1/debian/changelog
--- rocm-hipamd-5.7.1/debian/changelog	2025-05-13 13:50:23.000000000 +0200
+++ rocm-hipamd-5.7.1/debian/changelog	2025-10-12 21:41:23.000000000 +0200
@@ -1,3 +1,15 @@
+rocm-hipamd (5.7.1-6+deb13u1) trixie; urgency=medium
+
+  [ Cordell Bloor ]
+  * Add d/p/0041-inline-bf16-functions.patch to mark functions defined
+    in amd_hip_bf16.h as inline. This change prevents multiple
+    definition errors during linking for programs that include
+    <hip/hip_bf16.h> in more than one translation unit (Closes: #1116585)
+  * Fix hipcc manpage title (Closes: #1107681)
+  * Fix spelling error in roc-obj-ls manpage
+
+ -- Christian Kastner <ckk@debian.org>  Sun, 12 Oct 2025 21:41:23 +0200
+
 rocm-hipamd (5.7.1-6) unstable; urgency=medium
 
   * Revert "tests: allow deprecation warnings via allow-stderr restriction"
diff -Nru rocm-hipamd-5.7.1/debian/hipcc.1 rocm-hipamd-5.7.1/debian/hipcc.1
--- rocm-hipamd-5.7.1/debian/hipcc.1	2025-05-13 13:24:13.000000000 +0200
+++ rocm-hipamd-5.7.1/debian/hipcc.1	2025-10-12 21:41:23.000000000 +0200
@@ -1,56 +1,62 @@
-.TH HIPCONFIG "1" "May 2025" "hipcc 5.7.31921" "User Commands"
+.TH HIPCC "1" "May 2025" "hipcc 5.7.31921" "User Commands"
 .SH NAME
 hipcc \- HIP compiler driver
 .SH SYNOPSIS
-usage: hipcc [OPTIONS]
+usage: hipcc [OPTIONS] FILENAME ...
 .SH DESCRIPTION
-The hipcc script wraps clang or nvcc, depending on the value of the
-.RB $ HIP_PLATFORM
-reported by hipconfig. When building for the AMD platform, the options given to hipcc
-are passed through to clang. The hipcc script adds number of additional options to those
-explicitly given, including "\-x hip" to force the compilation language to HIP, "\-O3"
-when no optimization level is otherwise specified, and the autodetected target GPU
-architecture when no target has been otherwise specified.
+The hipcc script wraps clang or nvcc depending on the value of the
+.RB HIP_PLATFORM
+reported by hipconfig. When building for the AMD platform, the options given to
+hipcc are forwarded to clang. The hipcc script adds additional options to the
+list of arguments being forwarded, including
+.I \-x hip
+to force the compilation language to HIP,
+.I \-O3
+when no optimization level is specified, and
+.I \-\-offload-arch
+with an autodetected GPU architecture if
+no target architecture is specified.
 .SH ENVIRONMENT
 .TP
 .B HIPCC_VERBOSE
-The 
-.RB $ HIPCC_VERBOSE
-variable can enable additional output.
-.B 0x1 \- the clang command being invoked
-.B 0x2 \- the paths to HIP components
-.B 0x3 \- the arguments hipcc was called with
+This environment variable can enable additional output. The following flags
+control the level of verbosity to be printed:
+.IP \(bu 4
+.B 0x1
+The clang command being invoked.
+.IP \(bu 4
+.B 0x2
+The paths to HIP components.
+.IP \(bu 4
+.B 0x3
+The arguments hipcc was called with.
 .TP
 .B HIPCC_COMPILE_FLAGS_APPEND
-The contents of
-.B $ HIPCC_COMPILE_FLAGS_APPEND
-are added to the end of the argument list that is passed to clang when compiling.
-.TP
-The contents of
-.B $ HIPCC_LINK_FLAGS_APPEND
-are added to the end of the argument list that is passed to clang when linking.
+The contents of this variable are added to the end of the argument list that is
+passed to clang when compiling.
+.TP
+.B HIPCC_LINK_FLAGS_APPEND
+The contents of this variable are added to the end of the argument list that is
+passed to clang when linking.
 .TP
 .B DEVICE_LIB_PATH
-The 
-.RB $ DEVICE_LIB_PATH
-variable is used to override the path to the rocm\-device\-libs.
+This variable is used to override the path to the ROCm device libraries.
 .TP
 .B HIP_LIB_PATH
-The 
-.RB $ HIP_LIB_PATH
-variable is used to override the path to the HIP Runtime library directory.
+This variable is used to override the path to the directory containing the HIP runtime library.
 .TP
 .B HIP_CLANG_HCC_COMPAT_MODE
-The 
-.RB $ HIP_CLANG_HCC_COMPAT_MODE
-variable is used to specify if clang should allow half floats in arguments and return and
-if __HIP_HCC_COMPAT_MODE__ should be defined.
+This variable is used to specify if clang should enable options that make it
+more compatible with the old hcc compiler. This includes allowing the half
+float type to be used in function signatures as arguments or return values.
+When in this mode, the
+.B __HIP_HCC_COMPAT_MODE__
+preprocessor macro will also be defined.
 .TP
 .B HIP_CLANG_LAUNCHER
-The 
-.RB $ HIP_CLANG_LAUNCHER
-variable specifies a path to an executable to use for launching HIP clang.
-This can be useful for making use of a compiler caching tool.
+This environment variable specifies a path to an executable to use for
+launching HIP clang, which can be useful for making use of a compiler caching
+tool.
 .SH SEE ALSO
 .sp
 \fBhipconfig(1)\fP
diff -Nru rocm-hipamd-5.7.1/debian/patches/0041-inline-bf16-functions.patch rocm-hipamd-5.7.1/debian/patches/0041-inline-bf16-functions.patch
--- rocm-hipamd-5.7.1/debian/patches/0041-inline-bf16-functions.patch	1970-01-01 01:00:00.000000000 +0100
+++ rocm-hipamd-5.7.1/debian/patches/0041-inline-bf16-functions.patch	2025-10-12 21:41:23.000000000 +0200
@@ -0,0 +1,73 @@
+From: Cordell Bloor <cgmb@debian.org>
+Date: Sat, 4 Oct 2025 00:28:49 -0600
+Subject: inline bf16 functions
+
+Resolves multiple-definition errors observed when building
+libraries and applications:
+
+  ggml: https://github.com/ggml-org/llama.cpp/pull/15296
+  mscclpp: https://github.com/microsoft/mscclpp/issues/349
+
+Bug-Debian: https://bugs.debian.org/1116585
+---
+ hipamd/include/hip/amd_detail/amd_hip_bf16.h | 12 ++++++------
+ 1 file changed, 6 insertions(+), 6 deletions(-)
+
+diff --git a/hipamd/include/hip/amd_detail/amd_hip_bf16.h b/hipamd/include/hip/amd_detail/amd_hip_bf16.h
+index 50b8a18..fd526cf 100644
+--- a/hipamd/include/hip/amd_detail/amd_hip_bf16.h
++++ b/hipamd/include/hip/amd_detail/amd_hip_bf16.h
+@@ -134,7 +134,7 @@ __HOST_DEVICE__ inline float __bfloat162float(__hip_bfloat16 a) {
+  * \ingroup HIP_INTRINSIC_BFLOAT16_CONV
+  * \brief Converts float to bfloat16
+  */
+-__HOST_DEVICE__ __hip_bfloat16 __float2bfloat16(float f) {
++__HOST_DEVICE__ inline __hip_bfloat16 __float2bfloat16(float f) {
+   __hip_bfloat16 ret;
+   union {
+     float fp32;
+@@ -178,7 +178,7 @@ __HOST_DEVICE__ __hip_bfloat16 __float2bfloat16(float f) {
+  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
+  * \brief Converts and moves bfloat162 to float2
+  */
+-__HOST_DEVICE__ float2 __bfloat1622float2(const __hip_bfloat162 a) {
++__HOST_DEVICE__ inline float2 __bfloat1622float2(const __hip_bfloat162 a) {
+   return float2{__bfloat162float(a.x), __bfloat162float(a.y)};
+ }
+ 
+@@ -206,7 +206,7 @@ __device__ unsigned short int __bfloat16_as_ushort(const __hip_bfloat16 h) { ret
+  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
+  * \brief Convert double to __hip_bfloat16
+  */
+-__HOST_DEVICE__ __hip_bfloat16 __double2bfloat16(const double a) {
++__HOST_DEVICE__ inline __hip_bfloat16 __double2bfloat16(const double a) {
+   return __float2bfloat16((float)a);
+ }
+ 
+@@ -214,7 +214,7 @@ __HOST_DEVICE__ __hip_bfloat16 __double2bfloat16(const double a) {
+  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
+  * \brief Convert float2 to __hip_bfloat162
+  */
+-__HOST_DEVICE__ __hip_bfloat162 __float22bfloat162_rn(const float2 a) {
++__HOST_DEVICE__ inline __hip_bfloat162 __float22bfloat162_rn(const float2 a) {
+   return __hip_bfloat162{__float2bfloat16(a.x), __float2bfloat16(a.y)};
+ }
+ 
+@@ -244,7 +244,7 @@ __device__ __hip_bfloat162 __high2bfloat162(const __hip_bfloat162 a) {
+  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
+  * \brief Converts high 16 bits of __hip_bfloat162 to float and returns the result
+  */
+-__HOST_DEVICE__ float __high2float(const __hip_bfloat162 a) { return __bfloat162float(a.y); }
++__HOST_DEVICE__ inline float __high2float(const __hip_bfloat162 a) { return __bfloat162float(a.y); }
+ 
+ /**
+  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
+@@ -272,7 +272,7 @@ __device__ __hip_bfloat162 __low2bfloat162(const __hip_bfloat162 a) {
+  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
+  * \brief Converts low 16 bits of __hip_bfloat162 to float and returns the result
+  */
+-__HOST_DEVICE__ float __low2float(const __hip_bfloat162 a) { return __bfloat162float(a.x); }
++__HOST_DEVICE__ inline float __low2float(const __hip_bfloat162 a) { return __bfloat162float(a.x); }
+ 
+ /**
+  * \ingroup HIP_INTRINSIC_BFLOAT162_CONV
diff -Nru rocm-hipamd-5.7.1/debian/patches/series rocm-hipamd-5.7.1/debian/patches/series
--- rocm-hipamd-5.7.1/debian/patches/series	2025-05-13 12:09:20.000000000 +0200
+++ rocm-hipamd-5.7.1/debian/patches/series	2025-10-12 21:41:23.000000000 +0200
@@ -20,3 +20,4 @@
 0038-fix-FindHIP-search-for-HIP_CLANG_PATH.patch
 0039-fix-roc-obj-help.patch
 0040-self-sufficient-bf16-header.patch
+0041-inline-bf16-functions.patch
diff -Nru rocm-hipamd-5.7.1/debian/roc-obj-ls.1 rocm-hipamd-5.7.1/debian/roc-obj-ls.1
--- rocm-hipamd-5.7.1/debian/roc-obj-ls.1	2025-05-13 12:15:39.000000000 +0200
+++ rocm-hipamd-5.7.1/debian/roc-obj-ls.1	2025-10-12 21:41:23.000000000 +0200
@@ -5,7 +5,7 @@
 .B roc-obj-ls
 [\fI\,-v|h\/\fR] \fI\,executable\/\fR...
 .SH DESCRIPTION
-List the URIs of the code objects embedded in the specfied host executables.
+List the URIs of the code objects embedded in the specified host executables.
 .SH OPTIONS
 \fB\-v\fR      Verbose output (includes Entry ID)
 .TP
