
Re: Building llama.cpp for AMD GPU using only Debian packages?



On Tue, 28 Jan 2025 at 08:25, Petter Reinholdtsen <pere@hungry.com> wrote:
>
> Here is a small update on building llama.cpp with support for (at least
> my) AMD GPU.  The cmake arguments for the latest github edition (commit
> d6d24cd9ed6d0b9558643dcc28f2124bef488c52) have changed slightly since
> the first recipe in this thread, so here is the one I used to
> successfully build
>
>   HIPCXX=clang-17 cmake -H. -Bbuild -DCMAKE_BUILD_TYPE=Release \
>     -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1100  && \
>     make -C build -j32
>
> I picked the cmake arguments from docs/build.md, but the ones listed
> there had to be modified to use clang-17 instead of clang (ie v19), and
> I did not need to specify HIP_PATH.
>
> --
> Happy hacking
> Petter Reinholdtsen

I am not sure this is actually needed anymore.

In the past I found Vulkan on RDNA1 (5600XT) was significantly slower
than ROCm, like half the text generation speed.

I was recently advised to try Vulkan again. I found RDNA1 (5600XT)
Vulkan runs the same speed as ROCm, and RDNA2 (6600XT) Vulkan runs
faster than ROCm by about 10%!

AMD's marketing for RDNA3 (7900XTX) also uses Vulkan to spruik
performance gains over competing cards running CUDA. The XTX is even
officially supported in ROCm, so surely if ROCm were faster they'd use
that result instead.

It seems that, at least for RDNA1 and newer, we're better off using
Vulkan with llama.cpp now.

I wonder if you can reproduce these results? I tested with Q8_0 and
Q6_K_L models, all of which fit into VRAM.
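In case it helps with reproducing, this is roughly how I compare the two
backends: build llama.cpp once per backend, then run llama-bench on the
same model from each build. The model path and build directory names
below are placeholders, not what the thread used:

```shell
# Run the bundled llama-bench tool from a ROCm build and a Vulkan build
# against the same GGUF model; -ngl 99 offloads all layers to the GPU,
# so the numbers reflect pure GPU text-generation speed.
./build-rocm/bin/llama-bench   -m models/some-model-Q8_0.gguf -ngl 99
./build-vulkan/bin/llama-bench -m models/some-model-Q8_0.gguf -ngl 99
```

Comparing the reported tokens/second between the two runs should show
whether Vulkan keeps up with ROCm on your card.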

I just tested quickly with the LM Studio Vulkan runtime. I had
previously compiled llama.cpp in an Ubuntu podman container with the
official Vulkan SDK, then ran that container on Debian 12. I am not
sure of the steps to compile llama.cpp for Vulkan using only Debian
libraries.
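For what it's worth, a sketch of what a Debian-only Vulkan build might
look like, mirroring the cmake recipe earlier in the thread. The package
names are my guess at what provides the Vulkan headers, loader and the
glslc shader compiler the GGML Vulkan backend needs; I have not verified
this on Debian 12:

```shell
# Untested sketch: install the Vulkan dev packages from Debian,
# then build llama.cpp with the Vulkan backend instead of HIP.
sudo apt install build-essential cmake libvulkan-dev glslc
cmake -B build -DCMAKE_BUILD_TYPE=Release -DGGML_VULKAN=ON
make -C build -j32
```

If that works, it would answer the original question in this thread for
the Vulkan backend at least, with no ROCm packages required.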

Jamie

