Re: Building llama.cpp for AMD GPU using only Debian packages?
On Fri, 31 Jan 2025 at 19:06, Petter Reinholdtsen <pere@hungry.com> wrote:
>
> [Jamie Bainbridge]
> > I wonder if you can reproduce these results? I tried with Q8_0 and Q6_K_L
> > models which all fit into VRAM.
>
> I can at least do a test build if you provide the recipe. If you
> provide me with the apt install and cmake lines, I will give it a go. :)
The easiest way is the upstream container. This is built from Ubuntu
with the official Vulkan SDK.
This rootless podman command works for me on Bookworm with the backports
kernel; it should work for you on Trixie too (replace "/path/to/models"
with your models directory):
$ podman run --rm -it --name vulkan \
--volume /path/to/models:/models \
--device /dev/dri --user 1000:1000 --group-add keep-groups \
--publish 8000:8000 \
ghcr.io/ggerganov/llama.cpp:server-vulkan \
-m /models/Meta-Llama-3.1-8B-Instruct-Q6_K.gguf \
--port 8000 --host 0.0.0.0 --n-gpu-layers 99
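Once it's up, you can sanity-check it from the host with curl (this assumes
the usual llama-server HTTP endpoints; adjust the prompt to taste):

$ curl http://localhost:8000/health
$ curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages":[{"role":"user","content":"Say hello"}],"max_tokens":32}'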
> > I am not sure of steps to compile llama.cpp for Vulkan using only
> > Debian libraries.
>
> It is my preferred scenario, and I am running Debian 13/Trixie with
> regular updates. :)
The source of the above container is:
https://github.com/ggerganov/llama.cpp/blob/master/.devops/vulkan.Dockerfile
If you wanted to stay entirely on Debian, I think you would need to install
the Vulkan SDK from the tarball; I have not done this myself:
https://vulkan.lunarg.com/doc/sdk/1.4.304.0/linux/getting_started.html
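Going by that getting started page, it would look roughly like this
(untested sketch, version number is only an example; the tarball ships a
setup-env.sh you source before building):

$ tar xf vulkansdk-linux-x86_64-1.4.304.0.tar.xz
$ source 1.4.304.0/setup-env.sh
## VULKAN_SDK etc. are now set for this shell, then build llama.cpp as below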
However, I have had success installing the Ubuntu Jammy Vulkan SDK
packages in a Bookworm container. Even though such cross-distro usage
is frowned upon, it's just a throwaway unprivileged container so no
harm done. This works for me:
$ distrobox create --image quay.io/toolbx-images/debian-toolbox:12 \
    --name llamabox --hostname llamabox
$ distrobox enter llamabox
## you are now inside the unprivileged Bookworm container
$ sudo apt install build-essential cmake git wget
$ wget -qO- https://packages.lunarg.com/lunarg-signing-key-pub.asc | \
    sudo tee /etc/apt/trusted.gpg.d/lunarg.asc
$ sudo wget -qO /etc/apt/sources.list.d/lunarg-vulkan-jammy.list \
    http://packages.lunarg.com/vulkan/lunarg-vulkan-jammy.list
$ sudo apt update
$ sudo apt install vulkan-sdk
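## optional sanity check: vulkaninfo (from the SDK) should list your GPU
$ vulkaninfo --summary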
$ git clone https://github.com/ggerganov/llama.cpp
$ cd llama.cpp
$ mkdir build
$ cmake -B build -DGGML_NATIVE=OFF -DGGML_VULKAN=1 \
    -DBUILD_SHARED_LIBS=OFF -DCMAKE_BUILD_TYPE=Release
$ cmake --build build --config Release -j $(nproc) --target llama-server
## replace your "/path/to/" in the following command
$ ./build/bin/llama-server --host 0.0.0.0 --port 8000 \
    --model /path/to/Meta-Llama-3.1-8B-Instruct-Q6_K.gguf --n-gpu-layers 99
## when you're done, end the server with Ctrl+c and exit the
## distrobox with Ctrl+d (logout)
I hope one of these options is useful for you.
Jamie