Hi folks,
I was playing around with llama.cpp yesterday. I thought I'd
share instructions for building and running with AMD GPU
acceleration on Debian Testing and Unstable (or Ubuntu 23.10). I
believe this should work on most discrete AMD GPUs with sufficient
VRAM released between 2017 and 2022, specifically Vega, RDNA 1,
RDNA 2, CDNA 1 and CDNA 2 GPUs:
apt -y update
apt -y upgrade
apt -y install git wget hipcc libhipblas-dev librocblas-dev cmake build-essential
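Before building, it may be worth confirming that the ROCm runtime actually sees the card. The rocminfo tool (in the rocminfo package, which the list above does not pull in) prints the ISA name each GPU reports; it should match one of the gfx targets passed to cmake:

```shell
# Optional sanity check (needs: apt -y install rocminfo).
# Prints the ISA name(s) the runtime reports, e.g. gfx1030 for an RX 6800.
rocminfo | grep -Eo 'gfx[0-9a-f]+' | sort -u
```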
wget 'https://huggingface.co/TheBloke/dolphin-2.2.1-mistral-7B-GGUF/resolve/main/dolphin-2.2.1-mistral-7b.Q5_K_M.gguf?download=true' -O dolphin-2.2.1-mistral-7b.Q5_K_M.gguf
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
git checkout b2110
CC=clang-15 CXX=clang++-15 cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON \
    -DAMDGPU_TARGETS="gfx803;gfx900;gfx906;gfx908;gfx90a;gfx1010;gfx1030" \
    -DCMAKE_BUILD_TYPE=Release
make -j16 -C build
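If your card's ISA isn't in the target list above (for example, some RDNA 2 laptop and small desktop chips report gfx1031 or gfx1032), the ROCm runtime can often be told to use the nearest supported ISA via an environment variable. This is an unofficial workaround rather than part of the build, and it's only safe when the real ISA is binary-compatible with the one you claim:

```shell
# Unofficial workaround: make the ROCm runtime treat the GPU as gfx1030
# (10.3.0). Reasonable for gfx1031/gfx1032; do not use across generations.
HSA_OVERRIDE_GFX_VERSION=10.3.0 build/bin/main -ngl 32 \
    -m ../dolphin-2.2.1-mistral-7b.Q5_K_M.gguf --prompt "Once upon a time"
```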
build/bin/main -ngl 32 --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -m ../dolphin-2.2.1-mistral-7b.Q5_K_M.gguf --prompt "Once upon a time"
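For reference, -ngl is the number of model layers offloaded to the GPU (a 7B Mistral model has 32 layers, so -ngl 32 offloads the whole model), -c is the context size, and -n -1 keeps generating until stopped. The Q5_K_M weights are roughly 5 GB, so if they don't fit in your VRAM you can offload fewer layers and let the remainder run on the CPU:

```shell
# Partial offload for smaller cards: roughly half the layers on the GPU,
# the rest on the CPU. Slower, but fits in less VRAM.
build/bin/main -ngl 16 --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 \
    -m ../dolphin-2.2.1-mistral-7b.Q5_K_M.gguf --prompt "Once upon a time"
```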
I'm not familiar enough with llama.cpp to file an RFS yet. Still,
I thought this was interesting and wanted to share.
Sincerely,
Cory Bloor