
llama-cpp with AMD GPUs



Hi folks,

I was playing around with llama-cpp yesterday, and I thought I'd share instructions for building and running it with AMD GPU acceleration on Debian Testing and Unstable (or Ubuntu 23.10). I believe this should work on most discrete AMD GPUs with sufficient VRAM released between 2017 and 2022, specifically Vega, RDNA 1, RDNA 2, CDNA 1, and CDNA 2 GPUs:

apt -y update
apt -y upgrade
apt -y install git wget hipcc libhipblas-dev librocblas-dev cmake build-essential
wget https://huggingface.co/TheBloke/dolphin-2.2.1-mistral-7B-GGUF/resolve/main/dolphin-2.2.1-mistral-7b.Q5_K_M.gguf?download=true -O dolphin-2.2.1-mistral-7b.Q5_K_M.gguf
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
git checkout b2110
CC=clang-15 CXX=clang++-15 cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS="gfx803;gfx900;gfx906;gfx908;gfx90a;gfx1010;gfx1030" -DCMAKE_BUILD_TYPE=Release
make -j16 -C build
build/bin/main -ngl 32 --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -m ../dolphin-2.2.1-mistral-7b.Q5_K_M.gguf --prompt "Once upon a time"
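The AMDGPU_TARGETS list in the cmake step names LLVM gfx architectures rather than marketing names. As a rough guide, here is a hypothetical helper (written just for illustration, not part of llama-cpp) mapping the GPU families mentioned above to their usual gfx targets; on the actual machine you can verify which target your GPU reports with `rocminfo | grep gfx` (rocminfo is packaged in Debian):

```shell
#!/bin/sh
# Illustrative mapping from AMD GPU architecture family to the LLVM gfx
# target string used in AMDGPU_TARGETS. Verify with rocminfo on real hardware.
gfx_for() {
  case "$1" in
    vega10) echo gfx900 ;;   # Vega 56/64
    vega20) echo gfx906 ;;   # Radeon VII
    cdna1)  echo gfx908 ;;   # Instinct MI100
    cdna2)  echo gfx90a ;;   # Instinct MI210/MI250
    rdna1)  echo gfx1010 ;;  # Radeon RX 5000 series
    rdna2)  echo gfx1030 ;;  # Radeon RX 6000 series
    *)      echo unknown ;;
  esac
}

gfx_for rdna2   # prints gfx1030
```

In the final command, -ngl 32 asks llama-cpp to offload 32 model layers to the GPU; if you run out of VRAM, lowering that number keeps more layers on the CPU.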

I'm not familiar enough with llama-cpp to file an RFS yet, but I thought this was interesting and wanted to share it.

Sincerely,
Cory Bloor

