Re: Bug#1063673: ITP: llama.cpp -- Inference of Meta's LLaMA model (and others) in pure C/C++
On 2025-02-06 15:14, M. Zhou wrote:
> For ppc64el, the llama.cpp-blas backend is way slower than the -cpu backend.
Actually, I observed the same on amd64.
But only in my build, because I deactivate all feature flags in the CPU
backend, so -cpu runs at our amd64 baseline.
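For reference, a configuration along these lines pins the CPU backend to
that baseline (ggml flag names taken from current llama.cpp upstream; the
actual Debian packaging invocation may differ):

  # Illustrative, not our exact packaging flags: disable native
  # tuning and SIMD extensions so -cpu uses only baseline amd64
  # instructions.
  cmake -B build -DGGML_NATIVE=OFF \
        -DGGML_AVX=OFF -DGGML_AVX2=OFF \
        -DGGML_FMA=OFF -DGGML_F16C=OFF
  cmake --build build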
>
> CPU is slow anyway. How does HIP perform?
It massively outperforms CPU, in line with Cory's observation [1].
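If anyone wants to reproduce the comparison, llama-bench from the
llama.cpp tree reports per-backend prompt and generation throughput
(the model path below is a placeholder):

  # Hypothetical build directories: one CPU-only build, one HIP build.
  ./build-cpu/bin/llama-bench -m /path/to/model.gguf
  ./build-hip/bin/llama-bench -m /path/to/model.gguf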
Best,
Christian
[1]: https://lists.debian.org/debian-ai/2025/01/msg00136.html