
Re: Bug#1063673: ITP: llama.cpp -- Inference of Meta's LLaMA model (and others) in pure C/C++



On 2025-02-06 15:14, M. Zhou wrote:
> For ppc64el, the llama.cpp-blas backend is way slower than the -cpu backend.

Actually, I observed the same on amd64.

But only in my build, because I deactivate all feature flags in the CPU
backend, so -cpu is at our amd64 baseline.
> 
> CPU is slow anyway. How does HIP perform?

It massively outperforms CPU, in line with Cory's observation [1].

Best,
Christian

[1]: https://lists.debian.org/debian-ai/2025/01/msg00136.html

