Bug#1063673: ITP: llama.cpp -- Inference of Meta's LLaMA model (and others) in pure C/C++
On 2025-02-06 02:42, M. Zhou wrote:
> I second this. llama-server is also the service endpoint for DebGPT.
I'll prioritize fixing this.
> I pushed a fix for ppc64el. The hwcaps works correctly for power9, given the baseline is power 8.
Ah, good catch. The broken install pattern came from a last-minute fix
that I had only tested on amd64...
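For anyone following along: the pattern in question is the glibc-hwcaps
layout, where an optimized build of the shared library is shipped in a
per-ISA subdirectory next to the baseline build, and the dynamic loader
picks the best match at runtime. A rough sketch of what that looks like
in a debhelper install file (hypothetical file and library names; the
actual sonames in the package may differ):

    # debian/libllama.install (hypothetical sketch)
    # baseline (power8) build
    usr/lib/powerpc64le-linux-gnu/libllama.so.*
    # power9-optimized build, selected automatically by ld.so on
    # hardware that advertises the matching hwcaps
    usr/lib/powerpc64le-linux-gnu/glibc-hwcaps/power9/libllama.so.*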
I meant to ask anyway: performance-wise, is the packaged build
comparable to your local one? I'm not sure what in the code could
account for a difference, but when I built and tested on platti.d.o,
performance was poor, so another data point would be useful.
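If it helps to put numbers on it, upstream's llama-bench tool produces
directly comparable throughput figures; something along these lines
should work, assuming the Debian package ships the tool under its
upstream name (the model path is a placeholder):

    # prompt-processing (512 tokens) and generation (128 tokens) throughput
    llama-bench -m /path/to/model.gguf -p 512 -n 128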
Best,
Christian