Bug#1063673: ITP: llama.cpp -- Inference of Meta's LLaMA model (and others) in pure C/C++



On 2025-02-06 02:42, M. Zhou wrote:
> I second this. llama-server is also the service endpoint for DebGPT.

I'll prioritize fixing this.

> I pushed a fix for ppc64el. The hwcaps work correctly for power9, given that the baseline is power8.

Ah, good catch. The broken install pattern was due to a last-minute fix
that I only tested on amd64...
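
For reference, the expected glibc-hwcaps layout on ppc64el puts the
optimized objects in a power9 subdirectory next to the POWER8 baseline,
roughly like this (the exact library names here are illustrative, not
taken from the package):

    /usr/lib/powerpc64le-linux-gnu/libggml.so.0                      # POWER8 baseline
    /usr/lib/powerpc64le-linux-gnu/glibc-hwcaps/power9/libggml.so.0  # POWER9 variant

The dynamic loader prefers the glibc-hwcaps variant automatically on
POWER9 hardware, so nothing beyond the install paths needs to change.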

I meant to ask anyway: performance-wise, is the packaged build comparable
to your local build? I'm not sure what in the code would account for a
difference, but I built and tested this on platti.d.o and performance was
poor, so another data point would be useful.
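
If you want a directly comparable number, the bundled llama-bench should
do; something like the following (the model path is just a placeholder):

    llama-bench -m /path/to/model.gguf -p 512 -n 128 -t $(nproc)

It reports tokens/s for prompt processing and generation separately,
which should make it obvious whether the packaged build lags your local
one.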

Best,
Christian

