
Re: Bug#1063673: ITP: llama.cpp -- Inference of Meta's LLaMA model (and others) in pure C/C++



On 2025-01-27 15:37, M. Zhou wrote:
> On Mon, 2025-01-27 at 11:13 +0100, Christian Kastner wrote:
> BLAS itself only handles the float32, float64, complex float32, and complex
> float64 datatypes, which correspond to the "s", "d", "c", and "z" prefixes in
> the API. Quantized neural networks are unlikely to run in floating-point
> mode, but rather in integer modes like int4 and int8.

Ah, good to know.
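To make the quoted point concrete, here is a minimal, hypothetical sketch of symmetric int8 quantization: weights are stored as integers plus a float scale, so the hot inner loops can use integer arithmetic that the float-only s/d/c/z BLAS routines cannot cover. The function names and the single per-tensor scale are illustrative only; llama.cpp's actual quantization formats are more elaborate.

```python
def quantize_int8(weights):
    # Hypothetical symmetric quantization: map floats to [-128, 127]
    # using a single scale factor (real schemes use block-wise scales).
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float values from the int8 representation.
    return [x * scale for x in q]

w = [0.5, -1.27, 0.03]
q, s = quantize_int8(w)   # integer weights plus one float scale
approx = dequantize(q, s) # close to the original float32 values
```

The matrix multiplications in quantized inference then operate on the integer side, which is why a float-only BLAS backend does not help there.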

> You may take libtorch2.5 as a reference: while building against
> libblas-dev, we can manually recommend high-performance BLAS
> implementations for the user to install:
> 
> Recommends: libopenblas0 | libblis4 | libmkl-rt | libblas3

I'll use that. Thanks!
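For reference, the "|" alternation in a Debian Recommends field means any one installed alternative satisfies it, and apt pulls in the first listed (here libopenblas0) by default. A sketch of how the stanza might look in debian/control; the surrounding fields are illustrative, not the actual packaging:

```
Package: llama.cpp
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}
Recommends: libopenblas0 | libblis4 | libmkl-rt | libblas3
```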

Best,
Christian
