
Bug#1063673: ITP: llama.cpp -- Inference of Meta's LLaMA model (and others) in pure C/C++



Hi Petter,

On 2024-02-13 08:36, Petter Reinholdtsen wrote:
> I tried building the CPU edition on one machine and running it on another,
> and experienced illegal instruction exceptions.  I suspect this means one
> needs to be careful when selecting the build profile to ensure it works on
> all supported Debian platforms.

Yeah, that was my conclusion from my first experiments as well.

This is a problem, though, since one key point of llama.cpp is to make
the best use of the hardware at hand. If we targeted some 15-year-old
amd64 lowest common denominator, we would go against that.
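
To make the dilemma concrete (generic C++ for illustration, not
llama.cpp code): with a compile-time choice, the instruction set is
baked in when the package is built, and whichever profile the buildd
targets will be wrong for part of our users.

#include <cstdio>
#include <immintrin.h>

static float sum8(const float *v) {
#if defined(__AVX__)
    /* Built with -mavx / -march=native: emits AVX instructions
     * unconditionally and dies with SIGILL on CPUs without AVX. */
    __m256 acc = _mm256_loadu_ps(v);
    __m128 s = _mm_add_ps(_mm256_castps256_ps128(acc),
                          _mm256_extractf128_ps(acc, 1));
    s = _mm_hadd_ps(s, s);
    s = _mm_hadd_ps(s, s);
    return _mm_cvtss_f32(s);
#else
    /* Baseline amd64 build: runs everywhere, but leaves the vector
     * units that llama.cpp is all about unused. */
    float s = 0.0f;
    for (int i = 0; i < 8; ++i) s += v[i];
    return s;
#endif
}

int main(void) {
    const float v[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    std::printf("%g\n", sum8(v));
    return 0;
}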

In my first experiments, I've also had problems with ROCm builds on
hosts without a GPU.
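
I haven't dug into why yet. If the problem turns out to be at runtime
rather than at build time, a GPU build would at least need to probe for
devices at startup and fall back to the CPU path; a sketch of the probe
I have in mind (hipGetDeviceCount is the HIP call, the rest is made up):

#include <hip/hip_runtime.h>
#include <cstdio>

static bool rocm_gpu_available(void) {
    int count = 0;
    /* Returns an error (or a count of 0) on hosts without a usable
     * GPU or without the ROCm runtime set up. */
    return hipGetDeviceCount(&count) == hipSuccess && count > 0;
}

int main(void) {
    std::puts(rocm_gpu_available() ? "using ROCm backend"
                                   : "no usable GPU, CPU fallback");
    return 0;
}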

I have yet to investigate if/how all capabilities can be enabled in a
single build, with the ones actually used selected at runtime.
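
For the runtime part, what I have in mind is the usual
compile-everything, pick-at-startup dispatch; a minimal sketch (again
not llama.cpp's actual code, the kernel names are made up):

#include <cstdio>

/* Stand-ins for per-ISA variants of a hot kernel; in a real build
 * each would live in its own translation unit compiled with the
 * matching -m flags. */
static void matmul_baseline(void) { std::puts("baseline kernel"); }
static void matmul_avx2(void)     { std::puts("AVX2 kernel"); }

typedef void (*matmul_fn)(void);

static matmul_fn select_matmul(void) {
    /* GCC/Clang builtin: queries CPUID at runtime, so one binary can
     * run on old and new amd64 CPUs without illegal instructions. */
    if (__builtin_cpu_supports("avx2"))
        return matmul_avx2;
    return matmul_baseline;
}

int main(void) {
    select_matmul()();
    return 0;
}

Whether the upstream build system can be coaxed into producing
something like this is exactly what I still need to find out.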

Another issue is that stable is clearly the wrong distribution for
this. The project is continuously gaining new features, so we would
need to rely on stable-updates.

> I would be happy to help getting this up and running.  Please let me
> know when you have published a git repo with the packaging rules.

I'll push a first draft soon, though it will definitely not be
upload-ready for the above reasons.

Best,
Christian

