(...)
So, strictly speaking, modifying an LLM does *not* require the
original training data. Recreating an LLM does. Developing a new LLM
with different training methods or training conditions also needs some
training data (ideally the original training data, especially to compare
end performance). But all in all, a developer on a desert island would be
better off with a "binary" model to modify than without it.
Say, for example, that an IDE saves its configuration state not in a common
text file, but as a binary memory dump. Say the maintainer of such a
package uses their experience with the IDE and years of development to
go through the GUI of this software and assemble a setup configuration
that is great for anyone starting to use the IDE, and that also leaves clues
about how to tailor it further to your needs. This configuration (a
binary memory dump of the software state) is then distributed to users
as the default configuration. What is "the source" of it? Isn't this binary
(which the GUI can both read and write) the preferred form for
modification? The maintainer can describe how they created the GUI state
(document the training process), but cannot really include all the relevant
experience (training data) that led them to believe this state is the
best for new users. So what is Llama if not a **very** complex nvim
config file focused on autocomplete? :D Quite a few of these questions also
apply to fonts (IMO).
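To make the analogy concrete, here is a minimal sketch (hypothetical config keys, with Python's pickle standing in for the IDE's binary dump format): the binary file is fully readable and writable by the tooling, so any user can modify it directly, even though the maintainer's reasoning behind the chosen values is nowhere inside it.

```python
import pickle

# Hypothetical IDE configuration state, normally assembled via the GUI.
config = {
    "theme": "solarized-dark",
    "keybindings": {"save": "Ctrl+S", "autocomplete": "Ctrl+Space"},
    "plugins": ["linter", "autocomplete"],
}

# The "default configuration" shipped to users: a binary dump of state.
with open("default_config.bin", "wb") as f:
    pickle.dump(config, f)

# Any user (or the GUI itself) can load, inspect, and modify the dump
# without access to the experience that produced these particular values.
with open("default_config.bin", "rb") as f:
    loaded = pickle.load(f)

loaded["theme"] = "gruvbox"  # a straightforward modification

with open("default_config.bin", "wb") as f:
    pickle.dump(loaded, f)
```

In this sense the binary dump really is the form the software reads and writes, which is the crux of the "preferred form for modification" question.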