Re: GR -- Allow AI-Assisted Contributions
Hi Lucas
On 2026/02/19 11:14, Lucas Nussbaum wrote:
> If we were to adopt a hard-line "anti-tools" stance, I would find it
> very hard to draw a clear line.
In terms of LLMs, I agree with the sentiment of others that they are, in
many ways, mass plagiarism tools. Just imagine 15-20 years ago if some
kid in their garage scraped Oracle, Google and Microsoft's code to
create models that spit out code. They'd probably still be in jail.
That said, people are using them widely, and I admit I've found them
useful too.
For example, a few weeks ago I was working late one night and, although
I couldn't put my finger on why, one of my loops just looked really
wrong and ugly. So I searched on DuckDuckGo for some nicer patterns that
fit my use case, and Duck Duck AI popped up and suggested a very neat
and elegant list comprehension that was such an obviously good choice
that I really should have thought of it in the first place.
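To illustrate the kind of change meant here, a made-up sketch follows.
It is not the actual code from that night; the function names and the
"squares of even numbers" task are hypothetical.

```python
# Hypothetical example: collect the squares of the even numbers in a list.

def even_squares_loop(numbers):
    # The explicit loop: correct, but wordier than it needs to be.
    result = []
    for n in numbers:
        if n % 2 == 0:
            result.append(n * n)
    return result


def even_squares_comprehension(numbers):
    # The same logic expressed as a single list comprehension.
    return [n * n for n in numbers if n % 2 == 0]
```

Both functions return the same list for the same input; the second is
simply the one-line form.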
Now, if I copy and paste that one line into my code, I've used an LLM.
In my opinion, that one-liner is way too trivial to copyright, so I'm
not even going to credit the LLM for it. And I understand exactly what
it does and how to modify it if I need to do so in the future.
Now, if we take a hard-line "anti-tools" stance, it would probably
disqualify my whole project from entering Debian, for just a trivial
suggestion from an LLM. I'm not sure that makes any sense.
Also, I'm pretty sure that there's already quite a bit of code in
stable that was massaged a bit by LLMs or contains suggestions from
them. So, to people who want to delay the discussion: I think now is a
good time to have it, especially considering how strongly people feel
about some aspects of it, and that upstreams are increasingly using
LLMs too (as others have pointed out, even the Linux kernel).
I also don't think that we should need to write a comprehensive guide on
AIs/LLMs in a GR yet. A few simple lines like in the Gentoo policy (as
others have referenced) with a few simple things that we can agree on
would make the most sense.
As I've already mentioned, copyright is one potential big problem, but
another is slop. The execs *love* showing that you can type a sentence
and the AI spits out a seemingly complete, working webapp. And then I
would ask, that password on the login screen? Is it encrypted? Is it
salted? Then you look at the code and there is no user database, the
password is just hardcoded as admin/admin. And as you interact with the
LLM it will just assure you that you are 100% right as it just messes up
everything further. Perhaps one day the AI products will write good,
maintainable code from the start, but everything I've seen so far is
slop and unmaintainable and I think that also makes it unsuitable for
Debian.
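For contrast with a hardcoded admin/admin check, here is a minimal
sketch of what salted password hashing can look like, using only the
Python standard library. The function names and the iteration count are
assumptions made for illustration, not a recommendation or anyone's
actual code.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for this sketch; a real deployment should tune it.
ITERATIONS = 100_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    # Generate a fresh random salt per password unless one is supplied.
    if salt is None:
        salt = os.urandom(16)
    # Derive a digest with PBKDF2-HMAC-SHA256; store salt and digest, never the password.
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    # Recompute the digest with the stored salt and compare in constant time.
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

The point is only that the stored value is a per-user salt plus a
derived digest, so the password itself never appears in the code or in
the database.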
So my thoughts in summary:
* LLMs present copyright uncertainties
* LLMs can produce large amounts of low quality code, quickly
* Despite the above, they can be useful in free software, and people
*do* use them, even quite widely
* From the Debian side, I think a simple statement / guideline that we
can agree on is sufficient, not a comprehensive guide/policy
(also, I need more caffeine so please excuse bad grammar etc)
-Jonathan