Re: Non-LLM example where we do not in practice use original training data
>>>>> "Stefano" == Stefano Zacchiroli <zack@debian.org> writes:
Stefano> What I strongly suspect would happen, if proposal A wins
Stefano> (which I also consider quite likely) is that Debian
Stefano> maintainers of free software products that use trained ML
Stefano> models that lack DFSG-free training data, will have to go
Stefano> down the rabbit hole of patching those software to
Stefano> systematically download the models on first use. Or just
Stefano> give up on maintaining those packages, of course.
For me this would mean giving up one of the big benefits of Debian.
Debian is mostly self-contained.
If I can restrict myself to things in the archive, I can throw a full
Debian mirror into environments where I cannot reach the internet and
mostly get very good results.
The more Debian moves to a model where it encourages downloading
non-mirrored artifacts, the harder that use case becomes.
I don't care whether the artifacts I need are in main. I would be fine
with another archive section.
But I suspect you are right: rather than going through that
complexity, especially since packages in main cannot even Recommend
packages outside of main, maintainers will have the software download
the models, because that provides a better experience than trying to
support a model data package in non-free.
It will also create additional problems when versions of software in
stable try to fetch models that are no longer where they used to be.