
Re: Non-LLM example where we do not in practice use original training data



Clint Adams <clint@debian.org> writes:
> On Tue, May 06, 2025 at 08:36:50AM -0700, Russ Allbery wrote:

>> Well, first, I continue to object to the idea that a model can be
>> DFSG-free if it's trained on non-DFSG-free data. I think that makes it
>> definitionally non-free. (I have read Aigars's arguments to the
>> contrary and do not find them at all persuasive.)

> We appear to have plenty of pre-trained models, apparently trained on
> non-DFSG-free data, in main right now, which strikes me as a violation
> of our current "preferred form of modification" rule.

Yes. That's the conclusion I've arrived at as well, after thinking about
this over the course of the discussion, although I suppose it's also an
argument that I'm thinking about this wrong and the current status quo is
fine.

This is not something we've paid a lot of attention to, and I think we've
defaulted to accepting stuff that claims to be under DFSG licenses. That's
certainly what I did with gnubg when I was maintaining it. I never really
thought about this issue. That means there are some practical problems
with changing the de facto policy.

I think if any of the options in the current GR except Aigars's (and maybe
Sam's?) passes, that would effectively be a change in our current policy,
even if the current policy is not precisely intentional. Personally, I
think it would bring us back closer in line with our principles, but that
doesn't make the practical problems go away. Right now is a really bad
time to change our policies. Whatever we do, I don't think we should try
to change anything before the release. Shortly after the release would be
a much better time, so that we have some time to sort this out.

We have previously had a lot of problems with the implementation of
changes (either via GR or via delegate interpretation) from a de facto
licensing policy and had to thrash out the implications with multiple GRs
[1], which isn't very fun. I'm not sure that we've thought through the
implications of this proposed change yet, and I'm not sure that we have a
plan. The plan doesn't need to be in the GR, but I'd feel more comfortable
if we had a list of affected packages and some idea of what we're going to
do with them.

That means I'm not sure how to vote on the proposal as it currently
stands. I'd rather not have to do a second GR just to clarify
the timing of implementation given the upcoming release. Maybe there's
still time to address that directly? Also, there is absolutely nothing
wrong with temporarily withdrawing a GR (or even having it fail because we
didn't realize in previous debian-project discussions that we'd not fully
worked through the idea yet), and then bringing it back up when we have an
implementation plan.

[1] https://www.debian.org/vote/2004/vote_003
    https://www.debian.org/vote/2004/vote_004
    https://www.debian.org/vote/2006/vote_007
    https://www.debian.org/vote/2008/vote_003

-- 
Russ Allbery (rra@debian.org)              <https://www.eyrie.org/~eagle/>

