Thorsten Glaser <tg@debian.org> writes:

> So, with all the updates, maybe something like this?

I read this now, and think it is an improvement, so I'll second this
version too.

I realized that I have one additional generic concern: you claim that
models are a derivative work of their training input. I don't think
this is universally agreed on, or tested in court, and there are
people who heavily push another agenda. It is somewhat of a
provocative statement. However, I don't think you actually need to
argue that this is true for your proposal. You don't need to take a
stance on this provocative question. People who disagree with this
aspect could still find themselves in agreement with your proposal if
it was tweaked a bit. It is sufficient to claim that:

A) models MAY be considered derivative works of their training
   inputs. We can realize that Debian is not the best organization to
   decide whether this is true or not, and it will likely take many
   years until there is any general consensus in society about this
   aspect. However, what we can claim is that it seems realistic that
   this MAY become the general opinion.

and

B) a conservative approach is thus to respect the licensing of all
   training inputs, until society has any clear take on A). This
   allows Debian to continue to work, take what appears to be less
   legal risk, and be more aligned with its history of supporting
   libre content.

Below is a small diff to achieve this:

OLD:

> 1. A model must be trained only from legally obtained and used works,
>    honour all licences of the works used in training, and be licenced
>    under a suitable licence itself that allows distribution, or it is
>    not even acceptable for non-free. This includes an understanding
>    that “generative AI” output are derivative works of their inputs
>    (including training data and the prompt), insofar as these pass
>    threshold of originality, that is, generative AI acts similar to
>    a lossy compression followed by decompression, or to a compiler.

NEW:

> 1. A model must be trained only from legally obtained and used works,
>    honour all licences of the works used in training, and be licenced
>    under a suitable licence itself that allows distribution, or it is
>    not even acceptable for non-free.
>
>    This assumes an understanding that “generative AI” output may be
>    considered derivative works of their inputs (including training
>    data and the prompt), insofar as these pass threshold of
>    originality. That is, generative AI acts similar to a lossy
>    compression followed by decompression, or to a compiler.

OLD:

> Any work resulting from generative use of a model can at most be
> as free as the model itself; e.g. programming with a model from
> contrib/non-free assisting prevents the result from entering main.

NEW:

> Assuming a model's output is a derivative work of its training
> input, and works derived from that model are also derivative works,
> any work resulting from a model can at most be as free as the model
> itself; e.g. programming with a model from contrib/non-free
> assisting prevents the result from entering main.

ADD:

> We resolve that Debian wants to make conservative licensing choices
> and not put ourselves at unnecessary legal risk; therefore we
> propose to behave and act as if that were the case, and works
> derived from training inputs have to consider the licence on their
> inputs. This aligns with our preference for free software and
> DFSG-compatible licensing.

I'm short on time, so this maybe wasn't the best choice of words;
feel free to rewrite it if you agree with my principle.

A small comment:

> ⅱ. Any existing package with a “model” inside that already had the
>    very same model before 2020-01-01 has an extra four years time
>    before bugs regarding these models may become release-critical.

Why 2020-01-01? Couldn't we be generous here and say that if
something was in the initial Bookworm release then it is eligible for
this exception?

/Simon