
Re: Brief update about software freedom and artificial intelligence



On Mon, 27 Feb 2023 at 08:50, Russ Allbery <rra@debian.org> wrote:
>
> "Roberto A. Foglietta" <roberto.foglietta@gmail.com> writes:
>
> > A totally automatic procedure like web crawling and web indexing
> > re-enter in your example, perfectly. However, the input collection that
> > a ML/AI training system needs is a protectable work because the data
> > should be structured, selected and properly labeled even if these
> > activities are done with rules like it happens using SQL for
> > databases.
>
> Yes, I agree, I think that a trained AI model is a protectable work.
> However, it is not protectable *by you* unless you're the one who wrote
> the model and chose its training.
>
> Therefore, putting a clause in your copyright license saying that if your
> work is incorporated into an AI model, that AI model as a collection is
> covered by some particular license is not really a thing you can do.  The
> best you can do is the standard GPL thing of saying that you don't have to
> license your collection under any particular license, but if you don't,
> you don't have any right to include this specific work.  Maybe that's what
> you were getting at, and I just didn't understand.
>

Dear Russ, I was completely wrong about your ability to contribute to
this discussion, because the chance you gave me to refute your thesis
is the best occasion to pave the way for the lawyers who will one day
enforce the A/L/GPLv4 in court. So, let me explain it in a very
simple and straightforward way:

- A/L/GPLv3 applies to source code and scripts that are meant to be
compiled or run by an interpreter

- AI/ML training engines use source code and scripts as data; this
might or might not be fair use, but it is certainly a novel use that
is not covered by A/L/GPLv3

- so I decided to protect my project repositories as a database
(collection), in addition to the standard way of protecting the code
with a well-known license

- because of copyright law on databases, if someone creates a larger
database that contains my database or a part of it, then they have to
comply with the license that I chose to protect my project as a
database.

You see, it is a very simple and straightforward concept. The only two
ways out of this are: 1. abolish database copyright law, or 2. pass a
law under which a training input collection cannot be covered by
copyright. In both cases every employee could take home a copy of a
database or a copy of AI training inputs and share it with the rest of
the world. Moreover, 1. includes 2., while 2. would seriously undermine
database copyright law, because every database could be a training set
for an AI/ML engine.

Russ, do you agree? :-)

Best regards, R-

