Re: Proposal -- Interpretation of DFSG on Artificial Intelligence (AI) Models
BTW, this also inspired me to read up on the EU Artificial
Intelligence Act ( https://artificialintelligenceact.eu/ ) and I
noticed a very relevant provision in that text. Art 53 (
https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32024R1689#art_53
) requires providers of general-purpose AI models to have a policy for
complying with Article 4(3) of Directive (EU) 2019/790 (
https://eur-lex.europa.eu/eli/dir/2019/790/oj/eng#art_4 ). That
provision specifies that (in the context of text and data
mining) rights holders can expressly reserve their content from
data mining, for example by machine-readable means for content
published online.
But the key here is that Article 4 as a whole provides a very explicit
exception to copyright protection for the purposes of text and data
mining. And the EU AI Act very explicitly references this exception as
applicable to AI training.
It is also nice to see that the EU AI Act explicitly highlights open
source AI models and provides them with simplified and preferential
rules.
On Tue, 6 May 2025 at 01:26, Aigars Mahinovs <aigarius@gmail.com> wrote:
>
> This one is much simpler. Maybe because the lawyers being used are not too good.
>
> https://www.courtlistener.com/docket/67538258/tremblay-v-openai-inc/
>
> The authors claim a lot of stuff, basically a generic shotgun of copyright claims, but all secondary claims get dismissed by the court at the pre-trial stage due to bad legal reasoning and failing to detail or prove any actual wrongdoing. And specifically, the claim that all outputs from an LLM are derived works of all inputs is dismissed based on already decided case law.
>
> Only the claim of direct copyright infringement by using the text of a book in the training process of a model still stands to await the actual trial. And there OpenAI is citing a lot of good reasons why that does not constitute distribution at all and why the result of the work is transformative and thus protected by fair use. The mere fact of accessing some data at some point does not create copyright infringement. The whole lawsuit is very sloppy IMHO, IANAL.
>
> On Tue, 6 May 2025 at 00:10, Bill Allombert <ballombe@debian.org> wrote:
>>
>> On Mon, May 05, 2025 at 11:44:30PM +0200, Aigars Mahinovs wrote:
>> > On Sun, 4 May 2025 at 17:30, Wouter Verhelst <w@uter.be> wrote:
>> >
>> > > It is incorrect, because the New York Times did in fact file suit
>> > > against Microsoft, OpenAI, and other parties related to copyright
>> > > infringement of their large library of news articles in creating
>> > > ChatGPT[1]. The case is still in court.
>> > >
>> > > [1]
>> > > https://www.courtlistener.com/docket/68117049/the-new-york-times-company-v-microsoft-corporation/
>> >
>> >
>> > Thanks for this link, it has been a very interesting read.
>>
>> Another one:
>>
>> https://arstechnica.com/information-technology/2023/07/book-authors-sue-openai-and-meta-over-text-used-to-train-ai/
>> https://arstechnica.com/tech-policy/2025/02/meta-torrented-over-81-7tb-of-pirated-books-to-train-ai-authors-say/
>>
>> Cheers,
>> Bill.
>>
>
>
> --
> Best regards,
> Aigars Mahinovs
--
Best regards,
Aigars Mahinovs