
Re: Non-LLM example where we do not in practice use original training data



>>>>> "Ansgar" == Ansgar 🙀 <ansgar@debian.org> writes:

    Ansgar> Hi,
    Ansgar> On Mon, 2025-05-05 at 14:27 -0600, Sam Hartman wrote:
    >> If I wanted to package up my classifier state and distribute it
    >> under a free software license, I think it should be DFSG free.  I
    >> think that to satisfy the DFSG I would need to include  all the
    >> training data I still had and any scripts I used.

    Ansgar> And the training data would have to be under a DFSG-free
    Ansgar> license. I doubt phishing or spam mail comes with proper
    Ansgar> licensing; even ham doesn't do this (what are the license
    Ansgar> terms of this mail?). So if you were required to include
    Ansgar> training data it wouldn't be possible even for fairly boring
    Ansgar> classifiers.

Thank you.  I should have caught that.
I guess even under my proposed option, packaging the classifier might be
tricky.  If I deleted the training data and no longer had it, then I
think under my option, the classifier could be DFSG free.
If I retained the training data, then ftpmaster would need to decide
whether I, as upstream, had a more preferred form for modification than
the rest of the world.  (My understanding is that we approach upstreams
with well-justified suspicion when they have source-like things that the
rest of the world does not have, and I tried to capture that in my option.)
