
Re: Non-LLM example where we do not in practice use original training data



Sam Hartman <hartmans@debian.org> writes:

> I guess even under my proposal option, packaging the classifier might be
> tricky.  If I deleted the training data and no longer had it, then I
> think under my option, the classifier could be DFSG free.

I realize that we have made exceptions in the past for files where the
original source has been lost so the form in which they exist, although
quite clearly not the original source, is now de facto the preferred form
of modification. (Sam may be thinking of the same PDF files that I'm
thinking of.)

However, I am very leery about extending that exception to cases where
people are intentionally creating that situation by deleting the input
data on purpose. It's one thing when the source has been lost but the
output document is still important for historical purposes. I think that
falls into the category of cases where humans are not computers and we do
not have to blindly follow rules without considering their underlying
purpose. But extending that exception to cases where people are intentionally
discarding the training data so that they don't have to produce it is a
step too far down a slippery slope for me.

I think we would think very hard before accepting a compiled binary into
the archive whose original source had been lost, and would be very
unlikely to accept a compiled binary where the original source was
intentionally deleted so as to make it DFSG-free. The "preferred form of
modification" test is not, to me, the only test for compliance with the
DFSG. The point of the DFSG is not *only* to put everyone on an
equal footing. There is also a straightforward desire to actually have
meaningful and useful source code, without which free software is kind of
pointless.

-- 
Russ Allbery (rra@debian.org)              <https://www.eyrie.org/~eagle/>

