
Clarification regarding "complete training program" in ML policy draft



Hi list,

Thanks to Mo and others for your work on the current Debian ML policy
draft. I have a question regarding the current definition of a "Free
Model" in that draft. It reads [1]:

> Free Model is a model satisfying ALL the following conditions:
> (1) FOSS-Licensed & DFSG-compliant;
> (2) trained from explicitly FOSS-licensed & DFSG-compliant datasets (e.g. for supervised or unsupervised learning) or simulators (e.g. for reinforcement learning), and the dataset is publicly available to anonymous users;
> (3) corresponding training program is present and complete;

What is the intent of the policy's condition 3 in case of a bitrotted
training program?

Story: I recently made use of a pretrained model published along with a
paper describing a certain deep network. I would classify this model as
definitely Free under conditions 1 and 2. However, the 3rd point was
severely lacking: the authors of the original paper did publish a
complete training program… 5+ years ago, using frameworks (or versions
of frameworks) that are only partially available, or not easily
runnable, today.

In my particular case, I was able to fix things up and get their
original code working in a reasonable way with some work, but the
experience showed me that more severe cases of bitrot like this are
likely to appear as more models are published in the typical academic
way of dumping some code together with the paper and never touching it
again. I have no objections to this kind of software development for
academic purposes at all, but I do wonder what our position is if the
pretrained model remains useful 5+ years later, while the training
software *was* "present and complete" but no longer is in any reasonable
way. (Or, I guess: does firing up a 5+ year old VM to run the stale code
fall within a reasonable definition of "present and complete"?)

Any thoughts?


[1] https://salsa.debian.org/deeplearning-team/ml-policy/-/blob/1c467714774ca7c6c47120da91fcb1fd14a160e1/ML-Policy.rst


 Best,
 Gard
