
ML-policy in action?



(please CC me in replies, as i'm not subscribed)

hi,

as the maintainer of the "opus" package (a popular low-latency audio codec) [0], i'm currently facing a package that started to include an ML model in its latest and greatest release - or to put it in upstream's words: "Opus [1.5.1] gets a Serious Machine Learning Upgrade" [1].

my research so far shows that:
- upstream's git repository contains the source code (mostly torch) to train the model(s) [2]
- upstream's git repository contains a list of links to the training datasets [3]. i've checked all the listed datasets and they are free (CC-BY, CC-BY-SA). i estimate the total size of the compressed training data to be about 80GB
- the released source tarball only contains some generated C-source code files that contain the weights generated by the model

so i think that the package itself is Free, although i'm still communicating with upstream to have them document the entire training pipeline (e.g. I got some vague "the training data might need manual assembling" on IRC, which i'm hoping they will document)

so anyhow, this seems to be an obvious package to apply the ML-policy.
since this is my first package where i try to apply ML-policy, i thought i'd learn from examples. unfortunately, codesearch.debian.net does not return anything for "path:debian/rules reproduce-model" or "path:debian/rules get-external-data".
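
for anyone who wants to repeat the two searches, here is a minimal sketch in python that builds the corresponding codesearch links. the helper name `codesearch_url` is made up for illustration, and the assumption that the query goes into the "q" parameter of /search is mine, not something documented in this mail:

```python
from urllib.parse import urlencode

# Assumption: codesearch.debian.net takes the query in the "q" parameter of /search.
CODESEARCH = "https://codesearch.debian.net/search"

# The two queries mentioned above, copied verbatim.
QUERIES = [
    "path:debian/rules reproduce-model",
    "path:debian/rules get-external-data",
]


def codesearch_url(query: str) -> str:
    """Return a search link for the given query (hypothetical helper)."""
    return f"{CODESEARCH}?{urlencode({'q': query})}"


if __name__ == "__main__":
    for query in QUERIES:
        print(codesearch_url(query))
```

this only constructs the links, it does not fetch or parse any results.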

so I wonder whether there are any packages in the archive that already apply §4.5 "Reproduce Rules" and §4.7 "External Data"?

mgfad
IOhannes


[0] https://tracker.debian.org/pkg/opus
[1] https://opus-codec.org/demo/opus-1.5/
[2] https://gitlab.xiph.org/xiph/opus/-/tree/v1.5.1/dnn/torch
[3] https://gitlab.xiph.org/xiph/opus/-/blob/v1.5.1/dnn/datasets.txt


