
Quick Bits: ML-Policy Updates; Recent Progress of Deep Learning Packaging



Hi fellow devs,

Recently an increasing number of preprint papers studying COVID-19 with
machine learning techniques have been published, which reminded me of
the importance of these techniques.

 (1) Updates to ML-Policy [1]

	 Sidenote: Projects such as https://github.com/IliasPap/COVIDNet
	 directly raise the very questions that ML-Policy is trying to
	 answer.

 (2) Updates to Deep Learning Related Packages

	 Sidenote: Some fellow devs of debian-med and debian-science believe
	 that deep learning software is a substantially meaningful part of
	 the COVID-19 hackathon ... So ...

Updates to ML-Policy
--------------------

Just tagged the v0.2 release, but it is still an experimental UNOFFICIAL
document. The most significant change since we last discussed it on this
list is simplification.

Detailed Changes:
 1. Removed the definition of "Sourceless Model" and "critical tasks".
    They would make things more complicated.
 2. The previous version only discussed pre-trained models. This
    version also discusses the data and the output of the models.
 3. Rewrote the definition of model reproducibility.
 4. Introduced a "whitelist" policy for ToxicCandy models, so that we
    won't kill things like input methods.
 5. Introduced a "Combination" policy in case multiple models are used
    together.
 6. Introduced a "Tainting" policy in case free and non-free components
    are mixed in the same pipeline.

The document is actually longer than the previous version. But the
newly introduced ideas follow convention and common sense to some
extent, so they should not be harder to understand.

This policy is not complete. Advice is welcome.

BTW, as you might have noticed, the ml-policy.git repo has been
transferred to the deeplearning-team namespace.

Updates to Packages
-------------------

Status of the debianization of the top-2 deep learning frameworks, i.e.
tensorflow and pytorch:

tensorflow: it is hard to circumvent bazel, the only officially
  supported build system. One developer is currently working on the
  bazel packaging.

pytorch: Greatly motivated by the attitude of pytorch upstream [2] and
  their quick response to pull requests, I finished packaging the
  necessary pytorch dependencies at lightning speed and uploaded them
  to the NEW queue. I'm currently working on making the upstream build
  system distro-friendly. Related work is all tracked in [2]. My local
  builds succeed.

---

[1] https://salsa.debian.org/deeplearning-team/ml-policy/-/blob/master/ML-Policy.pdf
[2] https://github.com/pytorch/pytorch/issues/14699

Mo,
On behalf of Debian Deep Learning Team

