Re: Concern for: A humble draft policy on "deep learning v.s. freedom"

To: debian-science@lists.debian.org
Subject: Re: Concern for: A humble draft policy on "deep learning v.s. freedom"
From: Mo Zhou <lumin@debian.org>
Date: Thu, 13 Jun 2019 08:50:01 -0700
Message-id: <882c421579846bc2f4eb1ba4f4dfd28b@debian.org>
In-reply-to: <20190613131141.f6kvtqaeyxygwhfm@bubu.plessy.net>
References: <f544829dcd6c0f92ea11cdb25543bdac@debian.org> <20190608184309.GA10146@goofy.osamu.debian.net> <eaf1b80cd65eb510fb56703869071784@debian.org> <20190613131141.f6kvtqaeyxygwhfm@bubu.plessy.net>

Hi Charles,

On 2019-06-13 13:11, Charles Plessy wrote:
>> 1. Free datasets used to train FreeModel are not required to upload
>>    to our main section, for example those Osamu mentioned and wikipedia
>>    dump. We are not scientific data archiving organization and these
>>    data will blow up our infra if we upload too much.
> 
> how about storing only the data used to train the version that is
> released in Stable, and keeping this data in a dedicated archive, to
> avoid bloating mirrors ?  There was a thread on debian-project on how to
> use Debian money, and I think that it could be a useful case.

This idea could be mentioned in DL-Policy for future reference. However,
I don't see the need for a dedicated archive in the near future. Once
there are enough DL models in our archive, we can revisit the idea and
discuss it again.

> For the versions in Unstable and Testing, the role of the package
> maintainer would be to ensure that the data is still available for
> download.

In addition, we could create a new tag, "Failed To Train From Scratch"
(FTTFS), similar to the FTBFS tag we already use. For models in the main
section, FTTFS is unacceptable.

>> 2. It's not required to re-train a FreeModel with our infra, because
>>    the outcome/cost ratio is impractical. The outcome is nearly zero
>>    compared to directly using a pre-trained FreeModel, while the cost
>>    is increased carbon dioxide in our atmosphere and wasted developer
>>    time. (Deep learning is producing much more carbon dioxide than we
>>    thought).
> 
> Optionally, we could even consider re-training the release candidate at
> the approach of the Freeze, for the sake of demonstrating that the
> training process functions well.
> 
> Stable point update might not need to be retrained depending on what the
> patches address.

That's a good idea! I hadn't even thought about how DL-Policy interacts
with our release schedule. Thanks, I'll merge this point into the
document soon.

Thanks,
Mo
