
Re: Concern for: A humble draft policy on "deep learning v.s. freedom"



Hi Charles,

On 2019-06-13 13:11, Charles Plessy wrote:
>> 1. Free datasets used to train FreeModel are not required to upload
>>    to our main section, for example those Osamu mentioned and wikipedia
>>    dump. We are not scientific data archiving organization and these
>>    data will blow up our infra if we upload too much.
> 
> how about storing only the data used to train the version that is
> released in Stable, and keeping this data in a dedicated archive, to
> avoid bloating mirrors ?  There was a thread on debian-project on how to
> use Debian money, and I think that it could be a useful case.

This idea could be mentioned in DL-Policy for future reference. However,
I don't see the need for a dedicated archive in the near future.
Once there are enough DL models in our archive, we can revisit this
idea and discuss it again.

> For the versions in Unstable and Testing, the role of the package
> maintainer would be to ensure that the data is still available for
> download.

In addition, we could create a new tag, "Failed To Train From Scratch"
(FTTFS), analogous to the FTBFS tag we already use. For models in the
main section, FTTFS would be unacceptable.

>> 2. It's not required to re-train a FreeModel with our infra, because
>>    the outcome/cost ratio is impractical. The outcome is nearly zero
>>    compared to directly using a pre-trained FreeModel, while the cost
>>    is increased carbon dioxide in our atmosphere and wasted developer
>>    time. (Deep learning is producing much more carbon dioxide than we
>>    thought).
> 
> Optionally, we could even consider re-training the release candidate at
> the approach of the Freeze, for the sake of demonstrating that the
> training process functions well.
> 
> Stable point update might not need to be retrained depending on what the
> patches address.

That's a good idea! I hadn't even thought about how DL-Policy would work
with our release schedule. Thanks, I'll merge this point into the
document soon.

Thanks,
Mo

