Re: Bits from /me: A humble draft policy on "deep learning v.s. freedom"
Hi Paul,
On 2019-05-21 23:52, Paul Wise wrote:
> Are there any other case studies we could add?
Anybody is welcome to open an issue and add more
cases to the document. I can dig into them in the
future.
> Has anyone repeated the training of Mozilla DeepSpeech for example?
Generally speaking, training is non-trivial and
requires expensive hardware. That clearly reduces
the likelihood that someone has already tried to
reproduce a given model.
A real example of how hard reproducing a **giant**
model is, is BERT, one of the state-of-the-art
natural language representation models: pre-training
it takes about 2 weeks on a TPU at a cost of roughly $500.
Cite:
https://github.com/google-research/bert#pre-training-tips-and-caveats
> Are deep learning models deterministically and reproducibly trainable?
> If I re-train a model using the exact same input data on different
> (GPU?) hardware will I get the same bits out at the end?
Making the training program reproducible is good practice for
everyone who trains or debugs neural networks. I once wrote a
simple deep learning framework using only the C++ STL, and I
fell into many pitfalls along the way.
Reproducibility is very important for debugging, because
mathematical bugs are much harder to diagnose than code bugs.
I wrote a dedicated section about reproducibility:
https://salsa.debian.org/lumin/deeplearning-policy#neural-network-reproducibility
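To illustrate the basic idea, here is a minimal sketch of the kind of
seeding that makes a run repeatable on the same hardware. The helper
name `seed_everything` is my own; a real deep learning setup would
additionally need framework-specific seeds (e.g. torch.manual_seed)
and deterministic-kernel settings, which this sketch does not cover:

```python
import random

import numpy as np


def seed_everything(seed: int) -> None:
    """Seed the Python-level RNGs a training loop typically touches."""
    random.seed(seed)
    np.random.seed(seed)


# Two runs with the same seed draw identical "initial weights".
seed_everything(1234)
first = np.random.randn(4)

seed_everything(1234)
second = np.random.randn(4)

assert np.array_equal(first, second)
```

Note that even with full seeding, bit-identical results across
different GPU hardware are not guaranteed in general, since
floating-point reductions may be evaluated in a different order.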