
Concerns about software freedom when packaging deep-learning based applications



Hi folks,

I just noticed that one of us is trying to package a deep-learning based
application[1], specifically one based on AlphaGo Zero[2]. This raised a
concern for me about software freedom. Since mankind relies on artificial
intelligence more and more, I think I should raise this topic on -devel.


Before pointing out the problem I am concerned about, I should
first explain some terms in this field:

 (1) For those who don't know what "deep learning" is, please think of it
     as a field which aims to solve problems such as "what object
     does this photo show: a cat or a bike?"[5], which are hard to
     solve with traditional algorithms.

 (2) For those who don't know what a "pretrained (deep/convolutional) neural
     network" is, please think of it as a pile of matrices, or simply a
     large, pre-computed array of floating-point numbers.

 (3) The CUDA Deep Neural Network library (cuDNN)[4] is NVIDIA's
     **PROPRIETARY** library. It is built on top of CUDA and runs
     exclusively on NVIDIA GPUs.
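
To make (2) concrete, here is a minimal sketch in Python of what "a pile
of matrices" means. All weights below are made up for illustration and do
not come from any real model; the names (W1, b1, dense, predict) are my
own. A real pretrained network has the same shape, only with millions of
such numbers.

```python
# A "pretrained model" reduced to its essence: arrays of floats.
# All numbers are hypothetical, chosen only for illustration.
W1 = [[0.5, -1.0], [1.5, 0.25]]  # layer 1 weights: 2 inputs -> 2 hidden units
b1 = [0.0, -0.5]                 # layer 1 biases
W2 = [[1.0, -2.0]]               # layer 2 weights: 2 hidden units -> 1 output
b2 = [0.25]                      # layer 2 bias


def dense(weights, bias, x):
    """One fully connected layer: matrix-vector product plus bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    """Element-wise max(0, a), a common non-linearity."""
    return [max(0.0, a) for a in v]


def predict(x):
    """Run the input through both layers using the stored numbers."""
    return dense(W2, b2, relu(dense(W1, b1, x)))


print(predict([1.0, 2.0]))
```

Using the network only needs these numbers. Producing them (training) is
the expensive part.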


My core concern is:

  Even if upstream releases its pretrained model under the GPL,
  the freedom to modify, study, and reproduce neural networks,
  especially "very deep" neural networks, is de facto controlled by
  PROPRIETARY software.

Justification for the concern:

  1. CUDA/cuDNN is used by nearly all deep-learning researchers and
     service providers.

  2. Deep neural networks are extremely hard to train on a CPU because
     of the time cost. With cuDNN and powerful graphics cards,
     training can be more than 100x faster. For example, if a neural
     network can be trained on a GPU in 1 day, the same training would
     take a few months on a CPU.
     (Google's TPU and FPGAs are beside the point here.)

  3. A meaningful "modification" of a neural network usually means
     "fine-tuning", which is a process similar to "training".
     A meaningful "reproduction" of a neural network usually means
     "training starting from random initialization".

  4. Because of 1., 2. and 3., the software freedom is not complete.
     In a purely free software environment, that work cannot practically
     be reproduced, modified, or even studied. A CPU can indeed finish
     the same work in several months or several years, but that is far
     too slow to be practical.

     In this way, the pretrained neural network is not totally "free"
     even if it is licensed under GPL-*. None of the clauses in the
     GPL is violated, but the software freedom is limited.
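
The arithmetic behind point 2 can be sketched as follows. This is a rough
estimate only: it assumes a constant speedup factor, and the 100x figure
and the 1-day GPU run are the example numbers from above, not
measurements. The function name is my own.

```python
def cpu_training_days(gpu_days, speedup):
    """Estimated CPU wall-clock days, assuming a constant GPU speedup."""
    return gpu_days * speedup


days = cpu_training_days(gpu_days=1, speedup=100)
months = round(days / 30, 1)  # rough conversion, 30 days per month
print(days, "days on CPU, about", months, "months")
```

With a 1-day GPU run and a 100x speedup this gives 100 days, i.e. a bit
more than three months, and the gap grows linearly for longer GPU runs.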


I'd like to ask:

 1. Is a GPL-licensed pretrained neural network REALLY FREE? Is it really
    DFSG-compatible?

 2. How should we deal with pretrained neural networks when some of us
    want to package similar things? Should they go into contrib?



[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=903634
[2] https://en.wikipedia.org/wiki/AlphaGo_Zero
[3] https://www.debian.org/social_contract
[4] https://developer.nvidia.com/cudnn
[5] More examples about what deep learning can do:
    (2) tell you what objects a picture shows, and where they are.
    (3) describe what's in a picture in a complete English sentence.
    (4) translate one language into another, such as zh_CN -> en_US
    (5) fix your English grammar errors https://arxiv.org/pdf/1807.01270.pdf
    (6) ...
