Re: Re: Concerns about software freedom when packaging deep-learning-based applications.
Hi Russell,
> On Thu, 2018-07-12 at 18:15 +0100, Ian Jackson wrote:
> > Compare neural networks: a user who uses a pre-trained neural network
> > is subordinated to the people who prepared its training data and set
> > up the training runs.
>
> In Alpha-Zero's case (it is Alpha-Zero the original post was about)
> there is no training data. It learns by being run against itself.
> Intel purchased Mobileye (the system Tesla used to use, and maybe still
> does) with largely the same intent. The training data in that case is
> labelled videos resembling dash cam footage. Training the neural
> network requires huge amounts of it, all of which was produced by
> Mobileye by having humans watch the video and label it. This was
> expensive and eventually unsustainable. Intel said they were going to
> attempt to train the network with videos produced by game engines. I
> haven't seen much since Intel purchased Mobileye; however, if they
> succeed we are in the same situation - there is no training data. In
> both cases it is just computers teaching themselves.
To be clear, there are three main types of learning: (1) supervised
learning[1]; (2) unsupervised learning[2]; (3) reinforcement learning[3].
AlphaGo-Zero is based on reinforcement learning, but it is a bit special:
meaningful data can be generated anywhere in the state space of the
board by self-play. However, much current reinforcement learning
research uses data that is not easy to generate algorithmically. For
example, I remember a group of people tried to teach a neural network
to drive a car by letting it play Grand_Theft_Auto_V[4].
Supervised learning (which requires labeled data) and unsupervised
learning (which requires unlabeled data) often need a large amount of
data, and that data may come with license restrictions. As for the
data used in reinforcement learning... well, I'm not sure, and I don't
want to dig deeper.
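To make the "computers teaching themselves" point concrete, here is a
minimal, hypothetical sketch (tic-tac-toe instead of Go, a random policy
instead of a trained network, and nothing to do with DeepMind's actual
code) of how self-play produces (state, outcome) training pairs from the
game rules alone:

```python
import random

# The eight winning lines on a 3x3 board, as index triples.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def self_play_game(rng):
    """Play one game with a random policy and label every visited state
    with the final outcome.  No external labeled dataset is involved:
    the data is a by-product of the game rules."""
    board, history, player = ['.'] * 9, [], 'X'
    while winner(board) is None:
        moves = [i for i, c in enumerate(board) if c == '.']
        if not moves:
            break  # board full with no winner: a draw
        board[rng.choice(moves)] = player
        history.append(''.join(board))
        player = 'O' if player == 'X' else 'X'
    outcome = winner(board) or 'draw'
    return [(state, outcome) for state in history]
```

A real system would replace the random policy with the current network
and retrain on the generated pairs, but the essential point is the same:
the "training data" is regenerated from the rules, not licensed from
anyone.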
> The upshot is I don't think focusing on training data or the initial
> weights is a good way to reason about what is happening here. If Deep
> Mind released the source code for Alpha-Zero anyone could in principle
> reproduce their results if you define their result as I'm pretty sure
> they do: produce an AI capable of beating any other AI on the planet at
> a particular game. The key words are "in principle" of course, because
> the other two ingredients they used was 250 MW hour of power (a wild
> guess on my part) and enough computers to be able to expend that in 3
> days.
Releasing the initial weights doesn't make sense: the initial weights
of state-of-the-art neural networks are simply drawn from a certain
Gaussian or uniform distribution. The key to reproducing a neural
network is the input data plus the hyper-parameters, such as the
learning rate used during gradient descent.
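As a hedged illustration (toy dimensions, plain Python rather than any
real framework): the "initial weights" are just samples anyone can
re-draw from a distribution, whereas the learning rate is a
hyper-parameter you must know to reproduce a training run:

```python
import random

def glorot_uniform(fan_in, fan_out, rng):
    """Glorot/Xavier initialization: draw each weight from
    U(-limit, +limit) with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = (6.0 / (fan_in + fan_out)) ** 0.5
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

def sgd_step(weights, grads, learning_rate):
    """One gradient-descent update: w <- w - lr * grad.
    The learning rate (a hyper-parameter) shapes the result as much
    as the data does."""
    return [[w - learning_rate * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, grads)]
```

Publishing the output of glorot_uniform() tells you nothing you could
not regenerate yourself in a millisecond, which is why the initial
weights carry no reproducibility value.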
> A better way to think about this is the AI they created is just another
> chess (or Go or whatever) playing game, no different in principle to
> chess games already in Debian. However, it's move pruning/scoring
> engine was created by a non human intelligence. The programming
> language that intelligence uses (the weights on a bunch of
> interconnected polynomials) and the way it reasons (which boils down
> to finding the minima of a high-dimensional curve using Newton's method
> to slide down the slope) is not something human minds are built to cope
> with. But even though we can't understand them these weights are the
> source, as if you give them to a similar AI it can change the program.
> In principle the DFSG is fulfilled if we don't discriminate against non-
> human intelligences.
>
> Apart from the "non-human" intelligence bit none of this is different
> to what we _already_ accept into Debian. It's very unlikely I could
> have sensible contributions to the game engines of the best chess,
> backgammon or Go programs Debian has now. I have no doubt I could
> understand the source, but it would take me weeks / months if not years
> to understand the reasoning that went into their move scoring engines.
> The move scoring engine happens to be the exact thing Alpha-Zero's AI
> (another thing I can't modify) replaces. In the case of chess at
> least they will have a database of end games they rely on, a database
> generated by brute force simulations generated using quantities of CPU
> cycles I simply could not afford to do.
>
> Nonetheless, cost is an issue. To quantify it I presume they will be
> able to rent the hardware required from a cloud provider - possibly we
> could do that even now. But the raw cost of that 250 MW hour of power
> is $30K, and I could easily imagine it doubling many times as it goes
> through the supply chain so as another wild guess you are probably
> looking at $1M to modify the program. $1M is certainly not "free" in
> any sense of the word, but then the reality no other Debian development
> is free either. All development requires computers and power which
> someone has to pay for. The difference now is merely one of a few
> added noughts, and those noughts exclude almost all of us from working
> on the source. But I'd be surprised if there aren't Debian users out
> there who *do* have the means to fiddle with these programs if they had
> the weights and the source used to create them. Which means anyone
> could work on them if they had the means - but I don't have the means.
> *shrug*
Yes, cost is a big issue. The point of my original post was exactly
the time cost. And sometimes there is a hardware cost too.
> Which is how I reach the opposite conclusion to Ian. If Deep Mind
> released Alpha-Zero source code under a suitable licence, plus some
> example neural networks they generated with it (that happen to be the
> bit everyone uses), Debian rejecting the example networks as they
> "aren't DFSG-free" would be a mistake. I view one of our roles as advancing
> free software, all free software. Rejecting some software because we
> humans don't understand it doesn't match that goal.
According to the previous discussion, the two biggest problems are:
1. the license of the big data;
2. it is hard for a user to modify or reproduce such a work with a
   purely free software stack.
They seem very hard to solve. The talk at DebConf12 that pabs pointed
out raised nearly the same problems. Now it's 2018, six years have
passed, and there appears to have been no progress on this point.
[1] https://en.wikipedia.org/wiki/Supervised_learning
[2] https://en.wikipedia.org/wiki/Unsupervised_learning
[3] https://en.wikipedia.org/wiki/Reinforcement_learning
[4] https://en.wikipedia.org/wiki/Grand_Theft_Auto_V