
Re: Re: Concerns to software freedom when packaging deep-learning based applications.



Hi Russell,

> On Thu, 2018-07-12 at 18:15 +0100, Ian Jackson wrote:
> > Compare neural networks: a user who uses a pre-trained neural network
> > is subordinated to the people who prepared its training data and set
> > up the training runs.
> 
> In Alpha-Zero's case (it is Alpha-Zero the original post was about)
> there is no training data.  It learns by being run against itself.
> Intel purchased Mobileye (the system Tesla used to use, and maybe still
> does) with largely the same intent.  The training data in that case is
> labelled videos resembling dash cam footage.  Training the neural
> network requires huge amounts of it, all of which was produced by
> Mobileye by having humans watch the video and label it.  This was
> expensive and eventually unsustainable.  Intel said they were going to
> attempt to train the network with videos produced by game engines.  I
> haven't seen much since Intel purchased Mobileye, however if they
> succeed we are in the same situation - there is no training data.  In
> both cases it is just computers teaching themselves.

To be clear, there are three main types of learning: (1) supervised
learning[1]; (2) unsupervised learning[2]; (3) reinforcement learning[3].

AlphaGo-Zero is based on reinforcement learning, but it is a bit special:
meaningful data can be generated by the program itself, through self-play
over the state space of the game board. However, much other current
reinforcement learning research uses data that is not easy to generate
algorithmically. For example, I remember a group of people trying to
teach a neural network to drive a car by letting it play
Grand_Theft_Auto_V[4].

Supervised learning (which requires labeled data) and unsupervised
learning (which requires unlabeled data) often need a large amount of
data, and that data may come with license restrictions. As for
reinforcement learning's data ... well, I'm confused and I don't want
to dig deeper.
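
To make the difference concrete, here is a minimal Python sketch (my own
illustration; the file names, labels and game logic are made up) contrasting
where the training data comes from in supervised learning versus
AlphaGo-Zero style self-play:

    import random

    # Supervised learning: the dataset is fixed, externally produced and
    # possibly license-encumbered -- e.g. labelled dash-cam frames.
    labelled_dataset = [("frame_0001.png", "pedestrian"),   # hypothetical labels
                        ("frame_0002.png", "clear_road")]

    # Self-play reinforcement learning (AlphaGo-Zero style): the "dataset"
    # is generated on the fly by the program playing against itself, so
    # there is no external training corpus that could carry a license.
    def self_play_episode():
        """Return (state, outcome) pairs from one randomly played game."""
        states = ["state_%d" % i for i in range(random.randint(5, 10))]
        outcome = random.choice([+1, -1])   # win/loss for the first player
        return [(s, outcome) for s in states]

    generated = [pair for _ in range(100) for pair in self_play_episode()]
    print(len(labelled_dataset), "externally labelled examples")
    print(len(generated), "self-generated examples")

The license question only really arises for the first kind of dataset; the
self-play data is produced by the program itself.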
 
> The upshot is I don't think focusing on training data or the initial
> weights is a good way to reason about what is happening here.  If Deep
> Mind released the source code for Alpha-Zero, anyone could in principle
> reproduce their results if you define their result as I'm pretty sure
> they do: produce an AI capable of beating any other AI on the planet at
> a particular game.  The key words are "in principle" of course, because
> the other two ingredients they used were 250 MWh of power (a wild
> guess on my part) and enough computers to be able to expend that in 3
> days.

Releasing the initial weights doesn't make sense. The initial weights
of state-of-the-art neural networks are simply drawn from a certain
Gaussian distribution or a certain uniform distribution. The key to
reproducing a neural network is the input data + the hyper-parameters,
such as the learning rate used during gradient descent.
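
To be concrete, here is a minimal sketch of what "initial weights" means in
practice (my own illustration; frameworks like PyTorch or TensorFlow do the
equivalent internally, and the layer sizes are arbitrary). The point is that
the initial weights carry no information, they are just random draws, so
publishing them adds nothing:

    import numpy as np

    np.random.seed(42)              # arbitrary seed, only for repeatability
    fan_in, fan_out = 784, 128      # hypothetical layer size

    # Gaussian initialization:
    w_gaussian = np.random.normal(0.0, np.sqrt(2.0 / fan_in),
                                  size=(fan_in, fan_out))

    # Uniform (Glorot/Xavier-style) initialization:
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    w_uniform = np.random.uniform(-limit, limit, size=(fan_in, fan_out))

    # What actually determines the trained network: the training data plus
    # hyper-parameters such as the learning rate used by gradient descent.
    hyper_parameters = {"learning_rate": 0.01, "batch_size": 64, "epochs": 10}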
 
> A better way to think about this is the AI they created is just another
> chess (or Go or whatever) playing game, no different in principle to
> chess games already in Debian.  However, its move pruning/scoring
> engine was created by a non-human intelligence.  The programming
> language that intelligence uses (the weights on a bunch of
> interconnected polynomials) and the way it reasons (which boils down
> to finding the minima of a high dimensional curve using Newton's method
> to slide down the slope) is not something human minds are built to cope
> with.  But even though we can't understand them, these weights are the
> source, as if you give them to a similar AI it can change the program.
> In principle the DFSG is fulfilled if we don't discriminate against
> non-human intelligences.
> 
> Apart from the "non-human" intelligence bit none of this is different
> to what we _already_ accept into Debian.  It's very unlikely I could
> make sensible contributions to the game engines of the best chess,
> backgammon or Go programs Debian has now.  I have no doubt I could
> understand the source, but it would take me weeks / months if not years
> to understand the reasoning that went into their move scoring engines.
> The move scoring engine happens to be the exact thing Alpha-Zero's AI
> (another thing I can't modify) replaces.  In the case of chess at
> least they will have a database of end games they rely on, a database
> generated by brute force simulations using quantities of CPU cycles I
> simply could not afford.
>
> Nonetheless, cost is an issue.  To quantify it I presume they will be
> able to rent the hardware required from a cloud provider - possibly we
> could do that even now.  But the raw cost of that 250 MWh of power
> is $30K, and I could easily imagine it doubling many times as it goes
> through the supply chain, so as another wild guess you are probably
> looking at $1M to modify the program.  $1M is certainly not "free" in
> any sense of the word, but then in reality no other Debian development
> is free either.  All development requires computers and power which
> someone has to pay for.  The difference now is merely one of a few
> added noughts, and those noughts exclude almost all of us from working
> on the source.  But I'd be surprised if there aren't Debian users out
> there who *do* have the means to fiddle with these programs if they had
> the weights and the source used to create them.  Which means anyone
> could work on them if they had the means - but I don't have the means.
> *shrug*

Yes, cost is a big issue. The point of my original post is exactly
the "time cost". And sometimes there is a hardware cost too.
 
> Which is how I reach the opposite conclusion to Ian.  If Deep Mind
> released Alpha-Zero source code under a suitable licence, plus some
> example neural networks they generated with it (which happen to be the
> bit everyone uses), Debian rejecting the example networks as they
> "aren't DFSG free" would be a mistake.  I view one of our roles as
> advancing free software, all free software.  Rejecting some software
> because we humans don't understand it doesn't match that goal.

According to the previous discussion, the two biggest problems are:
 1. the license of the big data
 2. it's hard for a user to modify or reproduce such a work with a
    purely free software stack

These seem very hard to solve. The DebConf12 talk that pabs pointed out
raised nearly the same problem. Now it's 2018, six years have passed,
and it appears there has been no progress on this point.

[1] https://en.wikipedia.org/wiki/Supervised_learning
[2] https://en.wikipedia.org/wiki/Unsupervised_learning
[3] https://en.wikipedia.org/wiki/Reinforcement_learning
[4] https://en.wikipedia.org/wiki/Grand_Theft_Auto_V

