Corresponding source for a machine-learning data set (was: Concerns to software freedom when packaging deep-learning based applications.)
Ian Jackson <ijackson@chiark.greenend.org.uk> writes:
> […] In the case of a pretrained neural network, the source code is the
> training data.
>
> In fact, they are probably not redistributable unless all the training
> data is supplied, since the GPL's definition of "source code" is the
> "preferred form for modification". For a pretrained neural network
> that is the training data.
One hopeful sign is that people are addressing the need for the source
form of a machine-learning product. In this case, it's by offering a
version control system customised to machine-learning data.
<URL:https://blog.dataversioncontrol.com/data-version-control-tutorial-9146715eda46>
Is that an approach we can recommend to upstream developers who publish
machine-learning data sets?
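
To make concrete what "versioning the training data" could mean, here
is a minimal sketch in Python. It is not taken from the tool linked
above and does not use its interface; the function names
(file_digest, build_manifest, write_manifest) and the manifest layout
are assumptions made up for illustration. The idea is only that a small
text manifest of content hashes can live in an ordinary VCS next to the
training code, so a given model can be tied to an exact data set.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict:
    """Map each file's path (relative to data_dir) to its digest."""
    return {
        path.relative_to(data_dir).as_posix(): file_digest(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def write_manifest(data_dir: Path, manifest_path: Path) -> None:
    """Write the manifest as stable, diff-friendly JSON.

    Hypothetical layout: one JSON object, keys sorted, so that a
    change to any data file shows up as a one-line diff.
    """
    manifest = build_manifest(data_dir)
    manifest_path.write_text(
        json.dumps(manifest, indent=2, sort_keys=True) + "\n"
    )
```

A manifest like this only identifies the data; whether the data itself
is published alongside it is the question that matters for the
"preferred form for modification".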
--
\ “A child of five could understand this. Fetch me a child of |
`\ five.” —Groucho Marx |
_o__) |
Ben Finney