Re: Distribution of media content together with GPLv2 code in one package?



On Mon, Apr 05, 2010 at 06:20:43PM +0200, Bernhard R. Link wrote:
> * Rudolf Polzer <divVerent@alientrap.org> [100404 12:16]:
> > One argument against supplying "full" source code commonly raised by artists,
> > is that a 3MB large music piece can depend on several gigabytes of "source
> > data", if applying the source requirement recursively.
> 
> That is not much different with classical computer programs. When
> considering everything source that was somehow involved in creating it,
> including the whole source of every project written by me from which
> I copied some lines, including all the specs and notes scanned in from the
> little pieces of paper I sometimes wrote them on, including all
> the core dumps of the editors when writing it and so on, it
> reaches gigabytes and beyond very fast.
> 
> The only difference with programs is that there is some form we call
> source code and that we usually recognize as source (unless someone did
> something arbitrarily evil, like obfuscating it).

In the case of music/sound, it gets much more difficult once there are actually
recorded parts in it.

Although a recorded voice sample of the author's voice very obviously can be
DFSG-free, it cannot really be edited. If you have a collection of such
samples, you cannot easily add one more sample in the same style - as it is
pretty hard to imitate the original author's voice. In fact, in one case
related to the Nexuiz project, we found out the hard way that not even the
original author managed to record a new voice sample that matched the previous
ones. Because of this, the tutorial now sounds a bit odd at the point where the
rerecorded voice sample plays - one can clearly hear that the speaker was
apparently in a different "mood" for that sample.

On the other hand, samples created even by commercial speech synthesis are WAY
more editable, simply because they are editable at all (you can make the
speaker say something else). How can THESE be nonfree or contrib then, while
the recorded sample - which often NOBODY, not even the original speaker, can
recreate - should be free and eligible for main? You can edit the synthesized
sample in the very same ways as the recorded sample - by using waveform editing
- but also in more ways (by rerendering it with e.g. different words).

> What the "preferred form of modification" is, is hard to decide - though it
> still is the best definition. But taking it too far never helped.
> Otherwise I've seen more than one perl script where the preferred form
> of modification would then have been the author and the stuff they must
> have smoked when writing this. (Though we usually accept the perl script
> unless it is consciously obfuscated).

Indeed. For code, the definition of source is quite simple. Code typically
consists of a string of characters. Anyone could have typed in these characters
- someone else would be able to type in the very same program. Anyone can
change a line in the program, and there is no reason why that line of the
program should NOT be in the "spirit of" the original program. This is simply
because the only part of the author that goes in is his thought, but none of
his bodily properties like race, form of the head, sex, length of vocal cords,
etc - in fact, the length of the fingers has absolutely no influence over the
code the author wrote, other than perhaps over variable naming.

Even better - what counts for a program is not the style of the code, but how
it functions. So the author-specific parts of the code DO NOT EVEN GO INTO THE
BINARY! This is why another author can change part of someone else's program
and it still feels like one single whole - because the differences in style do
not end up in the binary, while e.g. for recorded voice, they are crucial.
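
A minimal sketch of this point, assuming CPython bytecode as a stand-in for
"the binary"; the two function names are hypothetical and only serve as an
illustration. Two authors write the same computation in different styles, and
the compiled instruction stream comes out identical:

```python
# Two hypothetical authors implement the same computation in different styles.

def total_price(unit_price, quantity):
    # Author A: long names, one statement per line, comments.
    result = unit_price * quantity
    return result


def tp(p, q):
    r = p*q; return r  # Author B: terse names, everything on one line.


def same_instructions(f, g):
    """Compare the raw bytecode of two functions.

    co_code holds only the instruction stream; local variable names live in
    co_varnames and line information in a separate table, so naming,
    comments and layout do not show up here.
    """
    return f.__code__.co_code == g.__code__.co_code


if __name__ == "__main__":
    print(same_instructions(total_price, tp))
    print(total_price.__code__.co_varnames, tp.__code__.co_varnames)
```

The names still differ in the code object's metadata, much like debug symbols
would, but what actually executes is the same in both cases.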

Also, in code, there is no such ambiguity about what is generated from what.
The closest to a recorded vs synthesized voice sample you can get is e.g.
writing a parser either manually or using a parser generator. If the author of
the code then makes major changes to the result created by the parser generator,
and therefore just used the parser generator to create a "prototype" for him to
work with, then this is very much like how voice samples generated by speech
synthesis are typically used. And in such cases, where there is no automated
transform from the parser generator-written code to the code the software uses,
it is generally accepted to just supply the finalized code. In fact, I would bet
that many Debian projects in "main" indeed use hand-modified generated code
without providing the input to the code generator. For example, quite a lot of
GUI code was initially created by some GUI designer software, and then edited
and filled with functionality by hand - and in many such cases, the project
file of the GUI designer - if any - is not even provided. There is nothing
wrong with that, IMHO, given that even the original author would then do
further modification/reshuffling of dialog elements in the source code, instead
of generating the initial source file again and merging the changes into it.

Rudolf Polzer

