
Re: [EMBOSS] Files included in EMBOSS but licensed ...



Quoted in full for the benefit of the debian-med list, who missed the original posting.

On 29/07/2011 21:35, Adam Sjøgren wrote:
On Fri, 29 Jul 2011 09:39:46 +0100, Peter wrote:

It might make things clearer if someone from Debian could explain:

(I am not from Debian, but here is my take on it anyway:)

(a) why a Creative Commons licence is an issue for you

One of the fundamental software freedoms is the freedom to change the
software¹.

The Debian Free Software Guidelines' definition of free software
includes this freedom².

So the "No Derivatives" variants of the Creative Commons licenses aren't
free by the DFSG definition.

(The GNU Free Documentation License on documents with invariant sections
is considered non-free by DFSG-standards as well, even if the invariant
sections are things that nobody would want to change.)

When a project of volunteers packages 29000+ packages, I think
making a judgement call on whether it is okay that the license of a
couple of files does not live up to the guidelines is nigh impossible.

The answer to "Why would you want to?" is, because you might need to.

It is more obvious with programs and code than it is with database
entries, granted - but I guess the equivalent problem would be that the
licensor didn't want to fix a problem in such a database, and that
problem made the programs using it malfunction. It would be a pain if
you weren't allowed to fix the problem and distribute the fixed data
yourself, say, if "upstream" didn't want to include the fix for some
reason or another; maybe they happened to turn sour on the world/you -
stranger things have happened.

So probably nobody is ever going to exercise that freedom in this
specific case, I think, but ignoring some of the freedoms in special
cases is infeasible for a project such as Debian.

This is just me trying to explain how I understand it, so take it with a
grain of salt, and swing by debian-legal³ for the experts.

A specific example might help. About 5 years ago a release of the UniProt database (as plain text files) broke the Wisconsin (GCG) sequence analysis package: it introduced extremely long lines in a data file whose lines everyone had assumed were at most 80 characters long.

As GCG was closed source, the fix required a change to the UniProt files to either wrap or truncate the 'offending' records.

The fix, of course, was not to distribute changed data, but to write and distribute a simple Perl script that wrapped the long records.

That was not a licensing issue - the content stays the same, the format is changed, no changed data is distributed. But it does illustrate that the database licensing does not prevent 'fixing' a database.
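
To make the idea concrete, here is a minimal sketch of such a wrapping filter. It is written in Python rather than Perl and is not the script that was actually distributed. It assumes that each line starts with a five-character line-type prefix (for example "CC   ") that has to be repeated on every continuation line; the function names, the 80-character limit as a parameter and the prefix length are all illustrative assumptions.

```python
import textwrap


def wrap_record_line(line, width=80, prefix_len=5):
    """Split one over-long line into lines of at most `width` characters.

    Assumption (hypothetical): the first `prefix_len` characters are a
    line-type code that must be repeated on each continuation line.
    """
    line = line.rstrip("\n")
    if len(line) <= width:
        return [line]  # short lines pass through unchanged
    prefix, body = line[:prefix_len], line[prefix_len:]
    # Break at whitespace where possible; tokens longer than the
    # available width are split so no output line exceeds `width`.
    chunks = textwrap.wrap(body, width=width - prefix_len,
                           break_on_hyphens=False) or [""]
    return [prefix + chunk for chunk in chunks]


def wrap_file(lines, width=80):
    """Apply wrap_record_line to every line of a flat file."""
    out = []
    for line in lines:
        out.extend(wrap_record_line(line, width))
    return out
```

A filter like this only changes the layout: the words in each record are the same before and after, which is why distributing the script rather than the reformatted files sidesteps the redistribution question.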

(b) why you appear to consider a copy of a whole or part of a public
biological database as part of an "operating system"

They are part of a package which is included in the Debian GNU/Linux
free operating system.

I expect there are many problems that arise if data ... and documentation ... are considered to be software. For EMBOSS we didn't officially specify a license for the documentation but other packages probably do. It still worries me that some of our documentation files officially include GPL-licensed (EMBOSS) source code, but I did not like any of the alternative documentation licenses.

(I personally think it would make sense to change to a Creative Commons
license that allows derivative works - Uniprot and others are going to
be the canonical source for the data anyway, so nothing will be lost by
them by doing that, as far as I can see.)

Unlikely. The no-derivatives version is specifically there to prevent derivatives - for example Debian distributing a modified UniProt without permission.

The ontologies are similar, but do allow for the use case of importing terms from one ontology into another if the ontology name is changed (and preferably if cross-references to the original are provided). Again, the need is to protect the integrity of the original ontology content so references to a GO term or a UniProt entry are clearly defined.

This is essential for many of the public bioinformatics databases. Data and software are not the same in this context. I am curious whether documentation licensing raises any issues.

Just my 2c worth

Peter Rice
EMBOSS Team





