
Re: JPL Planetary Ephemeris DE405



Ben Finney <bignose@debian.org> writes:
>> for example, there is no "source code" for DE405. There is just no
>> "preferred way to edit" for such a database -- these database are
>> created from observation and not thought to be edited by hand.
>
> The freedoms that the recipients are to be granted, to satisfy the DFSG,
> are not limited by what the original distributors imagine.

For (scientific) data, they can't: the DFSG requires "source code". To
take the definition from the Linux Information Project (out of laziness,
quoted via Wikipedia):

| Source code (...) is the version of software as it is originally
| written (i.e., typed into a computer) by a human in plain text (i.e.,
| human readable alphanumeric characters).

Now let's take the detailed measurement data of the earth's rotation in
JPL DE405. Does it meet these criteria? Was it originally written by a
human?

Clearly not. It is generated.

Is it the preferred form of the work for making modifications (GPL def)?

Clearly not. The preferred form would be to repeat the observations.

So, what could be the "source code"? The data were compiled from a huge
number of observations from many scientific fields (radio astronomy and
satellite observations, to name two of them), each of them connected
with a complex analysis. Is the raw input data of each of these
observations the "source code"? Obviously not, either. "Raw input" for
radio astronomy may, for example, mean the real-time data stream from
the antenna before a filter is applied. Even if you had this and counted
it as input, it would require repeating the complete analysis chain
within Debian -- something that cannot work at all: we don't have enough
manpower to gather and repeat all those analysis steps ourselves, and we
don't have enough machine power. And even then you reach the point
where you can't change the data because you can't repeat a transient
observation.

So: science data has no source code.

Therefore, if we want to have scientific data in Debian, we have to be
"creative" here, and think of a good, pragmatic analogue to "source
code". To formulate it as a test: imagine that a scientist found a
better way to determine the data. Would he be legally and practically
able to replace the data with his own? This would keep us free from a
lock-in to, for example, JPL: even if JPL were closed at some point in
the future (or no longer wanted to provide the data), someone else
could step in and (with the required effort) take over.

This would require two things (instead of DFSG §2: Inclusion of source code):

1. it must be legal to replace those data 

2. the data must have a form that allows a scientist to replace them
   - properly documented
   - in a format that is writable for the scientist (with Free Software)

It is, however, not required that the format be ASCII. For example, an
SQLite database or an (astronomy) FITS file would be fine as well.
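
To illustrate what "writable with Free Software" could mean for the
SQLite case, here is a minimal sketch using only Python's standard
sqlite3 module. The table layout, the function names and the numbers
are made up for this illustration; they are not the format of DE405 or
of any real dataset.

```python
import sqlite3


def create_dataset(conn):
    """Create a hypothetical, documented layout for a measurement series."""
    conn.execute(
        "CREATE TABLE measurement (epoch REAL PRIMARY KEY, value REAL NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE provenance (name TEXT PRIMARY KEY, note TEXT NOT NULL)"
    )


def replace_dataset(conn, rows, source):
    """Replace all measurements and record who determined the new ones."""
    with conn:  # one transaction: committed on success, rolled back on error
        conn.execute("DELETE FROM measurement")
        conn.executemany(
            "INSERT INTO measurement (epoch, value) VALUES (?, ?)", rows
        )
        conn.execute(
            "INSERT OR REPLACE INTO provenance (name, note) VALUES ('source', ?)",
            (source,),
        )


# Placeholder values only, not real observations.
conn = sqlite3.connect(":memory:")
create_dataset(conn)
replace_dataset(conn, [(2451545.0, 1.0), (2451546.0, 1.1)], "original distributor")
replace_dataset(conn, [(2451545.0, 1.05)], "independent re-determination")
rows = conn.execute("SELECT epoch, value FROM measurement ORDER BY epoch").fetchall()
```

The point is only that the replacement needs nothing beyond a
documented layout and freely available tools; whether the new values
are any good remains a scientific question, not a packaging one.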

Such a rule is harder for the ftp-masters, since they ultimately need to
trust some expert on whether the data is "properly documented", but I
see no other way to have scientific data in Debian.

We could of course even question whether we want to. But if we decide
not to include scientific data in general, we will lose a large number
of science applications (and games etc. based on scientific
observations). Moreover, the use of scientific data in "ordinary"
software is strongly increasing.

> If a recipient of Debian gets it into their head, for any reason or no
> reason, to modify and re-distribute the work, the Debian Social Contract
> promises that they are permitted to do that; so the work's copyright
> license must permit that.

Copyright is possible only for creative works.

Science itself is certainly a creative process; however, the creative
outcome of science is the papers, and they are protected by copyright.

The possibility of copyrighting (scientific) data is very limited: data
as such are not copyrightable at all (or who has the copyright on the
electron mass?). Collections of data ("databases") are protectable by
copyright in the EU only when their selection or arrangement is
"creative", which excludes scientific databases: their selection and
arrangement is done by formal criteria, not by creativity. For the
US I don't know; it seems, however, that there is no database protection
at all.

With no copyright protection, there are no copyright licenses,
regardless of what the distributor of a database says.

>> So, it is just wrong to apply software licenses to databases like DE405.
>
> That's contrary to the position of the FTP masters, and contrary to the
> Debian Social Contract §1.

It would be great if the ftp-masters could contribute their view to this
discussion. I also don't see why this is contrary to "free redistribution".
Please explain.

Best regards

Ole

