
Re: data sets and/or access to data sets



Just a few cents.  In the domain of neuroimaging we are also confronted
with the problem of distributing data.  Several aspects become relevant
if someone is to package data "statically" (instead of fetching it via
some data-sharing framework) into a proper Debian package:

1.  With a classical Debian package, large amounts of data get
  duplicated in both the source and binary packages.

  Although this could be overcome by some means, for our domain of interest
  http://neuro.debian.net/datasets.html provides data in both binary
  and source packages, with the idea that non-Debian users can still
  simply fetch the .orig.tar.gz if they need to get hold of the data, e.g.
  separate tarballs per subject from
  http://neuro.debian.net/debian/pool/main/h/haxby2001/

2.  What is the appropriate license for data? ;)  In quite a few
   jurisdictions data is not copyrightable per se at all, thus plain common
   licenses tailored toward software are not appropriate (even CC [1]).  The EU
   has sui generis database rights, while there is no similar mechanism in
   the States afaik, which suggests the need for license terms that
   address such differences.

   So when releasing/packaging data, a viable description of terms that
   is appropriate in different jurisdictions should be attached, e.g.,
   as recommended by Hendrik Weimer on debian-legal [2] -- the ODC Public
   Domain Dedication and Licence (PDDL) [3].


[1] http://bibwild.wordpress.com/2008/11/24/creative-commons-is-not-appropriate-for-data/
[2] http://lists.debian.org/debian-legal/2011/01/msg00049.html
[3] http://www.opendatacommons.org/licenses/pddl/1.0/

On Tue, 15 Feb 2011, Andreas Tille wrote:

> Hi Scott,

> I think your idea is quite reasonable in principle.  As far as I
> understand (but I have not dived into this), the getData effort[1] is one
> step in this direction, and the soon-to-be-uploaded package Biomaj does
> something that might be helpful as well.

> Regarding actually building packages:  There were several ideas in the
> past to have some data.debian.org archive which contains large data sets,
> where the packages you suggest would probably fit.  However,
> to the best of my knowledge this has not yet been implemented for practical
> use.

> Do we want to try another shot at a Google Summer of Code project
> in this direction?

> Kind regards

>     Andreas.

> [1] http://wiki.debian.org/getData
-- 
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic

