
Re: Packaging special packages



Hi Ole,

From what I understand there are really two questions:

- should we upload small "pipeline" packages?

NB Julian's comment that "They are not useful for the general public" imho
should not be of concern here -- probably a major part of the Debian
archive is not useful for the general public, and that is what makes
Debian Debian -- it is useful for everyone!

  if the concern here is that they are small and useless without data, then
  it is probably not worthwhile uploading them into main unless the data
  becomes available there as well.  If a pipeline could (theoretically) be
  used without data but is too small on its own -- maybe it is worth
  bundling the pipelines into a single package?

  some of our (NeuroDebian) packages also require vast data but are in
  general usable without it and large enough, so we do ship those tools
  in Debian proper while providing 'complementary' data packages
  from a dedicated 'data' suite of NeuroDebian.

- Data packages:  if it is just 2M I would probably try to upload
  that tiny pipeline + data in a single package (might need 2 pkgs if
  the pipeline is of arch 'any').  It should fulfill the requirement "to not
  pollute the archive with tiny packages" since the data would add the weight ;)

  If it is 100M -- indeed a new resolution would be needed since the archive
  atm does not welcome data packages per se.  It might then go to
  contrib as a 'downloader' package.  If the pipeline could be used without
  data, I would "Recommend" the data from contrib (I forget now whether that
  is legit according to policy; if not -- Suggest)
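
  To make the 100M case concrete, here is a rough sketch of how the two
  control stanzas might look.  All package names are hypothetical, and I am
  assuming the pipeline and the downloader come from separate source
  packages.  As far as I read Policy 2.2.1, a package in main must not
  declare Recommends on a package outside main, so the sketch uses Suggests:

```
# debian/control of the pipeline source package (main); names are hypothetical
Package: eso-pipeline-foo
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}
# Suggests rather than Recommends: the calibration package lives in contrib
Suggests: eso-pipeline-foo-calib
Description: data reduction pipeline for the (hypothetical) FOO instrument
 Pipeline code only; the calibration files are provided separately.

# debian/control of a separate downloader source package (contrib)
Package: eso-pipeline-foo-calib
Section: contrib/science
Architecture: all
Depends: ${misc:Depends}
Description: downloader for the FOO pipeline calibration data
 Fetches the calibration files at install time instead of shipping them.
```

  For the 2M case the same split into an arch 'any' and an arch 'all'
  package would apply, only with the data shipped inside the 'all' package
  in main, so no Suggests-vs-Recommends question arises.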

On Wed, 11 Sep 2013, Ole Streicher wrote:

> Hi Science packagers,

> Some time ago, there was a small discussion between me and Julian Taylor
> about the packaging of a special package [1], which was also forwarded
> to this list.

> However, I would like to re-start this discussion now and get some more
> opinions since the problem may exist for a couple of scientific software
> packages:

> The European Southern Observatory runs one telescope (VLT) in Chile
> which uses several "instruments" (camera, spectrograph etc.) to get the
> data. The data processing for these instruments is very specific and is
> done in so-called "pipelines", from which about 20 exist [2]. Their
> structure is quite similar, so once the first pipeline is packaged, the
> rest doesn't require much effort. The dependencies of the pipelines are
> already packaged in Debian.

> However, there is one criticism that was brought up by Julian: every
> pipeline can be used only for one specific instrument on this unique
> telescope. If one doesn't have observational data from the VLT for that
> specific instrument, the pipeline is worthless. And usually all
> observations are done by a specific request of a scientist to fulfill
> his needs.

> On the other hand, these data become freely available for everyone [3],
> allowing (and encouraging) the scientific re-use by the whole
> community. Especially for scientists without direct access to the
> telescope, this is an excellent opportunity for scientific work. I think
> that this perfectly fits into the best goals of the "Debian ethics".

> Typically, binary packages of the pipeline code would have a footprint
> of <~ 1 MB. However, the pipelines are usually accompanied by a
> calibration data set, whose size ranges from some 100 kB to ~100 MB. The
> calibration files are needed to actually run the pipeline with some
> scientific result. I think it would be in any case too much to put all
> these data onto Debian mirrors, just for the few astronomers out there.

> So, having a package that downloads and installs the calibration data
> would be the best here, right? But this would make the packages no
> longer self-contained. Would that be a legal problem for a Debian
> package in main?

> What do you think: is it worth uploading these "pipeline" packages to
> Debian? Or is it better to keep them in some personal repository?

> Best regards

> Ole

> [1] http://bugs.debian.org/709330
> [2] http://www.eso.org/sci/software/pipelines/
> [3] http://www.eso.org/UserPortal
-- 
Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Senior Research Associate,     Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

