To: debian-devel@lists.debian.org
Subject: Re: Reasonable maximum package size ?
From: Roger Leigh <rleigh@whinlatter.ukfsn.org>
Date: Tue, 05 Jun 2007 21:47:37 +0100
Message-id: <874plm163a.fsf@hardknott.home>
In-reply-to: <20070605155833.GC10266@azure.humbug.org.au> (Anthony Towns's message of "Wed, 6 Jun 2007 01:58:33 +1000")
References: <20070605080907.GA3416@gloin> <20070605092853.GF19396@kunpuu.plessy.org> <20070605155833.GC10266@azure.humbug.org.au>

Anthony Towns <aj@azure.humbug.org.au> writes:

> On Tue, Jun 05, 2007 at 06:28:53PM +0900, Charles Plessy wrote:
>> On Tue, Jun 05, 2007 at 10:09:07AM +0200, Michael Hanke wrote:
>> > My question is now: Is it reasonable to provide this rather huge amount
>> > of data in a package in the archive?
>> many thanks for bringing this crucial question on -devel. In my field, I
>> wish that it would be possible to apt-get install the human genome for
>> instance.
>
> Are either of you going to debconf, or able to point out some example
> large (free?) data sets that should be packaged like this as a test case
> for playing with over debconf?

The NCBI non-redundant database (nr).  Having this packaged and
frequently updated (maybe in volatile) would be fantastic.  There are
also quite a number of other significant (popular) databases used for
bioinformatics, genomics, proteomics and other biological fields which
would be really nice to have in Debian.  Here's a selection:

ftp://ftp.ncbi.nih.gov/blast/db/
ftp://ftp.ncbi.nih.gov/refseq/
ftp://ftp.ncbi.nih.gov/repository/
ftp://ftp.ncbi.nih.gov/pub/taxonomy/

Because these are all in standard formats, it might even be possible
to have updated packages generated and uploaded semi-automatically.
These would be really useful in conjunction with much of the
bioinformatics software already available in Debian, which could make
good use of them if they were put in standardised locations.
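To make the "semi-automatic" idea a little more concrete, here is a
rough sketch of the update check such a job might run.  Everything in
it is an assumption for illustration: the function names are
hypothetical, the date-based version scheme is just one possible
choice, and it presumes the FTP server answers the MDTM command.

```python
from datetime import datetime, timezone
from ftplib import FTP


def remote_mtime(host: str, path: str) -> datetime:
    """Ask an FTP server for a file's modification time via MDTM.

    Assumes the server supports MDTM and replies "213 YYYYMMDDHHMMSS".
    """
    with FTP(host) as ftp:
        ftp.login()  # anonymous login
        reply = ftp.sendcmd("MDTM " + path)
    stamp = reply[4:].strip()[:14]
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def snapshot_version(mtime: datetime) -> str:
    """Derive a date-based package version (hypothetical scheme)."""
    return mtime.astimezone(timezone.utc).strftime("%Y%m%d") + "-1"


def needs_update(packaged_version: str, mtime: datetime) -> bool:
    """True when the upstream snapshot is newer than the packaged one."""
    packaged_date = packaged_version.split("-", 1)[0]
    upstream_date = snapshot_version(mtime).split("-", 1)[0]
    # YYYYMMDD strings sort chronologically, so plain comparison works.
    return upstream_date > packaged_date
```

A periodic job could call remote_mtime() on a chosen file under one of
the directories above, compare the result with the version currently
in the archive, and only rebuild and upload when needs_update() says
so.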

As has been mentioned previously, a separate archive section so that
mirrors could skip them would be nice.  Together, all these databases
are eye-wateringly huge, especially when uncompressed.

Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.
