Re: Large static datasets like genomes (Re: Reasonable maximum package size ?)

To: Debian Developers <debian-devel@lists.debian.org>
Subject: Re: Large static datasets like genomes (Re: Reasonable maximum package size ?)
From: Steffen Moeller <steffen_moeller@gmx.de>
Date: Sun, 10 Jun 2007 19:38:25 +0200
Message-id: <200706101938.25780.steffen_moeller@gmx.de>
In-reply-to: <B3368746-DBE3-4269-9A67-14A6FADB2AF0@chiark.greenend.org.uk>
References: <20070605080907.GA3416@gloin> <200706091228.29410.steffen_moeller@gmx.de> <B3368746-DBE3-4269-9A67-14A6FADB2AF0@chiark.greenend.org.uk>

On Sunday 10 June 2007 17:20:54 you wrote:
> On 9 Jun 2007, at 11:27 am, Steffen Moeller wrote:
> > Once a (computational) biologist starts a new project, (s)he wants the
> > latest data no matter what and anything older than three months (or a
> > week sometimes) is likely not to be acceptable.
>
> Actually, my experience is that they tend to want diametrically opposite
> things, at the same time.
>
> 1)  When starting a new project, they usually want the very latest data.
> 2)  But they usually then want to keep that data static for the lifetime
>     of the project.

:o) very true. For 1) I think that Debian packages for databases do not work.
They might well work for 2), though.

But ... how can one directly address a feature on the genome across releases
of Ensembl when it has no accession number because you have only just found it?

* base pairs and chromosome ID do not work across (NCBI) releases
* centiMorgans are too vague
* distances in bp relative to the nearest genomic marker? Not too bad,
  probably.
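
To make the last option concrete, here is a rough Python sketch of
marker-relative addressing. Everything in it is an assumption for
illustration: the marker names, the coordinates and both helper functions are
invented, and it ignores strand as well as any insertions or deletions
between the marker and the feature, so the resolved position is only
approximate.

```python
from typing import Dict, Tuple

# Marker name -> (chromosome, position in bp). All values here are invented.
Markers = Dict[str, Tuple[str, int]]


def to_marker_relative(chrom: str, pos: int, markers: Markers) -> Tuple[str, int]:
    """Express a position as (nearest marker on the same chromosome, signed offset in bp)."""
    candidates = [
        (abs(pos - mpos), name, mpos)
        for name, (mchrom, mpos) in markers.items()
        if mchrom == chrom
    ]
    if not candidates:
        raise ValueError(f"no marker known on {chrom}")
    _, name, mpos = min(candidates)
    return name, pos - mpos


def from_marker_relative(name: str, offset: int, markers: Markers) -> Tuple[str, int]:
    """Resolve a (marker, offset) pair against the marker table of another release."""
    chrom, mpos = markers[name]
    return chrom, mpos + offset


# Hypothetical marker tables for two releases; the markers have shifted a little.
release_a: Markers = {"MARKER_1": ("chr7", 1_000_000), "MARKER_2": ("chr7", 2_000_000)}
release_b: Markers = {"MARKER_1": ("chr7", 1_004_200), "MARKER_2": ("chr7", 2_003_900)}

anchor = to_marker_relative("chr7", 1_250_000, release_a)
resolved = from_marker_relative(*anchor, release_b)
print(anchor)    # ('MARKER_1', 250000)
print(resolved)  # ('chr7', 1254200)
```

The stored pair (marker name, offset) is what one would keep in the lab
notes; the absolute coordinate is recomputed for whichever release is
installed.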

The easiest approach indeed seems to be to keep the data on which the results
were computed, which is what you diagnosed as behaviour 2).  And 1) is done in
order to be reasonably up to date, at least when the journal's reviewers
inspect the work :o)  I actually think that Debian packages can help at least
with the tools used for the analysis, since updating them is technically
easy ... unless, that is, you have some Perl 5.0.x-specific code. Ouch.

Many greetings

Steffen
