Organised mirroring of public databases

To: debian-med@lists.debian.org
Subject: Organised mirroring of public databases
From: Steffen Moeller <moeller@inb.uni-luebeck.de>
Date: Fri, 18 Jan 2008 11:04:37 +0100
Message-id: <479079B5.8060001@inb.uni-luebeck.de>

Dear all,

I parked here

http://svn.debian.org/wsvn/debian-med/trunk/community/infrastructure/getData.pl?op=file&rev=0&sc=0

a script which allows the download of external databases in a fairly
straightforward manner. It is far from perfect, but it may help us to
get organised towards that shared aim.

The tool should be extended to allow
 * the addition of databases locally (but hey, since we are on svn and
the databases mostly public, there should not be much of a need to add
databases for oneself only)
 * versioning of databases. Most sites keep past releases available for
a while, and this should be modelled properly.
 * the formal specification of subsets of the databases, like only
mammalian or human data, if offered as such by upstream maintainers.
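
To make the three extensions above a bit more concrete, here is a rough
sketch of how a database entry with releases and subsets might be
modelled. It is written in Python purely for illustration and is not taken
from getData.pl. The class name, the field names, the "exampledb" entry and
the ftp.example.org URL are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Database:
    """One upstream database. All field names are hypothetical."""
    name: str
    url_template: str                 # placeholders: {release}, {subset}
    releases: tuple = ("current",)    # ordered oldest to newest
    subsets: tuple = ("all",)         # only subsets offered upstream

    def url(self, release: Optional[str] = None, subset: str = "all") -> str:
        """Return the download URL, defaulting to the newest release."""
        if release is None:
            release = self.releases[-1]
        if release not in self.releases:
            raise ValueError(f"{self.name}: unknown release {release!r}")
        if subset not in self.subsets:
            raise ValueError(f"{self.name}: subset {subset!r} not offered upstream")
        return self.url_template.format(release=release, subset=subset)


def register(registry: dict, db: Database) -> None:
    """Add a database to a registry, e.g. one kept only locally."""
    if db.name in registry:
        raise ValueError(f"database {db.name!r} is already registered")
    registry[db.name] = db


# Hypothetical entry; the host and file layout are placeholders.
registry = {}
register(registry, Database(
    name="exampledb",
    url_template="ftp://ftp.example.org/pub/exampledb/{release}/{subset}.dat.gz",
    releases=("1.0", "1.1"),
    subsets=("all", "mammalian", "human"),
))

print(registry["exampledb"].url(subset="human"))
```

Asking for a release or subset that is not listed raises an error rather
than guessing a URL, which seems the safer default when upstream layouts
differ between sites.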

We should not (immediately) think of
 * the specification of local mirrors of some public site
 * disk space issues
 * dependencies between downloaded datasets, e.g., the automated rewrite
of EMBL format to FASTA, since such converted files are available online
as well. This would induce ambiguities and possibly also increase the
bandwidth used.

So, which databases should we address first? The small ones, I suggest.

Best regards

Steffen
