
Re: data sets and/or access to data sets



The web interface will have to go to non-free because it depends on several GWT libraries that are not packaged for Debian, so we need to ship the JAR files with the software. The "core", i.e. the main program that the web interface calls, will go into main.

The core can also be used without the web interface, but the interface is definitely a plus.

Olivier

On 2/16/11 5:53 PM, Scott Christley wrote:
Hello,

I haven't followed the emails about biomaj packaging closely, but I saw several mentions that it may have to go into non-free. It does look like an interesting technology that already has the capability to collect data from various servers.

Scott

On Feb 16, 2011, at 4:06 AM, Olivier Sallou wrote:

Hi,
there is the BioMAJ tool, currently in packaging for Debian.
BioMAJ takes a property file that describes a remote data bank and the post-processing to apply to the downloaded data.
You can get more info at http://biomaj.genouest.org.
BioMAJ handles the download and the post-processing, and provides an interface (web or command-line) to query bank status (the directory where a bank is available, additional info, etc.).

It is used in several bioinformatics labs worldwide.

One way to achieve what you describe would be to provide the BioMAJ property files as packages with post-processing dependencies (a Depends on bowtie, for example). Such a package would install the bank properties file and any post-processing scripts into BioMAJ.

Once BioMAJ is officially packaged in Debian, I plan to package the existing property files the community has developed around the tool (available on the web site).
Those property files would form an add-on package to BioMAJ (for the moment they have to be copied manually into the BioMAJ bank directory).
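To give a concrete feel for what such an add-on package would ship, here is a sketch of a bank property file. The key names are recalled from BioMAJ's documented bank-property format and may differ between versions; the server, paths, and file patterns are purely illustrative:

```properties
# Illustrative BioMAJ bank description (key names may vary by BioMAJ version)
db.fullname="Human genome (illustrative example)"
db.name=human_genome

# Where to fetch the bank from (hypothetical mirror)
protocol=ftp
server=ftp.example.org
remote.dir=/pub/genomes/human

# Which remote files to download, and which to keep locally
remote.files=^.*\.fa\.gz$
local.files=^.*\.fa$
```

A package like the one discussed above would drop a file of this kind into the BioMAJ bank directory and declare dependencies on the tools its post-processing scripts need.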

Regards

Olivier

On 2/15/11 9:24 PM, Scott Christley wrote:
Hello,

I wonder if anybody has thought about providing large data sets (genomes, microarray data, etc.) as Debian packages, in a way that makes it easy for users to get those data sets onto their machines and use the various tools. I can think of many ways this would be useful.

For example, suppose a user has high-throughput sequencing data that they need to align to a genome. There is a tool available in Debian called bowtie that will do the job, but the user needs to 1) download the genome and 2) build the bowtie index. Wouldn't it be great if you could just type:

apt-get install bowtie-human-genome-index

which would install the genome and the pre-built indexes, so they could run bowtie directly.
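Such a package would essentially automate the manual steps users perform today. A sketch of those steps, assuming bowtie is installed and using a placeholder download URL (pick your actual genome build and mirror):

```shell
# Fetch a reference genome FASTA (placeholder URL, not a real mirror)
wget http://example.org/genomes/human_genome.fa.gz
gunzip human_genome.fa.gz

# Build the bowtie index; writes human_genome.*.ebwt index files
# alongside, which is what bowtie itself needs at alignment time
bowtie-build human_genome.fa human_genome
```

The hypothetical bowtie-human-genome-index package would ship the resulting index files so that this step never has to run on the user's machine.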

Another example: if you want to run your own BLAST searches, why not a package that ships the BLAST database indexes:

apt-get install BLAST-human-genome
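Again, this would replace a manual step. With the NCBI BLAST+ tools, building a nucleotide database from a genome FASTA looks roughly like this (a sketch; file names are illustrative):

```shell
# Build a BLAST nucleotide database from a genome FASTA
# (illustrative input file name)
makeblastdb -in human_genome.fa -dbtype nucl -out human_genome

# Query it with your own sequences
blastn -db human_genome -query my_reads.fa
```

A BLAST-human-genome package could ship the pre-built database files and place them where blastn finds them by default.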

The nice thing is that all of these data sets could live in a global directory space such as /usr/share, so the tools could share the data without duplication and it would be available to all users on the system. Right now every user has to figure out how to manage their data individually, which can be difficult for biologists.

What do you think?

Scott


--
gpg key id: 4096R/326D8438  (pgp.mit.edu)
Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438



--
To UNSUBSCRIBE, email to debian-med-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/4D5BA1BA.4060405@irisa.fr





