Re: data sets and/or access to data sets



Hi Scott,

I think your idea is quite reasonable in principle.  As far as I
understand (though I have not dived into this), the getData effort[1] is
one step in this direction, and the soon-to-be-uploaded package Biomaj
does something that might be helpful as well.

Regarding actually building packages:  There have been several ideas in
the past to have some data.debian.org archive containing large data sets,
which is where the packages you suggest would probably fit.  However,
to the best of my knowledge this has not yet been implemented for
practical use.

Do we want to take another shot at a Google Summer of Code project
in this direction?

Kind regards

    Andreas.

[1] http://wiki.debian.org/getData

On Tue, Feb 15, 2011 at 02:24:46PM -0600, Scott Christley wrote:
> Hello,
> 
> I wonder if anybody has thought about providing large data sets, like genomes, microarray data, etc. into debian "packages" in a way that makes it easy for users to get those data sets onto their machine, making it easier to use various tools?  I can think of many great ways this would be useful.
> 
> For example, suppose a user has high-throughput sequencing data that they need to align to a genome.  There is a tool available in Debian called bowtie that will do the job, but the user needs to 1) download the genome and 2) generate the bowtie index.  Wouldn't it be great if you could just type:
> 
> apt-get install bowtie-human-genome-index
> 
> which installed the genome and the pre-built indexes, then they could just run bowtie directly.
> 
> Or another example is wanting to do your own BLAST searches, why not a package that has the BLAST database indexes:
> 
> apt-get install BLAST-human-genome
> 
> What is nice is all of these data sets could be maintained in a global directory space, like /usr/share, so all of the tools could share this space preventing duplication, and made available to all users on the system.  Right now every user has to figure out how to manage their data individually which can be difficult for biologists.
> 
> What do you think?
> 
> Scott
> 
> 
> -- 
> To UNSUBSCRIBE, email to debian-med-REQUEST@lists.debian.org
> with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
> Archive: http://lists.debian.org/DE3C6125-DCE5-4414-8D0C-41DF50764F07@mac.com
> 
> 

-- 
http://fam-tille.de

