Package: wnpp
Severity: wishlist
* Package name : express
Version : 1.5.1
* License : Artistic-2.0
Programming Lang: C++
Description : Streaming quantification for high-throughput sequencing
eXpress is a streaming tool for quantifying the abundances of a set of
target sequences from sampled subsequences. Example applications include
transcript-level RNA-Seq quantification, allele-specific/haplotype
expression analysis (from RNA-Seq), transcription factor binding
quantification in ChIP-Seq, and analysis of metagenomic data. It is
based on an online-EM algorithm that results in space (memory)
requirements proportional to the total size of the target sequences and
time requirements that are proportional to the number of sampled
fragments. Thus, in applications such as RNA-Seq, eXpress can accurately
quantify much larger samples than other currently available tools,
greatly reducing computing infrastructure requirements. eXpress can be
used to build lightweight high-throughput sequencing processing
pipelines when coupled with a streaming aligner (such as Bowtie), as
output can be piped directly into eXpress, effectively eliminating the
need to store read alignments in memory or on disk.
.
In an analysis of the performance of eXpress for RNA-Seq data, we have
observed that this efficiency does not come at the cost of accuracy.
eXpress is more accurate
than other available tools, even when limited to smaller datasets that
do not require such efficiency. Moreover, like the Cufflinks program,
eXpress can be used to estimate transcript abundances in multi-isoform
genes. eXpress is also able to resolve multi-mappings of reads across
gene families, and does not require a reference genome so that it can be
used in conjunction with de novo assemblers such as Trinity, Oases, or
Trans-ABySS. The underlying model is based on previously described
probabilistic models developed for RNA-Seq but is applicable to other
settings where target sequences are sampled, and includes parameters for
fragment length distributions, errors in reads, and sequence-specific
fragment bias.
.
eXpress can be used to resolve ambiguous mappings in other
high-throughput sequencing-based applications. The only required inputs
to eXpress are a set of target sequences and a set of sequenced
fragments multiply-aligned to them. While these target sequences will
often be gene isoforms, they need not be. Haplotypes can be used as the
reference for allele-specific expression analysis, binding regions for
ChIP-Seq, or target genomes in metagenomics experiments. eXpress is
useful in any analysis where reads multi-map to sequences that differ in
abundance.

eXpress is a dependency of trinityrnaseq. The Debian Med team will
maintain it as a group.
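
As a side note for reviewers, the online-EM idea in the description above
(memory proportional to the targets, a single pass over the fragments) can
be illustrated with a toy Python sketch. This is not eXpress's
implementation: the function name online_em, the uniform starting mass, and
the constant update weight are assumptions made for illustration, and the
fragment length, read error, and bias parameters mentioned in the
description are left out.

```python
def online_em(fragments, targets):
    """Toy streaming EM for multi-mapped fragments (illustrative only).

    fragments: iterable of lists, each holding the names of the targets
               that one fragment aligns to (consumed once, never stored).
    targets:   iterable of all target names.
    Returns a dict mapping target name to estimated relative abundance.
    """
    # State is one number per target, so memory scales with the target
    # set and not with the number of fragments. Uniform start (assumption).
    mass = {t: 1.0 for t in targets}

    for hits in fragments:
        hits = set(hits)  # ignore duplicate alignments to the same target
        total = sum(mass[t] for t in hits)
        # E-step for this fragment only: split it across its candidate
        # targets in proportion to the current abundance estimates.
        share = {t: mass[t] / total for t in hits}
        # M-step folded into the stream: update the running masses.
        for t, s in share.items():
            mass[t] += s

    norm = sum(mass.values())
    return {t: m / norm for t, m in mass.items()}


if __name__ == "__main__":
    # Ten fragments unique to A, then ten that map to both A and B.
    stream = ([["A"]] * 10) + ([["A", "B"]] * 10)
    print(online_em(iter(stream), ["A", "B"]))
```

Because the fragments are consumed one at a time from an iterator, the
same loop could in principle read alignments from a pipe, which is the
property the description relies on when pairing eXpress with a streaming
aligner.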