
Bug#652031: Add more descriptive info on what is clustalo package



Clustal-Omega is a general-purpose multiple sequence alignment (MSA)
program for proteins. It produces high-quality MSAs and is capable of
handling datasets of hundreds of thousands of sequences in reasonable
time.

In default mode, the user supplies a file of sequences to be aligned.
The sequences are clustered to produce a guide tree, which is then
used to guide a "progressive alignment" of the sequences.  There are
also facilities for aligning existing alignments to each other,
aligning a sequence to an alignment, and using a hidden Markov model
(HMM) to help guide an alignment of new sequences that are homologous
to the sequences used to make the HMM.  This last procedure is
referred to as "external profile alignment" or EPA.
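
As a rough illustration of these modes, the sketch below builds the
corresponding command lines without running them. The option names
(-i, -o, --profile1, --profile2, --hmm-in) are taken from the upstream
command-line help as I recall it and should be checked against
"clustalo --help"; the helper function and all file names are
hypothetical.

```python
def clustalo_command(output, sequences=None, profile1=None,
                     profile2=None, hmm=None):
    """Build an argument list for a clustalo run (hypothetical helper).

    Option names are assumed from the upstream help text; verify them
    with `clustalo --help` before relying on this.
    """
    cmd = ["clustalo"]
    if sequences:
        cmd += ["-i", sequences]             # unaligned input sequences
    if profile1:
        cmd.append("--profile1=" + profile1)  # existing alignment
    if profile2:
        cmd.append("--profile2=" + profile2)  # second existing alignment
    if hmm:
        cmd.append("--hmm-in=" + hmm)         # HMM for external profile alignment
    cmd += ["-o", output]
    return cmd


# Default mode: align a file of sequences.
default_mode = clustalo_command("out.fa", sequences="seqs.fa")

# Align two existing alignments to each other.
profile_profile = clustalo_command("out.fa", profile1="aln1.fa",
                                   profile2="aln2.fa")

# Align new sequences to an existing alignment.
seqs_to_profile = clustalo_command("out.fa", sequences="new.fa",
                                   profile1="aln1.fa")

# External profile alignment (EPA): guide the alignment with an HMM.
epa = clustalo_command("out.fa", sequences="seqs.fa", hmm="family.hmm")
```

Each list can be passed to subprocess.run() once clustalo is installed
and the input files exist.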

Clustal-Omega uses HMMs for the alignment engine, based on the HHalign
package from Johannes Soeding [1]. Guide trees are made using an
enhanced version of mBed [2] which can cluster very large numbers of
sequences in O(N*log(N)) time. Multiple alignment then proceeds by
aligning larger and larger alignments using HHalign, following the
clustering given by the guide tree.
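
To make "following the clustering given by the guide tree" concrete,
here is a toy sketch that lists the order in which groups of sequences
would be merged. It only illustrates the traversal; it is not
Clustal-Omega's code and performs no alignment. The nested-tuple tree
representation and the function name are assumptions made for the
example.

```python
def merge_order(tree):
    """Return the merges implied by a binary guide tree, in post-order.

    A leaf is a sequence name (str); an internal node is a pair
    (left, right). Each merge is reported as a pair of tuples holding
    the sequence names in the two groups being aligned to each other.
    """
    merges = []

    def visit(node):
        if isinstance(node, str):
            return (node,)          # a single sequence
        left, right = node
        left_members = visit(left)
        right_members = visit(right)
        # Both children are complete, so their alignments can be merged.
        merges.append((left_members, right_members))
        return left_members + right_members

    visit(tree)
    return merges


# Hypothetical guide tree over five sequences.
guide_tree = (("A", "B"), ("C", ("D", "E")))
steps = merge_order(guide_tree)
for left, right in steps:
    print("align", "+".join(left), "with", "+".join(right))
```

The groups grow with each step, so later merges align larger
alignments, as described above.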

In its current form, Clustal-Omega can align only protein sequences,
not DNA/RNA sequences. Support for DNA/RNA is envisioned for a future
version.
-- 


gpg key id: 4096R/326D8438  (pgp.mit.edu)
Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438

