
Re: blast+ packaging



Tim Booth <avarus@fastmail.fm> writes:

> 1) This may have been down to our rubbish wireless link but there
> appeared to be something stopping automated downloads (ie. Uscan) from
> NCBI.  I know they do have anti-leeching on some of their sites.  Do you
> get any problems?

This shouldn't be a problem.  How does your watch file read, and what
errors do you get?

> 2) The current blast2 package has a version that tallies with the blast+
> version - ie. 1:2.2.24.20100808-2, yet the blast2 package doesn't
> contain Blast+ and I can't see where this version is coming from in
> blast2 given that it is built from the NCBI C toolkit which is versioned
> by date.  I started to look into this and ran out of time - any idea?

BLAST+ comes from NCBI's C++ Toolkit, but the version numbers line up
because they share the same underlying engine (written in C without any
dependencies on code specific to either Toolkit).
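
For illustration, here is a minimal Python sketch of how the version
string quoted above decomposes under the usual Debian conventions
(epoch before the first colon, revision after the last hyphen).  The
function name is hypothetical, and reading the trailing numeric
component as a date stamp is my assumption rather than anything the
packaging guarantees.

```python
def split_debian_version(version):
    """Split a Debian version string into (epoch, upstream, revision)."""
    # The epoch is everything before the first colon; it defaults to 0.
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    # The Debian revision is everything after the last hyphen, if any.
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


epoch, upstream, revision = split_debian_version("1:2.2.24.20100808-2")
# Assumption: the last dotted component is a snapshot date, and the
# part before it is the shared BLAST engine release.
engine_release, _, snapshot = upstream.rpartition(".")
print(epoch, engine_release, snapshot, revision)
```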

> 3) There is a handy script called legacy_blast.pl that emulates blastall
> and thus allows BLAST+ to be used with tools like T-Coffee.  I can't
> remember if this is in the upstream tarball or not, but if so it might
> be worth using the alternatives system to allow BLAST+ to fill in for
> legacy BLAST.

It is present in the upstream tarball, and that would be a reasonable
use of the alternatives system, which I'd be happy to accommodate on the
C Toolkit (blast2) side.  Another option would be to ship the symlinks
to legacy_blast.pl in a separate package that would provide, conflict
with, and replace blast2.
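
As a rough sketch of the alternatives route, the Python snippet below
only assembles the update-alternatives --install command line that a
maintainer script might run.  The helper name, the install path for
legacy_blast.pl, and the priority are all hypothetical, picked purely
for illustration; nothing here reflects an actual package layout.

```python
def alternatives_install_cmd(link, name, path, priority):
    """Build an update-alternatives --install command as an argv list."""
    return ["update-alternatives", "--install",
            link, name, path, str(priority)]


# Hypothetical link, wrapper path, and priority (assumptions only).
cmd = alternatives_install_cmd(
    link="/usr/bin/blastall",
    name="blastall",
    path="/usr/lib/blast+/legacy_blast.pl",
    priority=50,
)
print(" ".join(cmd))
```

Building the command as a list rather than a string keeps it safe to
hand to subprocess.run without shell quoting, should anyone want to
execute it.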

> 4) The BLAST+ binaries, if downloaded pre-compiled from NCBI, come in at
> a whopping great size compared to the source code.  I was meaning to
> look into what was going on (muchos static linking??) but never got
> around to it.

The precompiled binaries are statically linked C++ code, hence huge. ;-)
The C++ Toolkit's build system moreover defaults to producing extra-huge
debugging-oriented executables, but that doesn't affect the distributed
binaries, which are built with different options.

> 5) There should be a default $BLASTDB directory, I think.  Can't
> remember what the Debian policy is on apps that need a certain
> environment set before they will run but I'm sure the basic idea is to
> try and set defaults so the app will run out-of-the-box.

ncbi-data (on which you'll probably want to depend anyway) ships an
/etc/ncbi/.ncbirc to which I could trivially add a [BLAST] BLASTDB
setting.
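
To make the shape of that setting concrete, here is a small Python
sketch that reads a [BLAST] BLASTDB entry from INI-style text with the
standard configparser module.  The sample directory /srv/blastdb and
the helper name are hypothetical, and this only mimics the file
layout; it is not how the Toolkit itself parses .ncbirc.

```python
import configparser

# Hypothetical file contents; the directory is an assumption.
SAMPLE_NCBIRC = """\
[BLAST]
BLASTDB=/srv/blastdb
"""


def blastdb_from_ncbirc(text):
    """Return the [BLAST] BLASTDB value from INI-style text, or None."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # fallback=None covers both a missing section and a missing key.
    return parser.get("BLAST", "BLASTDB", fallback=None)


print(blastdb_from_ncbirc(SAMPLE_NCBIRC))
```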

> Anyway, I gather BLAST+ should be less of a beast to package than the
> original, so have fun.

It's differently beastly: the build system is far less idiosyncratic,
but the tree is much bigger, and the full C++ Toolkit features other
major applications (Genome Workbench and Cn3D++) that not only have
independent release cycles but also come from different upstream
branches.  None of that's insurmountable; I just haven't had time to
work on packaging any NCBI C++ code.

> I'm not very good at reading the list so if you are able to CC me on
> any messages that would be appreciated.

Likewise, as Andreas noted.  Thanks!

-- 
Aaron M. Ucko, KB1CJC (amu at alum.mit.edu, ucko at debian.org)
http://www.mit.edu/~amu/ | http://stuff.mit.edu/cgi/finger/?amu@monk.mit.edu

