Re: blast+ packaging
Tim Booth <avarus@fastmail.fm> writes:
> 1) This may have been down to our rubbish wireless link but there
> appeared to be something stopping automated downloads (ie. Uscan) from
> NCBI. I know they do have anti-leeching on some of their sites. Do you
> get any problems?
This shouldn't be a problem. How does your watch file read, and what
errors do you get?
> 2) The current blast2 package has a version that tallies with the blast+
> version - ie. 1:2.2.24.20100808-2, yet the blast2 package doesn't
> contain Blast+ and I can't see where this version is coming from in
> blast2 given that it is built from the NCBI C toolkit which is versioned
> by date. I started to look into this and ran out of time - any idea?
BLAST+ comes from NCBI's C++ Toolkit, but the version numbers line up
because they share the same underlying engine (written in C without any
dependencies on code specific to either Toolkit).
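For reference, here is a minimal sketch of how a version string such as
1:2.2.24.20100808-2 breaks down under the usual Debian layout
[epoch:]upstream[-revision]. The function names are hypothetical, and
reading the trailing upstream component as a date stamp is an assumption
based on the description above, not something the packaging guarantees.

```python
def split_debian_version(version):
    """Split a Debian version into (epoch, upstream, revision)."""
    # The epoch, if present, is everything before the first colon.
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    # The Debian revision, if present, follows the last hyphen.
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def split_engine_and_date(upstream):
    """Assumption: the last dotted component is a date stamp."""
    engine, _, date = upstream.rpartition(".")
    return engine, date


epoch, upstream, revision = split_debian_version("1:2.2.24.20100808-2")
engine, date = split_engine_and_date(upstream)
print(epoch, engine, date, revision)
```

Under that assumption, the 2.2.24 part is the engine version that matches
BLAST+, and the remainder identifies the C Toolkit snapshot.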
> 3) There is a handy script called legacy_blast.pl that emulates blastall
> and thus allows BLAST+ to be used with tools like T-Coffee. I can't
> remember if this is in the upstream tarball or not, but if so it might
> be worth using the alternatives system to allow BLAST+ to fill in for
> legacy BLAST.
It is present in the upstream tarball, and that would be a reasonable
use of the alternatives system, which I'd be happy to accommodate from
the C side. Another option would be to ship the symlinks to
legacy_blast.pl in a separate package that would provide, conflict with,
and replace blast2.
> 4) The BLAST+ binaries, if downloaded pre-compiled from NCBI, come in at
> a whopping great size compared to the source code. I was meaning to
> look into what was going on (muchos static linking??) but never got
> around to it.
The precompiled binaries are statically linked C++ code, hence huge. ;-)
The C++ Toolkit's build system moreover defaults to producing extra-huge
debugging-oriented executables, but that doesn't affect the distributed
binaries, which are built with different options.
> 5) There should be a default $BLASTDB directory, I think. Can't
> remember what the Debian policy is on apps that need a certain
> environment set before they will run but I'm sure the basic idea is to
> try and set defaults so the app will run out-of-the-box.
ncbi-data (on which you'll probably want to depend anyway) ships an
/etc/ncbi/.ncbirc to which I could trivially add a [BLAST] BLASTDB
setting.
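As a rough illustration of what such a setting would look like to a
consumer, the sketch below reads a [BLAST] BLASTDB entry from INI-style
text with Python's configparser. The database path and the function name
are hypothetical placeholders, and treating .ncbirc as plain INI is an
assumption; this is not how the Toolkit itself parses the file.

```python
import configparser

# Hypothetical file contents; the directory is a placeholder, not a
# path that ncbi-data is known to ship.
SAMPLE_NCBIRC = """\
[BLAST]
BLASTDB = /var/lib/blast/db
"""


def read_blastdb(text):
    """Return the [BLAST] BLASTDB value from INI-style text, or None."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # Option names are matched case-insensitively by default; section
    # names are case-sensitive.
    if parser.has_option("BLAST", "BLASTDB"):
        return parser.get("BLAST", "BLASTDB")
    return None


print(read_blastdb(SAMPLE_NCBIRC))
```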
> Anyway, I gather BLAST+ should be less of a beast to package than the
> original, so have fun.
It's differently beastly: the build system is far less idiosyncratic,
but the tree is much bigger, and the full C++ Toolkit features other
major applications (Genome Workbench and Cn3D++) that not only have
independent release cycles but also come from different upstream
branches. None of that's insurmountable; I just haven't had time to
work on packaging any NCBI C++ code.
> I'm not very good at reading the list so if you are able to CC me on
> any messages that would be appreciated.
Likewise, as Andreas noted. Thanks!
--
Aaron M. Ucko, KB1CJC (amu at alum.mit.edu, ucko at debian.org)
http://www.mit.edu/~amu/ | http://stuff.mit.edu/cgi/finger/?amu@monk.mit.edu