Re: SPAdes for Debian
- To: Anton Korobeynikov <firstname.lastname@example.org>
- Subject: Re: SPAdes for Debian
- From: Andreas Tille <email@example.com>
- Date: Wed, 2 Apr 2014 11:36:55 +0200
- Message-id: <20140402093655.GI13234@an3as.eu>
- In-reply-to: <CA+Ov9+mtnsBPmbZqWsbqJb2N8pKtpt7-e8V2UCh9LCB46yevdA@mail.gmail.com>
- References: <20140217135301.GA10714@an3as.eu> <CAE92sXsS_9SiJxnaGuj99ELKd-44j463rFOB72T0Eg-LKv8=LA@mail.gmail.com> <CA+Ov9+mjTRG-y1fDDo37ABjKTeS5VfPZE6m1oky8ipxq6CYTng@mail.gmail.com> <20140217162019.GF10714@an3as.eu> <CA+Ov9+m3NjDyaQzKLdZMcOJGnx3tw6pfeqKBwZR3f_8trgxmwQ@mail.gmail.com> <20140217212204.GC27707@an3as.eu> <CA+Ov9+mtnsBPmbZqWsbqJb2N8pKtpt7-e8V2UCh9LCB46yevdA@mail.gmail.com>
Many thanks for coming back to me about this.
On Wed, Apr 02, 2014 at 01:10:00PM +0400, Anton Korobeynikov wrote:
> Hi Andreas
> Right now we're starting to plan the SPAdes 3.1 release, so it may be a
> good time to make some changes.
> What we'll certainly do - move the binaries besides spades.py into
> /usr/share/spades/bin subdir.
I guess you mean /usr/lib/spades/bin (since architecture-dependent
binaries belong in /usr/lib rather than /usr/share according to the FHS).
> Any other suggestions?
In Debian we increasingly run a test suite during the package build.
In addition, a so-called autopkgtest feature periodically (I think at
least once a month) installs a package and its dependencies on a clean
machine and runs a defined test suite. This ensures that the package
plays nicely with all needed components.
It would be really great if you could provide such a suite. It could
be a simple script that processes the test data you are providing anyway
and adds a simple comparison mechanism to verify the result. When I
worked on the 3.0.0 package (which unfortunately is not finished yet,
precisely because I stumbled upon some tests and wanted to do more
comparisons) I realised that some output files contain dates and time
stamps, so comparing them verbatim makes no sense, while other files
need to match. I guess it would on the one hand be easy for you
to define a valid test result and it would also be quite helpful for
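
A minimal sketch of such a comparison mechanism, in Python since the
pipeline itself is driven by spades.py. Everything here is an assumption
for illustration: the function names, the timestamp patterns and the idea
of listing "volatile" files by name are hypothetical and not part of
SPAdes. Files named as volatile are compared after masking dates and
times; all other files must match byte for byte.

```python
import re
from pathlib import Path

# Assumed patterns for volatile content; the real output may use other formats.
VOLATILE_PATTERNS = [
    re.compile(r"\d{4}-\d{2}-\d{2}"),            # dates such as 2014-04-02
    re.compile(r"\d{1,2}:\d{2}:\d{2}(\.\d+)?"),  # times such as 11:36:55
]


def normalise(text):
    """Mask dates and times so that they never cause a mismatch."""
    for pattern in VOLATILE_PATTERNS:
        text = pattern.sub("<TIMESTAMP>", text)
    return text


def compare_dirs(expected_dir, actual_dir, volatile_names=frozenset()):
    """Return the relative paths of reference files that are missing or differ.

    Files whose name is in volatile_names are compared after masking
    timestamps; every other file has to be byte-identical.
    """
    expected_dir, actual_dir = Path(expected_dir), Path(actual_dir)
    mismatches = []
    for ref in sorted(p for p in expected_dir.rglob("*") if p.is_file()):
        rel = ref.relative_to(expected_dir)
        out = actual_dir / rel
        if not out.is_file():
            mismatches.append(rel.as_posix())
            continue
        if ref.name in volatile_names:
            same = normalise(ref.read_text()) == normalise(out.read_text())
        else:
            same = ref.read_bytes() == out.read_bytes()
        if not same:
            mismatches.append(rel.as_posix())
    return mismatches
```

A test script could call compare_dirs("expected", "output", {"spades.log"})
and fail if the returned list is not empty. Which files count as volatile
is exactly the piece of knowledge that upstream would have to supply.
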
> PS: SPAdes 3.1 will include bamtools as a dependency / part.
In Debian we are maintaining libbam-dev (from samtools, currently at
version 0.1.19-1). We would like to be able to link against this
library due to the mentioned principle of modularised packaging *and*
testing. I wonder what you think about this.
BTW, I think it would be very important to discuss this not only with me
in person but rather on the Debian Med mailing list
firstname.lastname@example.org . The rationale behind this is that while I
have gathered quite some packaging skills I'm not a biologist and thus
not an end user of the packages. The end users reading the list might
have a different opinion about things, which would be great to hear (and
to respect). Finally, you are missing out on some "advertising" for SPAdes
in a forum of potential users if you are just writing to me.
So if you give me permission I will bounce this mail to the list,
and might do so in the future as well in case you forget (you do
not necessarily need to be subscribed, but please ask to be CCed
to receive direct answers to your mail).
> On Tue, Feb 18, 2014 at 1:22 AM, Andreas Tille <email@example.com> wrote:
> > Hi Anton,
> > On Tue, Feb 18, 2014 at 01:00:20AM +0400, Anton Korobeynikov wrote:
> >> > Well, in Debian the corresponding modules are selected automatically
> >> > depending on the Python version. So if you run python3 the Python
> >> > modules path is automatically set to the Python 3 modules (which are in
> >> > the packages python3-yaml and python3-joblib). If you confirm that
> >> > SPAdes works "better" with Python 3 I will adapt the according script in
> >> > /usr/bin/spades to force python3 as interpreter. (The patch will remain
> >> > the same.)
> >> For SPAdes it does not matter whether it runs via python 2.x or 3.x.
> >> So, you have checked and spades.py works after your changes via both
> >> "python spades.py" and "python3 spades.py" ?
> > Currently the Debian package uses a helper called dh_python2 which
> > installs the modules (like support) only into Python 2 module space.
> > If there were a good reason to move them also / instead into Python 3
> > module space it would work there as well (untested but I'm pretty sure
> > about this - this is how Python 2 and 3 coexist nicely on Debian
> > systems). If you see any need for this I'm fine with testing it.
> >> > For sure I do not intend to create an incompatibility with any
> >> > non-Debian instances of SPAdes and it is definitely not in my interest
> >> > to create packages you as developers do not like. However, specifically
> >> > in the case of spades.py I feel confused as a user since this seems to
> >> > be basically a wrapper around "spades.c" code.
> >> Well, this is not a lightweight wrapper.
> > Sure. I admit I was provoking a bit. But at some higher level it
> > remains a wrapper for some programs written in C.
> >> Actually, spades.py
> >> implements the whole pipeline - it prepares the configuration files,
> >> creates the directories for intermediate files, runs the tools within
> >> the pipeline, gracefully handles the errors (so even if some of the
> >> tools inside the pipeline crash we can still make sure that the error
> >> is properly reported in the log file with an appropriate stack trace) and
> >> much more. It's actually a part of SPAdes - the glue layer
> >> between various tools. Check "src/spades_pipeline/*" for the sources
> >> of this "wrapper" script :)
> > Yes, I understand this perfectly. This is actually the reason why we
> > are using SPAdes ... because it is a full pipeline.
> >> > Could you please
> >> > elaborate on the reasons why you are "advertising" the script
> >> > language in a way which may need to be reverted if at some point in
> >> > time you consider rewriting the wrapper in Ruby, Haskell or whatever
> >> > cool language (not that I personally would consider this sensible, just
> >> > finding some examples).
> >> The reasons (by now) are mostly historical. However, we cannot get rid
> >> of ".py" extension now, because it would be incompatible change. This
> >> might be a good change for SPAdes 4.0 though, but I have no idea when
> >> this may happen, maybe within a year or two.
> > Well, I do not mind about the point in time. I just wanted to make you
> > aware of this issue and from my point of view it is totally sensible
> > to not fiddle around with names right after a release. However, I read
> > your statement a bit in a way that you somehow share my (actually not mine
> > but the Debian policy editors') point of not using the *.py extension.
> >> > And yes, I did not consider the developer's way of running SPAdes. The
> >> > Debian package (for the moment) supports only the users who do not
> >> > intend to rebuild from code. Do you think it would make sense to
> >> > support a development library?
> >> No. The development version is intended for SPAdes developers only.
> > OK. Thanks for sorting this out.
> >> > choosing is that we will be able to force python3 via this wrapper if
> >> > you would prefer this. Can you tell me any reason for a user who simply
> >> > wants to do genome assemblies to pick from a set of Python interpreters?
> >> We used to support many python versions simply because there is still
> >> a bunch of users who have 2.4 installed (e.g. via old RedHat /
> >> CentOS installations). And since this may be centralized server
> >> installation, they would be unable to upgrade. So, we had to tune at
> >> our side. Same for 3.x
> > OK. Understand this as well. When creating Debian packages you are
> > somehow looking from the other end: You know perfectly what is
> > installed on the user's machine - even better you can *control* via
> > dependencies what is installed. So for instance I could force the use
> > of Python 3 (with a specific version) if there were any need (even
> > if I understood that practically there is no such need).
> >> > Does this mean there is a test suite included that I just did not
> >> > detect? The point is that it does not need to be user friendly because
> >> > we try to run it on behalf of our users.
> >> There is a testsuite, but it's not included into the release tarballs.
> >> The problem is that in order to carefully test various assembler
> >> features one usually needs quite big input files, etc. You can surely
> >> download the reads from our website (e.g.
> >> http://spades.bioinf.spbau.ru/spades_test_datasets/ecoli_sc/) and try
> >> to assemble to make sure you obtain the same results as mentioned in
> >> the comparison table at http://bioinf.spbau.ru/spades/
> > OK. I will do this in any case and will see how far I can turn
> > this into a test for the Debian package. I'd be really in favour of
> > combining any scientific package with such a kind of test suite.
> >> >> ext/tools/bwa - GPLv3. Slightly modified to make sure it compiles with
> >> >> modern GCC's.
> >> >
> >> > My plan is to replace this by the Debian packaged version:
> >> >
> >> > $ apt-cache show bwa | grep ^Version
> >> > Version: 0.7.5a-2
> >> >
> >> > which definitely compiles with latest gcc (and we actually provided
> >> > patches to upstream in the past to let this happen). Do you think
> >> > it is a bad idea to rather use bwa 0.7.5a than the code copy you are
> >> > providing inside SPAdes?
> >> I believe it should be fine. Now that we're renaming the binary into
> >> "bwa-spades" in order not to clash with the system one. You may want
> >> to patch this out :)
> > Yes. This was the plan. Thanks for confirming in advance.
> > It's fun to work together with responsive upstream developers like you.
> > Thanks for this
> > Andreas.
> > --
> > http://fam-tille.de
> With best regards, Anton Korobeynikov
> Faculty of Mathematics and Mechanics, Saint Petersburg State University