
Re: /usr/bin/picard Re: bcbio will need another while - needs gatk



Hi Andreas,

On 14.11.20 09:43, Andreas Tille wrote:
> On Fri, Nov 13, 2020 at 10:25:39PM +0100, Steffen Möller wrote:
>> I installed a binary distribution of gatk and then ran a bit into
>> trouble over
>>
>> $ apt-file search /usr/bin/picard
>> picard: /usr/bin/picard
>> picard-tools: /usr/bin/picard-tools
>>
>> I tend to think that this exposes a weakness of Debian - it misses
>> namespaces. Others may say that it is a strength. Don't think so. You
>> want to use the same scripts that your scientific peer uses. Everything
>> else goes under "the distribution has patches, the work may not be
>> completely reproducible" - because a log file has a different hash value
>> since executions of /usr/bin/picard were changed to /usr/bin/picard-tools.
> We have the workaround
>
>      /usr/lib/debian-med/bin/
>
> see for instance in eigensoft[1] and lots of other packages.  Simply
> provide this for picard-tools and make sure gatk users set the PATH
> accordingly.

Ah - that is good. Seems like I should read our policy document again.

I just checked what my installation has in this directory ... and it
seems like I spotted a typo (or a creative fix of a typo elsewhere)

$ ls -l /usr/lib/debian-med/bin/
total 0
lrwxrwxrwx 1 root root 26 Nov 12 16:57 tranlate -> ../../../bin/fsa-translate

$ apt-file search tranlate
fsa: /usr/lib/debian-med/bin/tranlate
$ apt-file search /usr/bin/translate
drslib: /usr/bin/translate_cmip3
openafs-client: /usr/bin/translate_et
translate: /usr/bin/translate

I presume you would be happy for me to add a link there named cnvkit.py
(as bcbio expects it) that points to /usr/bin/cnvkit, right?
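
To make sure I understand the mechanism, here is a minimal sketch of the
link-plus-PATH idea. It runs in a scratch directory rather than the real
/usr tree, and the "cnvkit" script in it is only a stand-in that I made up
for illustration, not the packaged executable.

```shell
#!/bin/sh
# Sketch only: a scratch tree stands in for the real /usr paths.
set -e
scratch=$(mktemp -d)
mkdir -p "$scratch/usr/bin" "$scratch/usr/lib/debian-med/bin"

# Stand-in for the packaged executable (hypothetical content).
cat > "$scratch/usr/bin/cnvkit" <<'EOF'
#!/bin/sh
echo "cnvkit stand-in called as $(basename "$0")"
EOF
chmod +x "$scratch/usr/bin/cnvkit"

# Relative link, in the same style as the existing tranlate link.
ln -s ../../../bin/cnvkit "$scratch/usr/lib/debian-med/bin/cnvkit.py"

# Users opt in by putting the directory in front of their PATH.
PATH="$scratch/usr/lib/debian-med/bin:$PATH"
resolved=$(command -v cnvkit.py)
output=$(cnvkit.py)
echo "$resolved"
echo "$output"
```

In the package itself the link would presumably be shipped by the
packaging rather than created by hand, and only users who need the
upstream name would have to extend their PATH.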


>
>> Should we have someone among us for who packaging go-based software
>> feels like easy then the newly surfaced https://github.com/brentp/gsort
>> would be nice.
> Nilesh did so several times.
Wow! @Nilesh, please have a look.
>> And - qualimap. The jar files are still missing that we need to build that.
> Simply ping the authors about this.  As far as I remember the last
> status was that sources are lost (?????) and a backup needs to be found.

And what if we just go for a non-free binary package for those jars? Just to
get somewhere?

bcbio is in contrib anyway because of vienna-rna. And I do not think
these .jar files will find much future adoption without source backing,
so this problem will resolve itself.

There is also this error:
bcbio.pipeline.config_utils.CmdNotFound: '_get_program_cmd' 'snpEff'
{'dir': '/usr/local/share/java/snpeff', 'jvm_opts': ['-Xms750m',
'-Xmx3g']} None

>> Also need to look at how picard is invoked since somehow this does not
>> seem compatible - or it is just because of version differences.
> See above for picard-tools.  You might like to try this first.

I needed to educate myself a bit on picard. There is also a new upstream
version, which kind of fits. As it is now, I get the error message
"'picard' is not a valid command. See PicardCommandLine -h for more
information"
which comes from picard itself, even though bcbio is executing exactly
that command. Might this be another twist that conda has introduced?
I have just edited the wrapper and installed the link, but now run into
a mix-up with the Java options that bcbio passes along:
E                   '-Xms750m' is not a valid command. See
PicardCommandLine -h for more information.

as in
picard -Xms750m -Xmx2000m -XX:+UseSerialGC
MergeV...cbio/tests/data/variants/S1-variants-snp.vcf.gz
I=bcbio/tests/data/variants/S1-variants-indel.vcf.gz

Update: I fixed that for the new upstream version 2.23.8. It now passes
bcbio's test, but I would not mind an extra pair of eyes on what I did to
debian/bin/PicardCommandLine prior to an upload.
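
For anyone who wants to review it, the general idea can be sketched as
follows. This is not the actual content of debian/bin/PicardCommandLine,
only an illustration of splitting leading JVM options from the tool
arguments. The jar location, the function name and "SomeTool" are
assumptions of mine, and the sketch prints the command instead of
executing it so that it runs without Java installed.

```shell
#!/bin/sh
# Assumed jar location, for illustration only.
PICARD_JAR=/usr/share/java/picard.jar

# Hypothetical helper: collect leading JVM options (-X..., -XX:..., -D...)
# and hand everything after them to picard unchanged.
build_picard_cmd() {
    jvm_opts=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -X*|-D*) jvm_opts="$jvm_opts $1"; shift ;;
            *) break ;;
        esac
    done
    # Print rather than exec; arguments containing spaces are not
    # preserved by this simple sketch.
    echo "java$jvm_opts -jar $PICARD_JAR $*"
}

build_picard_cmd -Xms750m -Xmx2000m -XX:+UseSerialGC SomeTool I=a.vcf.gz
```

With the options separated like this, '-Xms750m' reaches the JVM and is
no longer taken for a picard command.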


Should anyone feel like it - the different tests of bcbio are listed in tests/pytest.ini:
$ cat tests/pytest.ini |head -n 10
# content of pytest.ini
[pytest]
markers =
    cancer: cancer variant calling pipeline
    cancermulti: cancer variant calling pipeline with multiple callers
    cancerpanel: cancer variant calling pipeline on panels
    cancerprecall: cancer pipeline with pre-called variants
    combo: not sure what this test is for
    devel: test for unsupported features still in development
    ensemble: test variant calling with multiple callers

$ wc -l tests/pytest.ini
53 tests/pytest.ini

bcbio downloads extra data for testing, so this all needs some extra
work, still.
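
To get an overview of which workflows could be tried one at a time, the
marker names can be pulled out of the ini file. A small sketch, which
recreates only a few of the lines quoted above in a temporary file (the
real file has 53 lines) and assumes every marker is an indented
"name: description" entry:

```shell
#!/bin/sh
# Recreate an excerpt of tests/pytest.ini in a temporary file.
ini=$(mktemp)
cat > "$ini" <<'EOF'
# content of pytest.ini
[pytest]
markers =
    cancer: cancer variant calling pipeline
    cancermulti: cancer variant calling pipeline with multiple callers
    ensemble: test variant calling with multiple callers
EOF

# Print only the marker names from the indented "name: description" lines.
markers=$(sed -n 's/^[[:space:]]\{1,\}\([A-Za-z0-9_]\{1,\}\):.*/\1/p' "$ini")
echo "$markers"
```

A single group could then be selected with pytest's -m option, for
example "pytest -m cancer", assuming the extra test data is already in
place.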

Somehow Debian barely completes any of these tests. There is always
something missing, even if it is only tophat, which upstream had asked
us not to support any more.

It is a bit discouraging that we are not covering more of bcbio yet.
But then again, it is a real-world usability test for us, too. And we do
not need to cover _everything_ of bcbio to make some noise about it and
allow it to proceed to testing. Just one well-defined workflow would
suffice.

Best,

Steffen


