Re: CuteSV (Was: PyEnsembl - how does that help us?)

To: debian-med@lists.debian.org
Subject: Re: CuteSV (Was: PyEnsembl - how does that help us?)
From: Steffen Möller <steffen_moeller@gmx.de>
Date: Sun, 23 May 2021 18:16:42 +0200
Message-id: <94a48ed2-b557-48c2-4517-00903a5dd4d1@gmx.de>
In-reply-to: <4525a2b1-611d-0877-9863-2318720827dd@gmx.de>
References: <58d83e85-81e1-f6a2-da0a-b7eaff743d9e@gmx.de> <CAJN1928EZRKmW-EOfaE8ejgLf-Cv11ux4mx_asZgYqkPX7pZMA@mail.gmail.com> <4b46c10b-d952-f37f-b113-25321d6efc06@gmx.de> <25d02178-979e-da62-2a74-9d6e40feaa5d@nileshpatra.info> <4382f7b3-9ce1-924b-0d7e-901a5feeb9c0@gmx.de> <d523d0d0-a272-5c09-e71b-0dd1bed243ea@nileshpatra.info> <db540cba-c2a8-8d45-fd48-9e321ded5945@gmx.de> <af1d274a-a0db-6e58-8ca8-c20b55d3e5cc@nileshpatra.info> <243e5b57-1c3e-8432-46bf-1a34381b3370@gmx.de> <20210522071046.GC8962@an3as.eu> <20210522212431.GF8962@an3as.eu> <fc9a8fa8-dc98-f422-210d-6dc2e1a93d75@nileshpatra.info> <736ba102-768c-11e0-e459-88636722b418@gmx.de> <4525a2b1-611d-0877-9863-2318720827dd@gmx.de>

https://salsa.debian.org/med-team/catfishq
is ready for review+sponsoring.
Many thanks!
Steffen

On 23.05.21 at 16:18, Steffen Möller wrote:

On 23.05.21 at 14:26, Steffen Möller wrote:

On 23.05.21 at 00:02, Nilesh Patra wrote:

On 5/23/21 2:54 AM, Andreas Tille wrote:

On Sat, May 22, 2021 at 09:10:46AM +0200, Andreas Tille wrote:

On Fri, May 21, 2021 at 09:26:48PM +0200, Steffen Möller wrote:

If someone needs a stimulus to package something - cuteSV
(https://github.com/tjiangHIT/cuteSV), please.

I gave it a kickstart while sitting in the train (which will be
offline soon).  Everybody can feel free to add their own ID to Uploaders
and finalise.  There is no build-time test running now and no
autopkgtest.  Data to test / benchmark are included - so this
should be feasible.

I just packaged the prerequisite python3-cigar and uploaded it to NEW.

I wrote a sample autopkgtest for cigar (basically reusing the example from the README)
and made a few minor changes.

I have no idea about autopkgtests for cutesv - I lack the prerequisites here, and probably only Steffen can help.

PS: Please check and upload vbz-compression whenever you have time (after two days, as you wrote, would be fine anyway).
I'll be inactive/away for a couple of days (wish to take a break :-))

Thank you both, you are amazing!

CuteSV is part of the
https://github.com/nanoporetech/pipeline-structural-variation that I
plan to run when the first Nanopore reads surface in my inbox next week.
Running it means comparing against a reference genome, which we do not
have in Debian, so, yes, we should think of some tests, but we should
also find a way to perform such tests for other packages.

This kind of leads to a follow-up question - we could have a "test
package" that offers a fraction of the human genome, like the Y
chromosome and a second one - chromosome 22 maybe. That would not be too
big and we can test with it. It would also be a bit meaningless, though.
And for testing we do not need anything to be human (or real) in the
first place. We could generate our own mini-genome or instead (which I
would prefer) go for something small that is real, like yeast (for
eukaryotes) and E. coli (for bacteria); we ignore archaea. And then
there is https://www.ncbi.nlm.nih.gov/nuccore/CP014940 , i.e. the data
from C. Venter's
https://www.jcvi.org/research/first-minimal-synthetic-bacterial-cell,
which may be interesting to distribute with an Open Source
distribution.
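
To make the "generate our own mini-genome" option concrete, here is a minimal sketch in Python (standard library only). The contig names, sizes and the seed are made up for illustration and do not come from any existing package; a fixed seed keeps the generated FASTA reproducible across test runs.

```python
import random


def make_mini_genome(contigs, seed=42):
    """Return {name: sequence} of reproducible pseudo-random DNA."""
    rng = random.Random(seed)  # fixed seed -> identical genome on every run
    return {
        name: "".join(rng.choice("ACGT") for _ in range(length))
        for name, length in contigs.items()
    }


def to_fasta(genome, width=60):
    """Format the sequences as FASTA text, wrapped at `width` columns."""
    lines = []
    for name, seq in genome.items():
        lines.append(">" + name)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Hypothetical contig names and lengths, chosen only for illustration.
genome = make_mini_genome({"testchr1": 5000, "testchr2": 2000})
fasta_text = to_fasta(genome)
```

Such a synthetic genome is fine for smoke tests, but it has no biological structure (repeats, genes), which is one reason a small real genome may still be the better choice.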

While something novel is always being found even for these genomes, whose
genomic DNA has long been known, we do not do much harm by distributing
such genomes. Professional researchers will update them anyway. The
same holds for the human genome, but it is a bit larger and we should
probably gain experience with the smaller genomes first.

I'll let this sink in for a while longer and then likely extend getData
to deal with these genomes and auto-generate native Debian packages from them.

Ok - back to some real work and I'll have a closer look at that pipeline.

I just went through their Snakemake file. To get this running, we need

* catfishq https://github.com/philres/catfishq
* lra (long read aligner) https://github.com/ChaissonLab/LRA
* truvari https://github.com/spiralgenetics/truvari/
* add the scripts to libvcflib1/new package vcflib-scripts

Catfishq looks straightforward, I'll just go and address that. LRA is a meson build with "subprojects" that wrap other bits. Truvari drags in a few Python packages, some of which we do not have yet. I have added that info to the Nanopore tab on https://docs.google.com/spreadsheets/d/1tApLhVqxRZ2VOuMH_aPUgFENQJfbLlB_PFH_Ah_q7hM/edit#gid=1806578173

Best,
Steffen
