Re: Idea wanted: What is the most key open source projects to fight COVID-19?
Hi Jun,
thanks a lot for your input, which is extremely helpful.
On Sat, May 09, 2020 at 10:10:33PM +0200, Jun Aruga wrote:
> Hi Andreas,
>
> The 3 pipelines nf-core/nanoseq, nf-core/artic, nf-core/viralrecon I
> shared are the most applicable (= the highest priority) to COVID-19
> analysis. [1].
> Now I share the additional 5 pipelines that are certainly relevant (=
> the 2nd highest priority). [2]
> This is the result of my interview in the nf-core Slack channel.
Thanks for doing this.
> Could you check the packages situations in Debian?
Done below. Every package without a remark is already in Debian. For
packages that nobody has touched yet I give the upstream URL; for those
with preliminary packaging I give the URL on salsa.debian.org. In some
cases I added comments from previous experience.
> Sorry I knew information about the 5 pipelines a few weeks ago. But I
> took a time to summarize the list of the packages and email it.
That's fine. We found other work to do meanwhile. ;-)
> * nf-core/scrnaseq (single cell)
> * nf-core/smartseq2 (single cell)
> * nf-core/sarek (whole-genome sequencing)
> * nf-core/mag (meta-genomics)
> * nf-core/bcellmagic (immune response)
>
> https://github.com/nf-core/scrnaseq/blob/master/bin/scrape_software_versions.py
> bustools
-> https://bustools.github.io/
> kallisto
> multiqc
> salmon
> star
In Debian named rna-star
> https://github.com/nf-core/smartseq2/blob/master/bin/scrape_software_versions.py
> fastqc
> multiqc
>
> https://github.com/nf-core/sarek/blob/master/bin/scrape_software_versions.py
> allelecount
-> https://github.com/cancerit/alleleCount
> ascat
-> https://github.com/Crick-CancerGenomics/ascat
> bcftools
> bwa
> fastqc
> freebayes
> gatk
-> https://salsa.debian.org/med-team/gatk
> htslib
> manta
-> https://salsa.debian.org/med-team/manta
Maybe somebody could volunteer to ping upstream about a Python 3 port here:
https://github.com/Illumina/manta/issues/180
We cannot get it into Debian before this is solved.
> multiqc
> qualimap
-> https://salsa.debian.org/med-team/qualimap
I was once pretty close to packaging this. However, it contains some binary
JARs named bioinfo-commons-0.10.1.jar, bioinfo-ngs-0.1.0.jar, ...
whose source just seems to be "lost". See my discussion on the mailing list:
https://groups.google.com/forum/#!msg/qualimap/KVJ8m5ZsAhU/O30pHTkADAAJ;context-place=searchin/qualimap/org.bioinfo
If anybody has an idea where to find those sources, that would be really
helpful!
> r
> samtools
> snpeff
-> https://salsa.debian.org/med-team/snpeff
> strelka
-> https://github.com/Illumina/strelka/
> tiddit
-> https://github.com/SciLifeLab/TIDDIT
> vcftools
> vep
Please confirm that you mean
https://github.com/Ensembl/ensembl-vep
> https://github.com/nf-core/mag/blob/master/bin/scrape_software_versions.py
> busco
-> https://gitlab.com/ezlab/busco
> cat
-> https://github.com/dutilh/CAT
BTW, it's not a good idea to give a tool the same name as a very generic UNIX command.
> centrifuge
> fastp
> fastqc
> filtlong
> kraken2
> megahit
-> https://ftp-master.debian.org/new/megahit_1.2.9-1.html
(just uploaded to the NEW queue)
> metabat
-> https://bitbucket.org/berkeleylab/metabat
> multiqc
> nanolyse
-> https://pypi.org/project/NanoLyse/1.1.0/ (with version tag)
-> Asked for release tags on https://github.com/wdecoster/nanolyse/issues/6
> nanoplot
-> https://salsa.debian.org/med-team/nanoplot
> porechop
> quast
-> https://salsa.debian.org/med-team/quast
As I wrote in another mail, this one itself has a lot of dependencies
that are bundled as binaries.
> spades
>
> https://github.com/nf-core/bcellmagic/blob/master/bin/scrape_software_versions.py
> changeo
> fastqc
> multiqc
> muscle
> presto
Binary package name is python3-presto
> r
> r-alakazam
Binary package name is r-cran-alakazam
> r-shazam
Binary package name is r-cran-shazam
> r-tigger
Binary package name is r-cran-tigger
> vsearch
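For anyone checking the archive by script: the name differences noted above
(rna-star, python3-presto, r-cran-*) have to be translated first. A minimal
sketch of such a lookup follows. The function and dictionary names are
hypothetical, only the renames mentioned in this mail are listed, and every
other tool is simply assumed to keep its pipeline name, which may not hold in
every case.

```python
# Hypothetical helper: translate a pipeline tool name into the Debian
# binary package name. Only the renames mentioned in this mail are listed.
DEBIAN_NAMES = {
    "star": "rna-star",
    "presto": "python3-presto",
    "r-alakazam": "r-cran-alakazam",
    "r-shazam": "r-cran-shazam",
    "r-tigger": "r-cran-tigger",
}


def debian_package_name(tool: str) -> str:
    """Return the Debian binary package name for a pipeline tool name.

    Assumption: tools without an entry in DEBIAN_NAMES are packaged
    under their lower-cased pipeline name.
    """
    key = tool.strip().lower()
    return DEBIAN_NAMES.get(key, key)


if __name__ == "__main__":
    for tool in ["star", "presto", "r-shazam", "vsearch"]:
        print(tool, "->", debian_package_name(tool))
```

The resulting names can then be fed to whatever archive query one prefers.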
If anybody wants to work on one of the packages, it's probably helpful to
announce it here.
Kind regards and thanks again for your very helpful contribution
Andreas.
> [1] nf-core Slack #covid19 channel
> https://nfcore.slack.com/archives/C0105J0J9T8/p1587480925053900
> > The pipelines you listed are the ones that are/will be most
> > applicable to COVID-19 analysis.
> [2] nf-core Slack #covid19 channel
> https://nfcore.slack.com/archives/C0105J0J9T8/p1587498948095200
> > But the single cell pipelines could certainly be relevant. Also
> > sarek for whole-genome sequencing analysis, mag for metagenomics
> > analysis and bcellmagic for investigations into the immune response.
>
> Thanks & Cheers,
> Jun
>
> --
> Jun | He - His - Him
>
>
--
http://fam-tille.de