
Re: Idea wanted: What is the most key open source projects to fight COVID-19?




On 11.05.20 22:23, Andreas Tille wrote:
Hi Jun,

On Mon, May 11, 2020 at 06:33:15PM +0200, Jun Aruga wrote:
Many thanks for updating the spreadsheet with the Debian and
Bio.tools status!
You are welcome.

I added the following 2 columns. It would be great if we could fill them in when we have time.

* Deb in Debian? arm64
* Deb in Debian? ppc64le
I admit I'm not really motivated to fill these columns manually.  Instead
I've drafted a small script that you can run:

    https://salsa.debian.org/blends-team/med/-/blob/master/covid-19_doc/bio_covid-19_dependencies_query

The current result can be found here

    https://salsa.debian.org/blends-team/med/-/blob/master/covid-19_doc/bio_covid-19_dependencies_result

I was hoping you would come up with something like this :o)

@Jun, there is a database at udd.debian.org describing the packages in
the distribution - including references to Conda and bio.tools.



Because I would like to know the status of arm64 (aarch64) and
ppc64le support for the packages in Debian.
I think UDD gives all answers you are interested in and this answer might
change over time.
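
A rough sketch of how the per-architecture status could be summarised once
(package, architecture) rows have been fetched from UDD. The query text, the
table and column names, and the sample rows are assumptions for illustration
and are not taken from the script linked above. Note that Debian names the
ppc64le architecture "ppc64el".

```python
from collections import defaultdict

# Debian architecture names; Debian calls ppc64le "ppc64el".
TARGET_ARCHES = ("amd64", "arm64", "ppc64el")

# Assumed UDD query, shown only for orientation and not executed here.
# Table and column names should be checked against the current UDD schema.
ASSUMED_QUERY = """
SELECT DISTINCT package, architecture
  FROM packages
 WHERE release = 'sid'
"""


def arch_support(rows, arches=TARGET_ARCHES):
    """Map each package to {architecture: available?}.

    rows is an iterable of (package, architecture) pairs. A package built
    as architecture "all" is counted as available everywhere.
    """
    built = defaultdict(set)
    for package, architecture in rows:
        built[package].add(architecture)
    return {
        package: {arch: (arch in found) or ("all" in found) for arch in arches}
        for package, found in sorted(built.items())
    }


if __name__ == "__main__":
    # Made-up sample input, not an actual query result.
    sample_rows = [("bustools", "amd64"), ("hypothetical-r-pkg", "all")]
    for package, support in arch_support(sample_rows).items():
        missing = [arch for arch, ok in support.items() if not ok]
        print(package, "missing:", ", ".join(missing) or "none")
```

Rerunning such a summary against fresh rows reflects that the answer changes
over time as packages get built on more architectures.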

In my understanding, the nf-core pipelines for COVID-19 analysis are
used mainly in HPC (supercomputing).
And looking at the current market share in HPC [1], there are not only
Intel-based HPC systems but also Power9 (ppc64le) and arm64-based ones.

But currently some of the bio tools support or enable only amd64
(x86_64 Intel) CPUs.
So enabling the pipelines on ppc64le and arm64 helps to maximize
the HPC resources available for COVID-19 analysis.

I explained the motivation to people in the nf-core project, and they are
interested in it.
Currently the nf-core pipeline projects support and provide only
amd64-based Docker containers.
I hope that the query result is helpful.  I also checked bustools as an
example that was pinned to the amd64 architecture.  Since I have not seen
any reason for this, I simply uploaded it for any architecture - let's see
what we get out of this.

BTW, in the list I replaced

    "20 dependencies on R packages for pigx-rnaseq would need to go here"

by the real r-* packages, i.e. those dependencies I've found in the
preliminary pigx-rnaseq packaging.  Do you intend to add pigx-rnaseq to the
list itself?  Since we do not have anything for pigx-scrnaseq I ignored this
for the moment.

I presume this depends a bit on how well the packaging of nextflow is going.

I added it for that technical alternative - remaining testing issues
aside, Debian has it all. And shared tools further help with
interpreting what a workflow is doing, as in "where it is special".

I want to improve the situation by contributing to the nf-core project eventually.
I guess we all have a common interest. :-)

cat
    -> https://github.com/dutilh/CAT
    BTW, it's not a good idea to name a tool like a pretty generic UNIX command
By the way, Steffen,
Just keep in mind: "cat" here is not the regular UNIX command "cat", but
"CAT", as Andreas mentioned.
But it's proof of my remark about the unfortunate choice of name
that Steffen misunderstood the package request.

Just to truly stress the importance of not naming a tool after an existing
UNIX command, I cordially insist on changing this back to the UNIX cat.

Upstream is just concatenating fastq files, and they are doing this
here
(https://github.com/nf-core/viralrecon/blob/d40b060cdaeb69ddeca34c332c5108f59e9c6465/main.nf#L625)
with a regular cat. You will also note that CAT is not listed in
https://github.com/nf-core/viralrecon/blob/dev/environment.yml. And they
give a reference to the UNIX tool in the readme.
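
For illustration, a minimal sketch of what such a plain concatenation amounts
to, written in Python rather than copied from the pipeline. The function name
and the file names in the usage note are hypothetical. Byte-wise concatenation
also works for gzip-compressed fastq files, since a gzip file may consist of
several members that are decompressed back to back.

```python
import shutil


def concatenate(inputs, output):
    """Write the raw bytes of each input file to output, in order.

    This mirrors `cat a b > out`: no parsing and no decompression, the
    files are simply appended one after another.
    """
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
```

A call like concatenate(["run1.fastq.gz", "run2.fastq.gz"], "merged.fastq.gz")
would then yield a file that gzip-aware readers see as one continuous fastq
stream.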

So, while I agree that we should package CAT, it looks like they indeed
meant that good ole four-legged one.

Meows, chirps, yowls, and purrs

Steffen

