
Re: Import of debian med packages metadata to Tools Platform Ecosystem



On 25.07.20 22:49, Hervé Ménager wrote:
> Hi Steffen, Andreas,
>
> - Thanks for the clarification regarding the mirror. I thought there
> was a mirror because I read this in the edam.sh comment:
> "
> This script lives on
>
> https://salsa.debian.org/blends-team/website/commits/master/misc/sql/edam.sh
>
>
This is where Andreas initially committed to.
>
> and a redundant copy is held on
>
> https://github.com/bio-tools/biotoolsConnect.git/DebianMed/edam.sh
> "
This is where Matúš and I fiddled with it and the initial site got
everything back.
>
> - regarding the fact of reusing the output of edam.sh rather than a
> single python script, I wholeheartedly agree. I just focused at first
> on a minimal working workflow, but time permitting, I'd like to clean
> up a bit and have a single python tool
Yip. And at the 2019 Sprint we came up with an almost working web
service to translate between bio.tools and Debian med package names.
This should now depend on bio.tools directly, right?
>
> - regarding EDAM annotations, one source you could possibly use is
> bio.tools, for the packages which are already cross-linked between
> debian and bio.tools.

OK, then I suggest we:

 a) complete the mapping from Debian to bio.tools for the bioinformatics
packages on that Excel sheet (state "n" that I mention below)
 b) add debian/upstream/edam (d/u/edam) files for these packages, maybe
autogenerating them from bio.tools where they do not exist yet? We need
to mention that in d/copyright.
 c) see how this can be fed back to bio.tools.
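
As a rough illustration of step b), here is a minimal Python sketch of how a
draft d/u/edam file might be generated from a bio.tools entry. Everything in
it is an assumption to be checked against real data: the input dictionary
only mimics the shape I would expect from the bio.tools JSON output for one
tool ("topic" and "function"/"operation" lists of uri/term pairs), the output
layout is only my understanding of the d/u/edam YAML, and the function names
and the sample entry are made up for the example.

```python
# Sketch only: the input layout is an assumed shape of a bio.tools JSON
# entry, and the output layout is an assumed debian/upstream/edam shape.
# Verify both against real files before using anything like this.


def terms(entries):
    """Collect the 'term' label from a list of {'uri': ..., 'term': ...} dicts."""
    return [e["term"] for e in entries if e.get("term")]


def edam_from_biotools(entry):
    """Render a draft debian/upstream/edam text from one bio.tools-style entry."""
    lines = ["---"]
    # Record provenance so d/copyright can name bio.tools as the source.
    lines.append(f"# draft generated from bio.tools entry: {entry['biotoolsID']}")
    lines.append("ontology: EDAM")
    lines.append("topic:")
    lines += [f"  - {t}" for t in terms(entry.get("topic", []))]
    # One "summary" scope that gathers the operations of all functions.
    lines.append("scopes:")
    lines.append("  - name: summary")
    lines.append("    function:")
    for func in entry.get("function", []):
        lines += [f"      - {t}" for t in terms(func.get("operation", []))]
    return "\n".join(lines) + "\n"


# Hypothetical sample entry, written by hand and not fetched from bio.tools.
sample = {
    "biotoolsID": "bowtie2",
    "topic": [{"uri": "http://edamontology.org/topic_0102", "term": "Mapping"}],
    "function": [
        {
            "operation": [
                {
                    "uri": "http://edamontology.org/operation_3198",
                    "term": "Read mapping",
                }
            ]
        }
    ],
}

print(edam_from_biotools(sample))
```

Such a draft would only be written for packages that do not already ship a
d/u/edam file, and a human should review it before it is committed.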

>
> Anyway, thanks a lot for your help, I am so very happy this is now up
> and running! We should set up a plan to continue this work!

m) I first looked on bio.tools for "blast", was unhappy with the result,
then "ncbi blast", still unhappy.

n) Ok, then "bowtie", which is what the debian/upstream/edam effort once
started with, and found https://bio.tools/bowtie2 with a reference to the
Debian package on the lower right. It is actually the first time I ever
saw that, but you have likely had that for a while already and I was just
unaware of it, right? https://bio.tools/clustalo also worked. Nice.

o) Infernal is a bit broken on the bio.tools side since it only refers
to one of its executables.

p) hmmer3 (https://bio.tools/hmmer3) does not have a ref to Debian.

q) I then searched for "Debian" on bio.tools and found four entries.
https://bio.tools/DNCON2 has Debian in its description. The other three
have a "DebianMed" collection tag. I was not aware of that.

My personal ambition for bio.tools is that it replaces most of what
the Debian Med task pages are doing at the moment. But that is just what
keeps me going; let's set this aside for now and keep it in the back of
our minds.

Concerning m-q: I happen to have stumbled across five very different
problems/challenges. I think we should proceed with EDAM annotation in
Debian only for packages that have reached state "n", so that we have a
feedback loop. What would be the right thing to do for a Debian package
maintainer or user who encounters a package in state m, o, p or q? This
is an issue on GitHub, I presume, but maybe you could come up with more
explicit instructions for the Debian folks on how and where to report,
to make it all as easy as possible for bio.tools?

Do you have opinions on a-c?

Best,

Steffen


> On Sat, Jul 25, 2020 at 5:15 PM Steffen Möller <steffen_moeller@gmx.de>
> wrote:
>
>     This is great news, thank you for this, Hervé.
>
>     For now I think the most important bit is to have anything that is
>     automated in some reasonable way. And then let's extend that over
>     time.
>     This should give edam annotation a particular boost, I tend to think.
>
>     You may be aware of an earlier email in which I described the Google
>     Spreadsheet
>     https://docs.google.com/spreadsheets/d/1tApLhVqxRZ2VOuMH_aPUgFENQJfbLlB_PFH_Ah_q7hM/edit?usp=sharing
>     to provide an overview on what packages (rows) are important for which
>     workflows (columns).
>
>     Now imagine what EDAM could do for us along those lines. @Jon,Matúš,
>     should we possibly do some bulk EDAM annotation just for the packages
>     on this "virus" tab? I suggest several iterations, like Topics first to
>     provide the summary scope and I/O+File formats second for the
>     individual executables once we know more about it all.
>
>     Best,
>
>     Steffen
>
>
>     On 25.07.20 16:33, Andreas Tille wrote:
>     > Hi Hervé
>     >
>     > On Sat, Jul 25, 2020 at 12:43:08AM +0200, Hervé Ménager wrote:
>     >> Hello Steffen, everyone,
>     >> This is to let you know I am currently attending the BCC virtual
>     >> hackathon (https://bcc2020.github.io/cofest/). With a couple of other
>     >> people, I have been working on the Tools Platform Ecosystem
>     >> (https://github.com/bio-tools/content/), continuing among other things
>     >> the work we have been doing with you and others over the years to
>     >> integrate metadata from Debian Med packages and bio.tools (among others).
>     >> I have finalized a first working version of this work, but I took the
>     >> liberty to modify the biotoolsConnect repository, esp. the script you
>     >> have created Steffen, to correct a few things. I did so on github
>     >> (https://github.com/bio-tools/biotoolsConnect) but I know this is a
>     >> mirror to a repository on Salsa.
>     > I do not think there is a mirror on Salsa.  I've just developed
>     > edam.sh there as a *simple example*.  I admit I'm a bit astonished
>     > that edamJson2biotools.py is parsing the output of this *example*
>     > script instead of using the query directly inside the Python script.
>     >
>     >> If you need me to do something to synchronize
>     >> with this other repository, please tell me (and tell me how).
>     > I think there is no need to synchronise.  It's not used there.
>     >
>     >> Thanks a lot for your hard work on this subject.
>     > You are welcome and thanks for your support
>     >
>     >      Andreas.
>     >
>

