
Re: OMICtools of any use?



Hi David,

On 13.12.18 13:29, Andreas Tille wrote:
Hi David,

On Thu, Dec 13, 2018 at 12:15:11PM +0000, Carnë Draug wrote:
On Thu, 13 Dec 2018 at 08:11, Andreas Tille <andreas@fam-tille.de> wrote:
I noticed that you reverted a commit by Steffen Moeller in imagej adding
an OMICtools identifier.  For the moment I do not think it is nice to
simply remove the work of fellow DDs without a consensus on how to deal
with these data - thus I reverted that removal for the moment.
Please revert it again.  I did not remove it because I dislike
omics.  I reverted it because it's wrong.  I did it the first time
during the summer:

     https://salsa.debian.org/med-team/imagej/commit/415ff687c5

But it was added again.  I removed it yesterday for the same reason:

     https://salsa.debian.org/med-team/imagej/commit/a40be89995

In both cases I have explained in the commit message why it was wrong.
Uhmmm, sorry.  I should have read the full commit message.  I had only
read your e-mail here and seen your last commit.  Sorry for the
noise.  I've now dropped a comment inside the YAML file (and I cross my
fingers that my importer code is robust enough not to stumble over it
;-) ).

The OMICtools entry is about all versions. Just have a look at the references to the literature they give. You can argue that it should have two entries for two major versions. I don't see the need for that, I must admit. In my reading, the assignment was/would be just fine. The inaccuracy is not ours, it is OMICtools'. And many, me included, in this case regard it as a feature.

Apologies for having re-added the OMICtools ref, if I did; I have no idea. And I don't want to look back. I would nonetheless appreciate it if you would sleep on it and possibly decide to add it back in. :o)


In general I do not see any need to remove these data.  We should try to
find some consensus on how to deal with this situation.  Once we have this
consensus we can simply switch off the display of the OMICtools links on
our tasks pages (which is the only use I'm aware of) or even stop
importing them into UDD.
They are in the UDD.
This will effectively solve the problem you
mentioned without wasting the work of some team mates who have spent
hours to gather the data.  Simply assume OMICtools might change their
policy.  Do you want to re-add all the data to the packaging
information?

My own position on this is:

    1. We should talk to OMICtools people (a good chance might be the
       Debian Med sprint)
We have. And to the ones of bio.tools who even hosted our 2014 sprint.
    2. There are other kinds of non-free data (we are linking to
       publications and some of these are hidden behind a pay-wall)
       However, all information we provide can be gathered without
       paying and I would consider the pure IDs as free data.
    3. We try to build a system that is valuable for our users
       including users who are willing to pay for a service provided
       by a third party.

Yeah, but there's still a cost of maintaining the metadata.  And one
can't maintain it properly if one can't access the data to check.  For
example, yesterday I couldn't check if the ID on omics was correct.  I
knew it was incorrect because it was the same one that I had removed
earlier in the year.  But if this had happened a few months ago, when
I had to check it for the first time, the package would still have
incorrect metadata.

Anyway, I found out that the data in the omics platform is not that
freely available.  And it takes time to gather that data and fill in the
metadata files in Debian packages.  That's basically what I wanted to
convey in the first email.  I'm not making a call to remove the data,
just passing on the message so people can decide if they want to
spend time on it.

That is exactly how it works. It is all optional.

To answer your original question about the use:

 * registries have a way to link back to Debian from their entries, increasing the awareness of a software and its availability

 * identification of a software across platforms - which works also without the registry itself, e.g. in recent journal publications

I personally like browsing through the registries a lot. Very educational. How exactly these identifiers will be used, nobody really knows yet, I think. One hears about workflows everywhere. Descriptions of containers/cloud images come to mind. The data or the infrastructure made public? bio.tools was just released with an Open Source license - let's see how this all develops.

Our task pages are not completely dissimilar to what the registries are offering. And if we want to add features to those then eventually we end up with what these registry folks are trying to establish today. So I see these domain-specific registries as a glimpse of the future of how we could have portals to our distribution.

To me it is all an experiment. We have three (and more) competing initiatives: SciCrunch, bio.tools and OMICtools. Let us see what ideas they come up with. The time for us to annotate d/u/metadata is negligible compared with the effort to mimic what these three already have. Just enjoy.

Cheers,

Steffen

I agree that the maintenance burden should not be put on you and this
kind of ping-pong was definitely a nuisance.  My guess as to why the data
re-appeared is that Steffen's workflow was:  Hmmm, there is no
OMICtools entry inside the packaging but I've found one and simply added
it without checking the history.  I hope the current entry will prevent
this in future.

Thanks for your patience and your explanation

      Andreas.


