Hi David, On 13.12.18 13:29, Andreas Tille wrote:
Hi David,
On Thu, Dec 13, 2018 at 12:15:11PM +0000, Carnë Draug wrote:
On Thu, 13 Dec 2018 at 08:11, Andreas Tille <andreas@fam-tille.de> wrote:
I noticed that you reverted a commit by Steffen Moeller in imagej adding an OMICtools identifier. For the moment I do not think it is nice to simply remove the work of fellow DDs without a consensus on how to deal with these data - thus I reverted that removal for the moment.
Please revert it again. I did not remove it because I dislike omics. I reverted it because it's wrong. I did it the first time during the summer: https://salsa.debian.org/med-team/imagej/commit/415ff687c5 But it was added again. I removed it yesterday for the same reason: https://salsa.debian.org/med-team/imagej/commit/a40be89995 In both cases I have explained in the commit message why it was wrong.
Uhmmm, sorry. I should have read the full commit message. I had only read your e-mail here and seen your last commit. Sorry for the noise. I've now dropped a comment inside the YAML file (and I cross my fingers that my importer code is robust enough not to stumble over it ;-) ).
The OMICtools entry is about all versions. Just have a look at the references to the literature they give. You can argue that it should have two entries for two major versions. I don't see the need for that, I must admit. In my reading, the assignment was/would be just fine. The inaccuracy is not ours, it is OMICtools'. And many, me included, regard it as a feature in this case.
Apologies for having re-added the OMICtools ref, if I have, no idea. And I don't want to look back. I would nonetheless appreciate it if you would sleep on it and possibly decide to add it back in. :o)
In general I do not see any need to remove these data. We should try to find some consensus on how to deal with this situation. Once we have this consensus we can simply switch off the display of the OMICtools links on our tasks pages (which is the only use I'm aware of) or even stop importing them into UDD.
They are in the UDD.
This will effectively solve the problem you mentioned without wasting the work of some teammates who have spent hours gathering the data. Simply assume OMICtools might change their policy. Do you want to re-add all the data to the packaging information? My own position on this is:
1. We should talk to OMICtools people (a good chance might be the Debian Med sprint)
We have. And to the ones of bio.tools who even hosted our 2014 sprint.
2. There are other kinds of non-free data (we are linking to publications and some of these are hidden behind a paywall). However, all information we provide can be gathered without paying and I would consider the pure IDs as free data.
3. We try to build a system that is valuable for our users, including users who are willing to pay for some service provided by a third party.
Yeah, but there's still a cost of maintaining the metadata. And one can't maintain it properly if one can't access the data to check. For example, yesterday I couldn't check if the ID on omics was correct. I knew it was incorrect because it was the same one that I had removed earlier in the year. But if this had happened a few months ago, when I had to check it for the first time, the package would still have incorrect metadata. Anyway, I found out that the data in the omics platform is not that freely available. And it takes time to gather that data and fill in the metadata files of Debian packages. That's basically what I wanted to pass on in the first email. I'm not making a call to remove the data, just passing on the message so people can decide if they want to spend time on it.
That is exactly how it works. It is all optional. To answer your original question about the use:
* registries have a way to link back to Debian from their entries, increasing the awareness of a software and its availability
* identification of a software across platforms - which works also without the registry itself, e.g. in recent journal publications
I personally like browsing through the registries a lot. Very educative. How exactly these identifiers will be used, nobody really knows yet, I think. One hears about workflows everywhere. Descriptions of containers/cloud images come to mind. The data or the infrastructure made public? bio.tools was just released with an Open Source license - let's see how this all develops.
Our task pages are not completely dissimilar to what the registries are offering. And if we want to add features to those then eventually we end up with what these registry folks are trying to establish today. So I see these domain-specific registries as a glimpse of the future of how we could have portals to our distribution.
To me it is all an experiment. We have three (and more) competing initiatives: SciCrunch, bio.tools and OMICtools. Let us see what ideas they come up with. The time for us to annotate d/u/metadata is negligible compared with the effort to mimic what these three already have. Just enjoy.
Cheers, Steffen
I agree that the maintenance burden should not be put on you and this kind of ping-pong was definitely a nuisance. My guess as to why the data re-appeared is that Steffen's workflow was: "Hmmm, there is no OMICtools entry inside the packaging but I've found one", and he simply added it without checking the history. I hope the current entry will prevent this in future. Thanks for your patience and your explanation.
Andreas.