Re: Tasks pages (close to) fixed; Bibref does not seem to be updated automatically
On Mon, Feb 20, 2012 at 09:43:09AM +0100, Andreas Tille wrote:
>
> there are tools which assemble information for Sources.gz files - I guess
> this could be implemented if, say, 20% of the packages contain such a
> file.
In such a model, the packages need to be uploaded before Sources.gz is
updated. This is exactly what I aim to avoid by feeding the UDD with
Umegaya.
> > This is why I designed a push model. After updating debian/upstream for the
> > package 'foo', visit http://upstream-metadata.debian.net/foo/YAML-URL, and
> > Umegaya will refresh its information. (This will work after I transfer the
> > service to debian-med.debian.net; I really hope to do it this evening).
>
> I admit I do not trust that a developer will really do regular visits to
> http://upstream-metadata.debian.net/foo/YAML-URL or any similar URL.
Note that anybody can trigger a refresh. For instance, I ran this command to
load all the upstream metadata for the packages known to debcheckout that
are recommended by one of our tasks:
for package in $(svn cat svn://svn.debian.org/blends/projects/med/trunk/debian-med/debian/control |
                 grep Recommends |
                 sed -e 's/,//g' -e 's/|//g' -e 's/Recommends://g')
do
    curl http://upstream-metadata.debian.net/$package/Name
done
I can set up a cron job along these lines, in addition to VCS hooks.
> BTW, it came to my mind that we should also gather
> fields from debian/copyright if it is DEP5 compatible. I specifically
> consider Upstream-Contact a very valuable field and at a later stage I
> would even ask for a lintian check "Upstream-Contact is missing" or
> something like this.
I actually opposed, without success, the inclusion of the Upstream-Contact
and Upstream-Name fields in DEP 5, as they usually do not contribute to
respecting the package's redistribution terms, which is the purpose of the
Debian copyright file.
The debian/upstream file features Contact and Name fields that can be used
for the same purpose.
> 1. scripts/fetch_bibref.sh
> fetches all available debian/upstream files and moves them to
> /org/udd.debian.org/mirrors/upstream/package.upstream
> I would like to stress the fact that I would fetch these
> files *unchanged* as they are edited by the author
> 2. udd/bibref_gatherer.py
> Just parse the upstream files for bibliographic information
> and push them into UDD
> This is the really cheap part of the job and I volunteer to
> do this in one afternoon.
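A gatherer along these lines could be sketched as follows. This is only a
minimal illustration, not the actual udd/bibref_gatherer.py, and it assumes
the fetched files sit as package.upstream in one directory; real
debian/upstream files are YAML, so a production version would use a proper
YAML parser rather than splitting "Key: value" lines:

```python
# Sketch of a bibliographic gatherer: scan a directory of verbatim
# debian/upstream files (package.upstream) and collect bibliographic
# fields into (package, field, value) rows ready for a database load.
# Directory layout and the line-based parsing are simplifying assumptions;
# the field names are those exported by upstream-metadata.debian.net.
import glob
import os

BIBLIO_FIELDS = {"DOI", "PMID", "Reference-Author", "Reference-Title",
                 "Reference-Journal", "Reference-Year", "Reference-Volume",
                 "Reference-Number", "Reference-Pages", "Reference-URL"}

def gather(directory):
    """Return (package, field, value) rows for every bibliographic field."""
    rows = []
    for path in sorted(glob.glob(os.path.join(directory, "*.upstream"))):
        package = os.path.basename(path)[:-len(".upstream")]
        with open(path) as handle:
            for line in handle:
                # Skip continuation lines and comments; real YAML parsing
                # would handle multi-line values properly.
                if ":" not in line or line.startswith((" ", "\t", "#")):
                    continue
                key, _, value = line.partition(":")
                if key.strip() in BIBLIO_FIELDS:
                    rows.append((package, key.strip(), value.strip()))
    return rows
```

Loading the resulting rows into a UDD table would then be a plain bulk
insert, independent of how the files were fetched.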
The problem with this approach is that it can only run on udd.debian.org,
which is quite loaded, if I understand correctly.
Regardless of the means, I provide a table that can be downloaded daily and
loaded into the UDD. That is how the gatherers I have seen so far work.
That the data transits through a Berkeley DB is just a detail, as
unimportant as which programming language processes it. What matters is the
final product: the table to be loaded.
> However, regarding practical usage of these data I do not see
> an application currently. You need a problem first which needs to be
> solved to invent something new.
The goal of the system is:
- Let the maintainer update the data without uploading the package,
- Gather data for our tasks pages. In addition to the bibliography,
  I think that the Registration and Donation fields, while rare,
  can be very useful for cooperating better with upstream.
http://upstream-metadata.debian.net/table/registration
http://upstream-metadata.debian.net/table/donation
> dh_bibref
>
> which turns debian/upstream data into a usable BibTeX database on the
> user's system. This is technically definitely not hard - it just needs
> to be *done*.
The challenge will be to have it run by default by Debhelper, but I think
this is indeed the right direction. In the meantime, such a tool will need
to produce a reference that is stored in the directory.
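To illustrate the idea, a dh_bibref-style conversion could start from a
sketch like the one below. The function name and the field-to-BibTeX
mapping are assumptions for illustration, not an existing tool, and the
Reference-* field names are those used by Umegaya:

```python
# Hypothetical sketch of the dh_bibref idea: turn the bibliographic
# fields of one debian/upstream file (here already parsed into a dict)
# into a BibTeX @article entry keyed by the package name.
# The mapping below is an illustrative assumption.
BIBTEX_MAP = {"Reference-Author": "author", "Reference-Title": "title",
              "Reference-Journal": "journal", "Reference-Year": "year",
              "Reference-Volume": "volume", "Reference-Pages": "pages",
              "DOI": "doi", "Reference-URL": "url"}

def to_bibtex(package, fields):
    """Format the bibliographic fields of one package as a BibTeX entry."""
    lines = ["@article{%s," % package]
    for upstream_key, bibtex_key in BIBTEX_MAP.items():
        if upstream_key in fields:
            lines.append("  %s = {%s}," % (bibtex_key, fields[upstream_key]))
    lines.append("}")
    return "\n".join(lines)
```

Concatenating such entries for every installed package would give the
per-system BibTeX database described above.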
> A. Gather *all* existing debian/upstream files and making sure they
> will be updated after at least 24h at a place where they can be
> fetched for UDD (I explicitly do not mean that we should do this
> via the web service, and I would really prefer not to take the detour
> of another database)
Currently I have the following cron job running on debian-med.debian.net:
@hourly for key in DOI PMID Reference-Author Reference-Eprint Reference-Journal Reference-Number Reference-Pages Reference-Title Reference-URL Reference-Volume Reference-Year References; do curl -s http://upstream-metadata.debian.net/yaml/$key; done > public_html/biblio.yaml
Therefore, the bibliographic data can now be accessed at the following URL.
http://upstream-metadata.debian.net/~plessy/biblio.yaml
[You may need to wait a bit for the DNS to propagate the new
IP for upstream-metadata.debian.net]
Let's see how it goes before deciding to redo everything from scratch with a
new design.
Cheers,
--
Charles