Re: Gathering package upstream meta-data in the UDD. (was: Re: more formally indicating the registration URL)
- To: Debian QA List <debian-qa@lists.debian.org>
- Cc: debian-blends@lists.debian.org
- Subject: Re: Gathering package upstream meta-data in the UDD. (was: Re: more formally indicating the registration URL)
- From: Charles Plessy <plessy@debian.org>
- Date: Sat, 6 Feb 2010 22:02:04 +0900
- Message-id: <20100206130204.GA27756@kunpuu.plessy.org>
- In-reply-to: <20100121150719.GB6206@an3as.eu>
- References: <20100112013826.GB32707@kunpuu.plessy.org> <20100112071247.GA26562@an3as.eu> <20100118005931.GB16674@kunpuu.plessy.org> <20100118110517.GB26360@an3as.eu> <20100118230819.GE26132@kunpuu.plessy.org> <20100119075804.GB15712@an3as.eu> <20100119135148.GA11328@kunpuu.plessy.org> <20100119142051.GA30267@an3as.eu> <20100121145431.GD3723@kunpuu.plessy.org> <20100121150719.GB6206@an3as.eu>
On Thu, Jan 21, 2010 at 04:07:19PM +0100, Andreas Tille wrote:
> On Thu, Jan 21, 2010 at 11:54:31PM +0900, Charles Plessy wrote:
> >
> > I will try to provide drafts for
> > the loading in UDD. But I never programmed in Python, so I do not expect it
> > will work out of the box. Hopefully, it will save you some typing.
>
> That's a good way to push me for helping you instead of waiting until I
> find time to do it from scratch. Just ask in case of trouble.
Hi Andreas and everybody,
today I took a couple of hours to study the UDD and python (and snakes and
Greek mythology, thanks to the Wikipedia syndrome). I attached to this email a
draft for a bibliographic reference gatherer, “bibref_gatherer.py”.
Although in my previous emails I described a tab-delimited export format from
the upstream-metadata.d.n system, I realised that this is not robust if a
field happens to contain a tab. Instead of re-inventing the wheel with
quoting mechanisms, I simply switched the exchange format to YAML.
http://upstream-metadata.debian.net/for_UDD/biblio.yaml
The above file contains (package, key, value) triples to be loaded into a table
of the UDD. They provide the information needed to feed the Blends web sentinel
with bibliographic information.
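
As an illustration of the robustness problem mentioned above, here is a minimal sketch of what happens to a triple whose value contains a tab. The package name, key and value are made up for the example. JSON from the Python standard library stands in for YAML here (a JSON array is also a valid YAML flow sequence), so this does not show the actual layout of biblio.yaml.

```python
import json

# Made-up triple whose value contains a literal tab character.
triple = ["examplepkg", "Reference-Title", "An example\ttitle"]

def tab_roundtrip(fields):
    """Join fields with tabs and split them again, without any quoting."""
    return "\t".join(fields).split("\t")

def quoted_roundtrip(fields):
    """Serialise with a quoting-aware format and parse the result back."""
    return json.loads(json.dumps(fields))

# The embedded tab adds a spurious fourth field to the tab-delimited record.
print(len(tab_roundtrip(triple)))

# The quoting-aware format escapes the tab, so the triple comes back intact.
print(quoted_roundtrip(triple) == triple)
```

The first function loses the field boundaries as soon as a value contains the delimiter, which is the case a quoting mechanism (or a format such as YAML) is meant to handle.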
Since I do not run a local copy of the UDD, I did not test the attached
gatherer. Please treat it as a stub. It is meant to be used with the following
patch to the UDD configuration file.
Index: config-org.yaml
===================================================================
--- config-org.yaml (revision 1680)
+++ config-org.yaml (working copy)
@@ -19,6 +19,7 @@
     ddtp: module udd.ddtp_gatherer
     ftpnew: module udd.ftpnew_gatherer
     screenshots: module udd.screenshot_gatherer
+    bibref: module udd.bibref_gatherer
     dehs: module udd.dehs_gatherer
     ldap: module udd.ldap_gatherer
     wannabuild: module udd.wannabuild_gatherer
@@ -528,6 +529,14 @@
   table: screenshots
   screenshots_json: /org/udd.debian.org/mirrors/screenshots/screenshots.json
 
+bibref:
+  type: bibref
+  update-command: /org/udd.debian.org/udd/scripts/fetch_bibref.sh
+  path: /org/udd.debian.org/mirrors/bibref
+  cache: /org/udd.debian.org/mirrors/cache
+  table: bibref
+  bibref_yaml: /org/udd.debian.org/mirrors/bibref/bibref.yaml
+
 wannabuild:
   type: wannabuild
   wbdb: "dbname=wanna-build host=localhost port=5433 user=guest"
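
For readers unfamiliar with how such a stanza is consumed, the following is a rough sketch of the `bibref` section once parsed into a Python dictionary, together with a check for the keys the draft gatherer reads. The dictionary is written out by hand rather than parsed from the file, and `missing_keys` is a hypothetical helper for illustration, not a function of the UDD code.

```python
# Hand-written equivalent of the `bibref` stanza after YAML parsing.
bibref_config = {
    "type": "bibref",
    "update-command": "/org/udd.debian.org/udd/scripts/fetch_bibref.sh",
    "path": "/org/udd.debian.org/mirrors/bibref",
    "cache": "/org/udd.debian.org/mirrors/cache",
    "table": "bibref",
    "bibref_yaml": "/org/udd.debian.org/mirrors/bibref/bibref.yaml",
}

def missing_keys(config, required):
    """Return the required keys that are absent from a source's config."""
    return [key for key in required if key not in config]

# The draft gatherer reads 'table' and 'bibref_yaml' from its configuration.
print(missing_keys(bibref_config, ["table", "bibref_yaml"]))
```

Checking for required keys before touching the database lets a misconfigured source fail early instead of after the target table has been emptied.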
Please tell me what you think about it, and whether you would like me to commit
all of it to the UDD sources.
Have a nice week-end,
--
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan
#!/usr/bin/env python

"""
This script imports bibliographic references from upstream-metadata.debian.net.
"""

from gatherer import gatherer
from sys import stderr, exit
from yaml import safe_load_all

online = 0

def get_gatherer(connection, config, source):
  return bibref_gatherer(connection, config, source)

class bibref_gatherer(gatherer):
  """
  Bibliographic references from upstream-metadata.debian.net.
  """

  def __init__(self, connection, config, source):
    gatherer.__init__(self, connection, config, source)
    self.assert_my_config('table')
    my_config = self.my_config

    # Empty the table and prepare the insert statement for the new data.
    cur = self.cursor()
    query = "DELETE FROM %s" % my_config['table']
    cur.execute(query)
    query = """PREPARE bibref_insert (text, text, text) AS
                 INSERT INTO %s
                 (package, key, value)
                 VALUES ($1, $2, $3)""" % (my_config['table'])
    cur.execute(query)

  def run(self):
    my_config = self.my_config
    cur = self.cursor()

    # Read the whole YAML export in one go.
    bibref_file = my_config['bibref_yaml']
    fp = open(bibref_file, 'r')
    result = fp.read()
    fp.close()

    # Assumption: each YAML document is one [package, key, value] triple.
    for res in safe_load_all(result):
      package, key, value = res
      row = {'package': package, 'key': key, 'value': value}
      query = """EXECUTE bibref_insert
                  (%(package)s, %(key)s, %(value)s)"""
      try:
        cur.execute(query, row)
      except UnicodeEncodeError as err:
        stderr.write("Unable to inject data for package %s. %s\n" % (package, err))
        stderr.write("--> %s\n" % (res,))

    cur.execute("DEALLOCATE bibref_insert")
    cur.execute("ANALYZE %s" % my_config['table'])

if __name__ == '__main__':
  # No main() is defined: the module is meant to be loaded through get_gatherer().
  stderr.write("bibref_gatherer is a UDD gatherer module, not a standalone script.\n")
  exit(1)
# vim:set et tabstop=2:
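
Since the draft above is untested, the delete-then-insert loading pattern it follows can be tried locally with a stand-in database. The sketch below is only an approximation under stated assumptions: it uses SQLite from the Python standard library instead of the PostgreSQL backend of the UDD, replaces PREPARE/EXECUTE with a parameterised `executemany`, and loads made-up rows. `load_triples` is a hypothetical helper, not part of the UDD.

```python
import sqlite3

def load_triples(conn, table, triples):
    """Replace the contents of `table` with (package, key, value) rows."""
    cur = conn.cursor()
    # The table name is interpolated because it comes from trusted
    # configuration; the row values are passed as bound parameters.
    cur.execute("DELETE FROM %s" % table)
    cur.executemany(
        "INSERT INTO %s (package, key, value) VALUES (?, ?, ?)" % table,
        triples,
    )
    conn.commit()
    cur.execute("SELECT COUNT(*) FROM %s" % table)
    return cur.fetchone()[0]

# In-memory stand-in for the bibref table, filled with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bibref (package TEXT, key TEXT, value TEXT)")
triples = [
    ("examplepkg", "Reference-Title", "An example\ttitle"),
    ("examplepkg", "Reference-Year", "2010"),
]
print(load_triples(conn, "bibref", triples))
```

Running the load twice leaves the same number of rows, because the table is emptied before each import, which mirrors the DELETE issued by the gatherer before it inserts the new data.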