
Re: UDD schema for new queue

On Fri, 13 Feb 2009, Lucas Nussbaum wrote:

-- Sources
CREATE TABLE new_sources (
       source text,
       version text,
       maintainer text,
       maintainer_name text,
       maintainer_email text,
       bin text,                  -- by parsing http://ftp-master.debian.org/new/<src>_<version>.html#dsc field "Binary:"

call it binaries? ask ftpmasters to export it to deb822?

It's fine for me to use binaries instead of bin.
I asked Gannef yesterday in IRC and he would accept a patch to DAK
(https://ftp-master.debian.org/git/dak.git/).  I had a look into
which creates the html pages of the NEW queue and I learned that
it will be much less effort for me to parse the html result than
to dive into DAK internals.  So if you are able to convince
ftpmaster to export to deb822 this would be definitely a better
solution but I can not spend the time I would need for the "right"

       closes int,                -- WNPP bug #

I think that a given NEW upload can close several bugs. (Think of
packages in NEW because of new binary packages, not just new source

Sure.  But I would look up the bug number in UDD and check whether
it is a wnpp bug and contains the magic string ITP and the
name of the package.  Would you consider this sufficient?
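
A minimal sketch of that check, assuming the bug data has already been
fetched from UDD into plain dictionaries.  The keys "package" and
"title" and all function names are hypothetical and do not reflect
UDD's actual schema.  It also handles an upload that closes several
bugs by checking each number on its own.

```python
import re


def parse_closes(field):
    """Split a Closes-style field such as "515000 515123" into ints."""
    return [int(n) for n in re.findall(r"\d+", field)]


def is_itp_for(bug, source):
    """True if the bug is filed against wnpp and its title mentions
    both ITP and the package name as a whole word."""
    title = bug["title"]
    # Package names may contain ".", "+" and "-", so treat those as
    # word characters when looking for the name boundaries.
    name = r"(?<![\w.+-])" + re.escape(source) + r"(?![\w.+-])"
    return (
        bug["package"] == "wnpp"
        and re.search(r"\bITP\b", title) is not None
        and re.search(name, title) is not None
    )


def wnpp_bugs(closes_field, bugs_by_id, source):
    """Return the closed bug numbers that look like ITP bugs for source."""
    return [
        n
        for n in parse_closes(closes_field)
        if n in bugs_by_id and is_itp_for(bugs_by_id[n], source)
    ]
```

Bugs that are closed by the upload but are not wnpp ITPs are simply
dropped from the result, so "closes" could stay a single column or
become a list without changing the check.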

If there is any better method to obtain the fields above than
parsing HTML pages I would be really happy if you could enlighten

ask ftpmasters to export the data you want, if it's already available

My initial mail, your mail and this mail are all CCed to ftpmaster.  The
fact that Gannef told me he would accept a patch lets me assume that
they are in principle open to including the code, but that it does
not exist yet (otherwise Gannef would probably have told me).  So would
you think that a gatherer which parses the HTML pages would do the
job for the moment and can be enhanced later?
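
To give an idea of how small such a gatherer could be, here is a rough
sketch for the "Binary:" field only.  It assumes that, once the tags
are stripped, the field sits on a single line with comma-separated
package names; the real page layout may differ, and the function name
binaries_from_html is made up for this example.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text content of an HTML page, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def binaries_from_html(page):
    """Return the names listed in the first "Binary:" field, or []."""
    extractor = _TextExtractor()
    extractor.feed(page)
    extractor.close()
    text = "".join(extractor.chunks)
    # Assumption: the field and its value share one line of text.
    match = re.search(r"^[ \t]*Binary:[ \t]*(.+)$", text, re.MULTILINE)
    if match is None:
        return []
    return [n.strip() for n in match.group(1).split(",") if n.strip()]
```

If ftpmaster later exports deb822, only this function would need to be
replaced; the rest of the import into UDD could stay as it is.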

Or should we rather start to just move the information of
into UDD for the moment and "wait".

I'm not sure if all the fields are really that useful... But if they are
there, it's true that it's not that hard to import them as well.

I personally do not need most of the details - I'm basically interested
in binary package names, version, homepage, description and long_description.
But once I have gone that far, the remaining things are cheap and IMHO fit
the idea of UDD.  If you think they should rather be left out - that's
fine for me.

Kind regards


