Re: UDD gatherer for DDTP translations (Was: Extended descriptions size)
On Tue, Apr 7, 2009 at 9:49 AM, Andreas Tille <email@example.com> wrote:
> Well, I did not say that it is actually hard, and in UDD you can get this
> easily by
> SELECT md5(description || E'\n' || long_description || E'\n' ) AS md5
> FROM packages WHERE ...
Ok, I see why you're having trouble now; you're splitting up the
description in your DB and thus need to stick it back together. That
does indeed make the process a bit less reliable. The DDTP/DDTSS
treats the description as a single string, the exact string in the
Packages file (the Description field is a single entry in the file) so
we had no issues. By doing extra processing, such as splitting or
stripping parts of the string, it's quite possible you're performing a
non-invertible conversion, which would make matching harder later.
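
To make the non-invertibility concrete, here is a minimal sketch in Python. The sample description is made up, and the split/strip step is only my guess at the kind of processing a database import might do, not what UDD actually does. It also assumes the MD5 is taken over the field value exactly as it appears in the Packages file plus a trailing newline, which is what the query above suggests.

```python
import hashlib


def md5_hex(text: str) -> str:
    """MD5 of the UTF-8 encoded text, as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical Description field value as it might appear in a Packages
# file (note the trailing space on the first line and the leading space
# on every continuation line).
raw = "example tool \n A longer explanation.\n .\n Second paragraph.\n"

# An assumed lossy import: split off the first line, strip both parts.
short, _, long_part = raw.partition("\n")
short_stripped = short.strip()
long_stripped = long_part.strip()

# Gluing the parts back together does not reproduce the original string,
# so the checksum no longer matches.
rebuilt = short_stripped + "\n" + long_stripped + "\n"

print("strings equal:", raw == rebuilt)
print("md5 equal:    ", md5_hex(raw) == md5_hex(rebuilt))
```

Whether a real import loses information this way depends on exactly what gets stripped; the point is only that one dropped space is enough to break the match.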
It'd be nice if someone went over the version number stuff in
DDTP/DDTSS, since by and large it was never used (user display only,
and even then it wasn't accurate), so there's probably plenty of work
to do.
It might actually be easier to write a script which simply collected
Packages files from say snapshot.debian.org, calculated all the MD5
sums (you can extract the description field using a regex so it's easy
enough in Perl) and built a database of description MD5s and version
numbers. That would give a reliable mapping, far more reliable than
the DDTP/DDTSS is ever likely to do.
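
A rough sketch of such a script, in Python rather than Perl. Everything here is an assumption for illustration: the sample stanzas are invented, the function names are mine, fetching the files from snapshot.debian.org is left out, and the checksum is assumed to cover the Description field value plus a trailing newline.

```python
import hashlib
import re
from collections import defaultdict

# The Description field: its first line plus every continuation line
# (continuation lines start with a space or tab).
DESC_RE = re.compile(r"^Description: (.*(?:\n[ \t].*)*)", re.MULTILINE)


def simple_field(stanza, name):
    """Return the value of a single-line field, or None if absent."""
    m = re.search(rf"^{re.escape(name)}: (.*)$", stanza, re.MULTILINE)
    return m.group(1) if m else None


def description_md5(stanza):
    """MD5 of the untouched Description value plus a trailing newline."""
    m = DESC_RE.search(stanza)
    if m is None:
        return None
    return hashlib.md5((m.group(1) + "\n").encode("utf-8")).hexdigest()


def build_index(packages_text, index=None):
    """Map description MD5 -> set of (package, version) pairs."""
    if index is None:
        index = defaultdict(set)
    # Stanzas in a Packages file are separated by blank lines.
    for stanza in re.split(r"\n\s*\n", packages_text.strip()):
        digest = description_md5(stanza)
        package = simple_field(stanza, "Package")
        version = simple_field(stanza, "Version")
        if digest and package and version:
            index[digest].add((package, version))
    return index


# Invented sample data standing in for a downloaded Packages file.
SAMPLE = (
    "Package: foo\nVersion: 1.0-1\n"
    "Description: example tool\n A longer explanation.\n .\n Second paragraph.\n"
    "\n"
    "Package: foo\nVersion: 1.0-2\n"
    "Description: example tool\n A longer explanation.\n .\n Second paragraph.\n"
    "\n"
    "Package: foo\nVersion: 2.0-1\n"
    "Description: example tool, rewritten\n A new explanation.\n"
)

index = build_index(SAMPLE)
for digest, versions in index.items():
    print(digest, sorted(versions))
```

Calling build_index() repeatedly with the same index argument would accumulate results across many Packages files. Two versions sharing an unchanged description end up under the same MD5, which is the mapping I had in mind.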
Keep in mind that all dpkg frontends that handle descriptions work only
on the basis of the complete description string; I'm not sure anyone is
likely to switch to using versions.
Have a nice day,
Martijn van Oosterhout <firstname.lastname@example.org> http://svana.org/kleptog/