On Wed, Jan 19, 2011 at 10:54:44AM +1100, Silvio Cesare wrote:
> I have generated a list of roughly equivalent packages between Linux
> distributions (currently Debian 5 and Fedora 13). The list is
> automatically generated.
[...]

Hi Silvio,

thank you for your work, it is extremely valuable. I'm currently at a
cross-distro meeting on app installers[1], and this is precisely what
we've been working on today. I'd be very interested in exchanging
algorithms with you.

The main use case we have in mind is being able to fall back on other
distros when a package is missing some piece of information. For
example:

 - does package $foo have a screenshot in Debian?
 - if not, how about in Fedora?
 - if not, how about in OpenSUSE?
 - if not, how about in Mandriva?

The example uses screenshots, but it could be other kinds of metadata,
like categories (it's a way, for example, to port at least some of
Debtags to other distros), ratings or user comments.

The heuristics I've been implementing so far are:

 - trivial package name matching
 - 'stemming' specific kinds of package names
   (debian:libfoo-dev->foo; fedora:foo-devel->foo)
 - matching packages that contain the same .desktop files or the same
   pkg-config files
 - similarity matching of file lists

I still don't have results because the implementation is not complete,
but I should have something in a day or two. You have something
*today*, which is, wow.

Tomorrow (Friday) I'll download your dataset and try to add another
heuristic that just uses it. It'll also be interesting to use all these
methods to cross-validate each other.

[1] http://distributions.freedesktop.org/wiki/Meetings/AppInstaller2011


Ciao,

Enrico

-- 
GPG key: 4096R/E7AD5568 2009-05-08 Enrico Zini <enrico@enricozini.org>
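
The name-stemming and file-list heuristics listed in the message could be sketched roughly as follows. This is only a minimal illustration, not the implementation the message refers to: the rule table, the function names (stem, match_by_stem, jaccard) and the choice of Jaccard similarity for comparing file lists are all assumptions. Only the two stemming examples (Debian lib*-dev and Fedora *-devel) come from the message itself.

```python
import re

# Hypothetical stemming rules, one list per distro. Only these two patterns
# are taken from the message; a real rule set would need many more.
STEM_RULES = {
    "debian": [(re.compile(r"lib(.+)-dev"), "dev")],
    "fedora": [(re.compile(r"(.+)-devel"), "dev")],
}


def stem(distro, name):
    """Return (kind, stem) for a package name.

    Development packages become ("dev", "foo"); anything that matches no
    rule is returned unchanged as ("plain", name). Keeping the kind avoids
    pairing a development package with an unrelated plain package.
    """
    for pattern, kind in STEM_RULES.get(distro, []):
        m = pattern.fullmatch(name)
        if m:
            return (kind, m.group(1))
    return ("plain", name)


def match_by_stem(debian_names, fedora_names):
    """Map each Debian package to the Fedora packages with the same stem."""
    fedora_index = {}
    for name in fedora_names:
        fedora_index.setdefault(stem("fedora", name), []).append(name)
    matches = {}
    for name in debian_names:
        key = stem("debian", name)
        if key in fedora_index:
            matches[name] = fedora_index[key]
    return matches


def jaccard(files_a, files_b):
    """Similarity of two file lists: shared paths divided by all paths.

    Jaccard similarity is an assumption here; the message only says
    "similarity matching of file lists".
    """
    a, b = set(files_a), set(files_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Calling match_by_stem(["libfoo-dev", "bar"], ["foo-devel", "baz"]) pairs libfoo-dev with foo-devel and leaves bar unmatched. The file-list score could then be used to confirm or reject such candidate pairs, which is one way the heuristics might cross-validate each other.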