
Re: "People who installed X also have packages Y, Z and T installed"



On Fri, Mar 02, 2007 at 11:00:56PM +0100, Martin Zobel-Helas wrote:

> > Although I think the idea is nice, I don't think the current data is all 
> > that usable. Some examples of "links" that IMO are completely useless in 
> > practice, just from the top of the file and not even complete for the 
> > selected packages:
> > 3dchess: kworldclock
> > 3ddesktop: module-assistant, devscripts
> > 915resolution: linux-image-2.6.18-3-686
> > 9base: xserver-xephyr, libxml2-dev
> > IMO some heavy filtering needs to be done for this data to be anything 
> > more than a toy and publishable.
> I agree with you that we need to do some data filtering here. OTOH this
> is intended to be an "Amazon"-like feature, where not every package needs
> to stand in any relation to the other package. What needs to be done is
> some filtering, so that not every package lists libfoo or bar-common, but
> I guess Enrico has already done so.

So far I've seen two causes for bad suggestions:

 1) Suggestions for a package that is too popular tend to be
    meaningless: when I query Xapian with, for example,
    "please give me 20 typical systems that have 'grep' installed", I
    get essentially random systems, since every system has grep
    installed.
    This *might* be detectable by looking at Xapian's relevance
    estimate, which I'd expect to be low in cases like this.

 2) Packages being partially tagged.  I normally filter out all the
    libfoobar0 and foo-common packages using this tag expression:
    '!role::shared-lib && !role::app-data'
    but this filtering fails if a -common package, for example, is not
    well tagged and does not have the role::app-data tag.
    This is, afaict, only solvable by fixing the tags.
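
To make the two failure modes concrete, here is a minimal Python sketch.
It is not the code behind the suggestions: the sample data and the names
TAGS, SYSTEMS, passes_role_filter and too_popular are made up for
illustration, and the popularity check uses the share of systems that
have a package installed as a stand-in for Xapian's relevance estimate,
which is an assumption on my part.

```python
# Hypothetical sample data: package name -> set of debtags.
TAGS = {
    "libfoobar0": {"role::shared-lib"},
    "foo-common": {"role::app-data"},
    "bar-common": set(),  # partially tagged: role::app-data is missing
    "foo": {"role::program"},
}

# Hypothetical sample data: one set of installed packages per system.
SYSTEMS = [
    {"grep", "foo", "libfoobar0", "foo-common"},
    {"grep", "foo", "bar-common"},
    {"grep", "libfoobar0"},
    {"grep", "bar-common"},
]


def passes_role_filter(tags):
    """Equivalent of the tag expression '!role::shared-lib && !role::app-data'."""
    return "role::shared-lib" not in tags and "role::app-data" not in tags


def too_popular(package, systems, max_share=0.9):
    """Return True if the package is installed on more than max_share of systems.

    Assumption: installation share is used here as a simple proxy for a low
    relevance estimate; the 0.9 threshold is arbitrary.
    """
    if not systems:
        return False
    installed = sum(1 for pkgs in systems if package in pkgs)
    return installed / len(systems) > max_share


# Cause 2: the untagged bar-common slips through the role filter.
kept = sorted(name for name, tags in TAGS.items() if passes_role_filter(tags))
print(kept)  # ['bar-common', 'foo']

# Cause 1: grep is on every system, so querying for it selects nothing useful.
print(too_popular("grep", SYSTEMS))  # True
print(too_popular("foo", SYSTEMS))   # False
```

Note that bar-common survives the filter only because its tag is
missing, which is why the second cause cannot be fixed on the query side.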

I'd be happy if people could find out more.

> Perhaps you can give some idea on how you would implement a better
> filtering.

> PS: I would really like to see that feature on packages.debian.org :)

Me too.  I wouldn't actually mind if we just added it as it is, marked
*experimental*, and saw where we can go from there.


Ciao,

Enrico

-- 
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico@debian.org>
