
Mining popcon data



On Wed, Nov 21, 2007 at 05:29:04AM -0800, Rudi Cilibrasi wrote:
> We would need:
> 
> 1) The total number of machines, M, using popcon.
> 2) A count of machines that have package i installed F(i) for all packages i
> 3) A count of machines that have both package i and j installed F(i,j)
> for all packages i, j

From reading popcon.debian.org, I have the impression that one has to be
a DD to access the full database. I am also very interested in mining
popcon, especially because it may involve tools similar to those my
research needs these days. Shall we enquire whether a recommendation
plus a GPG-signed non-disclosure agreement would allow us to access
the data? But maybe before this, we should read what is promised to the
participants of popcon. If it is written "DD only", then whatever the
goodwill of the popcon admins, we cannot ask… (And were I a popcon
developer, this is the kind of thing I would promise to the
participants.)


> Using these three statistics, we can compute something that can relate
> the typical use of one package to another and put packages in larger groups
> for context.  This is the type of analysis that libcomplearn and libqsearch
> enable [1].
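
For illustration only: one measure that can be computed from exactly
these three statistics is the Normalized Google Distance of Cilibrasi
and Vitanyi. The thread does not say which measure is intended, so
treating M, F(i) and F(i,j) this way is an assumption, and the function
name and the counts below are hypothetical, not real popcon figures.

```python
import math


def package_distance(m, f_i, f_j, f_ij):
    """NGD-style distance between two packages (assumed measure).

    m    -- total number of machines reporting to popcon (M)
    f_i  -- machines with package i installed, F(i)
    f_j  -- machines with package j installed, F(j)
    f_ij -- machines with both i and j installed, F(i,j)

    Returns 0.0 for packages that always occur together and
    math.inf for packages that are never installed together.
    """
    if f_ij == 0:
        return math.inf
    log_i, log_j = math.log(f_i), math.log(f_j)
    denominator = math.log(m) - min(log_i, log_j)
    if denominator == 0:
        # Both packages are installed on every machine.
        return 0.0
    return (max(log_i, log_j) - math.log(f_ij)) / denominator


# Hypothetical counts: 1000 machines, i on 100, j on 10, both on 10.
print(package_distance(1000, 100, 10, 10))
```

Smaller values mean the two packages tend to be installed together; a
full pairwise matrix of such values is the kind of input a clustering
or network-drawing tool would need.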

And draw post-genomic-style networks… I would really enjoy that…

Have a nice day,

-- 
Charles


