[Popcon-developers] popcon written in python



> Hello popcon-devs,
> 
> I have been working on a re-write of the popularity-contest application.
>  This originally started off as being a learning exercise to both test my
> reading of perl code and to have a new project for some python code.  I am
> mostly happy with the current state of what I am calling pypopcon and
> wanted to share my work:
> https://github.com/drwahl/pypopcon
> 
> There are some interesting (to me, at least) changes to this product as
> compared to the currently used popularity-contest code.

Hello David,

For reference: I (re)wrote popularity-contest in basic perl because perl-base is
Essential: yes, so that installing popcon does not affect the popularity
of other packages.

For that reason, it seems unlikely that your script reports the same results as
the standard popularity-contest perl script.

> First, pypopcon explores all the files that a package provides and checks
> for atime (instead of just a key binary a package provides).  I believe
> this increases the accuracy as some package ship with multiple binaries and
> the one that popularity-contest uses isn't always the most used binary from
> the package.

There are various files whose atimes are changed without any user action, for
example by cron jobs or dpkg hooks. Including those files in the list makes the
package atime meaningless.  This applies, for example, to all shared libraries,
all python modules, etc. Thus we use a regex to limit the list to 'safe' files.
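
A rough sketch of that kind of filtering in Python, to make the idea concrete.
The pattern below is only an assumption for illustration (programs in bin/sbin
directories); it is not the regex popularity-contest actually uses, and the
function names are made up:

```python
import os
import re

# Hypothetical pattern: only programs in bin/sbin directories count as 'safe'.
# This is NOT the regex used by popularity-contest.
SAFE_FILE = re.compile(r"/s?bin/[^/]+$")

def safe_files(paths):
    """Keep only the paths whose atime is assumed to reflect real use."""
    return [p for p in paths if SAFE_FILE.search(p)]

def latest_atime(paths):
    """Most recent atime among the safe files, or None if there is none."""
    atimes = []
    for path in safe_files(paths):
        try:
            atimes.append(os.stat(path).st_atime)
        except OSError:
            # File listed by the package but missing or unreadable: skip it.
            continue
    return max(atimes, default=None)
```

With such a filter, shared libraries and python modules never contribute an
atime, so a cron job touching them cannot make the package look used.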

Instead, the popcon backend uses the dependency graph to mark as voted all
packages that voted packages depend on (transitively).
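
The transitive marking can be sketched as a plain graph traversal. This is only
an illustration, not the backend's code; the function name and the shape of the
`depends` mapping (package name to list of dependencies) are assumptions:

```python
from collections import deque

def propagate_votes(voted, depends):
    """Return the voted packages plus everything they depend on, transitively.

    voted   -- iterable of package names that voted directly
    depends -- hypothetical mapping: package name -> list of its dependencies
    """
    marked = set(voted)
    queue = deque(marked)
    while queue:
        pkg = queue.popleft()
        for dep in depends.get(pkg, ()):
            if dep not in marked:
                # First time we reach this dependency: mark it and visit it later.
                marked.add(dep)
                queue.append(dep)
    return marked
```

A library that is never seen directly is still counted as voted as soon as
one voted package depends on it, and dependency cycles are harmless because
each package is marked only once.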

> Secondly, pypopcon is showing a pretty decent performance increase (and
> there is still room for more).  On my system, popularity-contest takes
> about 15 seconds to run whereas pypopcon is taking about 8 seconds to run.
>  One thing that is interesting about this metric is that pypopcon is
> actually getting the atime/ctime of more files than the perl
> popularity-contest script, so it's actually doing more work than
> popularity-contest is, and it is doing it in less time.

You need to split system time and user time in your benchmark.
The system time is very much dependent on file system performance.
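
One possible way to get the two figures separately from inside a Python
script is `os.times()`. This is a minimal sketch; `cpu_times` is a made-up
helper name, and the function being measured is whatever you want to benchmark:

```python
import os

def cpu_times(func, *args):
    """Run func(*args) and return (result, user_seconds, system_seconds)."""
    before = os.times()
    result = func(*args)
    after = os.times()
    # user: CPU time spent in the script itself.
    # system: CPU time spent in the kernel on its behalf (stat calls, I/O).
    return result, after.user - before.user, after.system - before.system
```

The `time` shell command gives the same split for the whole process, which is
the easier way to compare the perl and python scripts side by side.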

Cheers,
-- 
Bill. <ballombe at debian.org>

Imagine a large red swirl here. 


