Bug#917478: popularity-contest: Improve performance (4x faster)
On Sun, Jan 27, 2019 at 01:19:26PM +0100, Bill Allombert wrote:
> On Thu, Dec 27, 2018 at 11:19:45PM +0100, Benoît wrote:
> > Package: popularity-contest
> > Version: 1.67
> > Severity: minor
> > Tags: patch
> >
> > Dear Maintainer,
> >
> > For each installed package, popcon globs the complete list of files
> > in /var/lib/dpkg/info.
> > This is very slow: popcon takes more than a minute of CPU time on my
> > modest laptop, which is enough to start the fan.
> >
> > I'm attaching a patch that lists /var/lib/dpkg/info only once and
> > associates each .list file with its package.
> >
> > I don't see any difference in /usr/sbin/popularity-contest output.
> > And the CPU time goes from 1min08s to 0min14s.
>
> Hello Benoît,
>
> Thanks for your patch!
>
> I tried it and it outputs warnings for multiarch packages which have both
> an amd64.list and an i386.list, like
>
> /var/lib/dpkg/info/gcc-4.8-base:amd64.list
> /var/lib/dpkg/info/gcc-4.8-base:i386.list
>
> I get:
> Use of uninitialized value $_ in open at ./popularity-contest line 146,
> <PACKAGES> line 285.
>
> Do you understand what happens?
>
Not really.
But I can see that multiarch packages are processed several times, once
per architecture.
The part I don't understand is that processing a package once wipes its
file list, hence the undef on the next pass.
Adding a simple check that skips packages already processed solves this.
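
One plausible explanation, which is not confirmed anywhere in this thread:
in Perl, foreach aliases $_ to each element of the list it walks, and
while (<FILES>) assigns every line it reads to that same $_, ending with
undef at end of file. The original loop iterated over temporary values, so
this was harmless; the patched loop iterates over the arrays stored in
%pkgs_files, so the stored paths would be overwritten. The sketch below is
a minimal reproduction under that assumption. The temporary file and the
@aliased and @lexical arrays are hypothetical stand-ins, not part of
popularity-contest.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for a .list file under /var/lib/dpkg/info.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "/usr/bin/example\n/usr/share/doc/example\n";
close($tmp);

# Hypothetical stand-ins for one package's entry in %pkgs_files.
my @aliased = ($path, $path);
my @lexical = ($path, $path);

# Same shape as the patched loop: foreach aliases $_ to each stored
# element, and while (<FILES>) assigns each line, then undef at EOF,
# to that same $_.
foreach (@aliased) {
    open(FILES, '<', $_) or next;
    while (<FILES>) { }
    close(FILES);
}

# With a named loop variable the inner read still writes to $_, but $_
# is no longer an alias for the array element.
foreach my $file (@lexical) {
    open(FILES, '<', $file) or next;
    while (<FILES>) { }
    close(FILES);
}

my $undef_aliased = grep { !defined($_) } @aliased;
my $undef_lexical = grep { !defined($_) } @lexical;
print "aliased: $undef_aliased undefined, lexical: $undef_lexical undefined\n";
```

If this is indeed the cause, the first loop should leave both entries of
@aliased undefined while @lexical keeps its paths. Skipping packages that
were already seen, as the patch does, avoids the second pass entirely; a
named loop variable would be an alternative way to protect the stored
lists.
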
> Cheers,
> --
> Bill. <ballombe@debian.org>
>
> Imagine a large red swirl here.
--
Benoît Dejean
--- /usr/sbin/popularity-contest 2018-08-09 20:41:19.000000000 +0200
+++ ./popularity-contest 2019-02-10 10:25:14.353413546 +0100
@@ -119,6 +119,19 @@
close DIVERSIONS;
}
+my %pkgs_files = ();
+
+if (opendir(my $DPKG_DB, $dpkg_db))
+{
+ for my $e (readdir($DPKG_DB)) {
+ if ($e =~ m/^([^:]+) .*? \. list$/x) {
+ $pkgs_files{$1} ||= [];
+ push @{$pkgs_files{$1}}, "$dpkg_db/$e";
+ }
+ }
+ closedir($DPKG_DB);
+}
+
# Read dpkg database of installed packages
open PACKAGES, "dpkg-query --show --showformat='\${status} \${package}\\n'|";
while (<PACKAGES>)
@@ -127,8 +140,10 @@
my $pkg=$1;
my $bestatime = undef;
my $list;
+ # dpkg-query reports multiple times the same package for diff archs
+ next if $popcon{$pkg};
$popcon{$pkg}=[0,0,$pkg,"<NOFILES>"];
- foreach ("$dpkg_db/$pkg.list", glob("$dpkg_db/$pkg:*.list"))
+ foreach (@{$pkgs_files{$pkg}})
{
open FILES, $_ or next;
while (<FILES>)