[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#699083: lintian: optimize t/tests suite



On 2013-02-23 00:26, Niels Thykier wrote:
>>> >> Running Lintian on all our t/tests artifacts[3], this patch reduces the
>>> >> user time with about 6 minutes.  That is a drop from ~21 to ~15 minutes
>>> >> in time's output.  Real time seems to have dropped to 5m 30s from 6m
>>> >> {0..30}s.
>>> >>
>> > 
>> > This seems to have dropped to (still at -9):
>> > 
>> > [...]	
> 2m31.5s / 1m29.2s / 1m31.3s
> 
> 
> This was done at master being f14d0f7f, but as I recall the timings
> haven't changed at lot since fa546e3 + 76f1c80 (Jan 31st).
> 
> ~Niels

I tried doing this with -dd and used the attached script to list the
slowest collections by /average/ wall time[1].

$ ./sum-times.pl dd-t-tests-p6.log | head
                              || Count ||  Avg (ms)  || Total (s)
       coll/unpacked [source] ||   351 || 190.028892 || 66.700
coll/bin-pkg-control [binary] ||   487 ||  46.338446 || 22.567
          coll/index [binary] ||   487 ||  42.489078 || 20.692
      coll/file-info [binary] ||   487 ||  34.479538 || 16.792
      coll/file-info [source] ||   351 ||  32.337590 || 11.350
 coll/hardening-info [binary] ||   487 ||  27.380244 || 13.334
   coll/objdump-info [binary] ||   487 ||  23.571448 || 11.479
        coll/md5sums [binary] ||   483 ||  22.835720 || 11.030
       coll/unpacked [binary] ||   487 ||  20.396949 ||  9.933
$ ./sum-times.pl dd-t-tests-p6-2.log | head
                              || Count ||  Avg (ms)  || Total (s)
       coll/unpacked [source] ||   351 || 186.312866 || 65.396
coll/bin-pkg-control [binary] ||   487 ||  46.308622 || 22.552
          coll/index [binary] ||   487 ||  42.414431 || 20.656
      coll/file-info [source] ||   351 ||  42.363595 || 14.870
      coll/file-info [binary] ||   487 ||  33.938706 || 16.528
 coll/hardening-info [binary] ||   487 ||  27.201657 || 13.247
   coll/objdump-info [binary] ||   487 ||  23.908593 || 11.643
        coll/md5sums [binary] ||   483 ||  22.639464 || 10.935
       coll/unpacked [binary] ||   487 ||  20.975357 || 10.215


unpacked [source] is x3 times slower than the 2nd runner up.  I am also
a bit surprised hardening-info+objdump-info are so high, given that a
quick check suggests only 50-60 tests seems to have c-code in them.


If we were preload the Dpkg::Source::Package module (i.e. "use" instead
of "require") and use that module for extraction, the "coll/unpacked"
for source packages drop to about (2 different runs):

       coll/unpacked [source] ||   351 ||  66.030687 || 23.177
       coll/unpacked [source] ||   351 ||  68.872256 || 24.174

But for optimizing the test suite, this is mostly irrelevant[2].


objdump-info appears to have a "no-op" runtime of 0.015s-0.03s vs
hardening-info, which is down to 0.005s-0.015s.  I can drop the
objdump-info no-op runtime to about 0.002-0.004[3], but the "Total (s)"
only changes about a 1s for that.

Being stuck, I had a look at it from the opposite side.  If we are to
remove just a single minute of runtime, we have to take out about 0.171s
per time Lintian is invoked[4].
  So if we want the test suite under 30m user time, we probably have to
start optimizing outside Lintian itself.  In my experience, the test
runner (i.e. t/runtests) does not cause a real overhead on its own.  So
either it is time to optimize the package building process in specific
tests or in general (e.g. dpkg-gencontrol).

~Niels

PS: If you are wondering about md5sums being run 483 vs. 487 for
bin-pkg-control - the difference is caused by 4 udebs.


[1] NB: I increased the precision to "6" in $finish_timer in the
frontend - otherwise the error margin by summing would be:
   +/-0.001s * N  (or 351ms for any [source] entry)

That is, line 805-809 in frontend/lintian becomes:
"""
           $finish_timer =  sub {
                my ($start) = @_;
                my $diff = tv_interval ($start);
                return sprintf (' (%.6fs)', $diff);
            };
"""

[2] What we win is by not having to reload the Dpkg modules at each
unpack.  But the test suite normally runs lintian once per artifact, so
major gain is lost.

On a slightly related topic, the current "quiet" method used with
Dpkg::Source::Package does not quite work.  So this style also breaks 2
tests in t/tests... bah.

[3] By lazily starting the "xargs $helper | gzip" pipeline like I did
for coll/hardening-info and coll/strings in commit 76f1c80.

[4] Which pretty much translates to removing the entire cost of running
dpkg-source on every source package in the 351 tests.

Attachment: sum-times.pl
Description: Perl program


Reply to: