Bug#793404: massive waste of CPU time in debian/rules by inline commands
Control: block -1 by 793330
Hi!
On Thu, 2015-07-23 at 19:43:46 +0200, Eduard Bloch wrote:
> Package: general
> Severity: minor
> The problem: I see lots of $(shell ...) stuff. In boost, there are about
> 12 such calls. And they run dpkg-architecture or dpkg-parsechangelog or
> similar commands. When this was done just a couple of times (i.e. before
> dh(7)), that was acceptable. But now, it looks like debian/rules is called
> many, many times through dh. Making many, many calls of those inline
> commands. Wasting many, many CPU cycles. All that just to retrieve the
> same information all over again.
There are multiple culprits that pile up here:
1) The lazy and cached value initialization in
/usr/share/dpkg/architecture.mk and /usr/share/dpkg/buildflags.mk is
not effective. I had noticed it but had not yet checked whether the
problem was in the makefiles or in make itself. It appears to be a bad
interaction with the foreach, which defeats the lazy and cached
evaluation. I guess I'll try to make the foreach work, or revert to
an unrolled implementation.
2) debhelper's Dh_Lib.pm does not try to use existing dpkg-architecture
variables from the environment. Callers cannot rely on those being
present, but they are set when building with dpkg-buildpackage, so
using them when available would be an optimization. I'll file a bug
report about this.
3) Slow dpkg-parsechangelog implementation and usage:
> In the emulated m68k environment, it spends about half an hour (guessed,
> not measured) before starting the actual build, doing things like:
>
> | \_ /usr/bin/perl -w /usr/bin/dh build --with python2 --with python3
> | \_ /usr/bin/make -f debian/rules override_dh_auto_configure
> | \_ /bin/sh -c dpkg-parsechangelog | grep Version | cut -d' ' -f2
> | \_ /usr/bin/perl /usr/bin/dpkg-parsechangelog
> | | \_ /usr/bin/perl /usr/lib/dpkg/parsechangelog/debian -ldebian/changelog --file debia
3.1) As mentioned in the thread, callers can avoid the other shell
commands and pipes by using -S, as in «dpkg-parsechangelog -SVersion».
3.2) debian/rules (or debhelper/cdbs) will still call the program for
different changelog values. But dpkg-buildpackage has to parse the
current and previous entries anyway, so we could preset values for
those in the environment that could opportunistically be used by
debian/rules and debhelper/cdbs. A possible drawback is that
packages might accidentally rely on those variables w/o setting
them beforehand.
3.3) dpkg-parsechangelog supports other changelog formats, and those
are implemented by external parsers. This means it needs to scan
the changelog twice, and then parse+output+parse the data from
the parser. I've already implemented an optimization (to be
included in dpkg 1.18.2) that uses a builtin parser when the
format is forced to debian («dpkg-parsechangelog -Fdebian»),
which halves the execution time. I guess I can take this further
and use the builtin parser whenever the format turns out to be
debian, even when it is not forced.
And probably some others…
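
To make points 2), 3.1) and 3.3) a bit more concrete, here is a rough shell sketch of what a caller could do today. The function names are made up for illustration, and this is not what debhelper or cdbs actually implement. The only real interfaces used are «dpkg-architecture -q», «dpkg-parsechangelog -S» and «dpkg-parsechangelog -F». The speedup from -Fdebian assumes a dpkg that ships the builtin-parser optimization mentioned above.

```shell
#!/bin/sh
# Illustrative helpers only; all function names here are hypothetical.

# Point 2: prefer a dpkg-architecture variable that is already in the
# environment (dpkg-buildpackage exports them) and only spawn
# dpkg-architecture when it is missing.
arch_var() {
    # $1 must be a trusted variable name such as DEB_HOST_ARCH.
    eval "arch_var_value=\${$1:-}"
    if [ -n "$arch_var_value" ]; then
        printf '%s\n' "$arch_var_value"
    else
        dpkg-architecture -q"$1"
    fi
}

# Point 3.1: the pipeline from the report needs three processes and
# two pipes to extract a single field...
changelog_version_pipeline() {
    dpkg-parsechangelog | grep Version | cut -d' ' -f2
}

# ...while -S asks dpkg-parsechangelog for just that field.
changelog_version() {
    dpkg-parsechangelog -SVersion
}

# Point 3.3: additionally force the debian format, so that a dpkg with
# the builtin parser does not need to go through an external one.
changelog_version_debian() {
    dpkg-parsechangelog -Fdebian -SVersion
}
```

For example, «arch_var DEB_HOST_ARCH» just echoes the exported value when run under dpkg-buildpackage, and falls back to calling dpkg-architecture otherwise.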
Thanks,
Guillem