[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#793404: marked as done (massive waste of CPU time in debian/rules by inline commands)



Your message dated Thu, 03 Jan 2019 11:19:00 +0000
with message-id <ef300e96-64cc-15ff-c1a1-45053c921456@thykier.net>
and subject line Re: Bug#793404: massive waste of CPU time in debian/rules by inline commands
has caused the Debian Bug report #793404,
regarding massive waste of CPU time in debian/rules by inline commands
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
793404: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=793404
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: general
Severity: minor

Hello Fellow Maintainers,

please tell me that I am wrong or that I start fighting the windmills if
that's the case, but I have a general impression that something smells
in lots of debian/rules files nowadays and we need a concept to improve
that.

The problem: I see lots of $(shell ...) stuff. In boost, there are about
12 such calls. And they run dpkg-architecture or dpkg-parsechangelogs or
similar commands. When this was done a just couple of times (i.e. before
dh(7)), that's acceptable. But now, it looks like debian/rules is called
many, many times through dh. Making many, many calls of that inline
commands. Wasting many, many CPU cycles. All that just to retrieve the
same information all over again.

In the emulated m68k environment, it spends about half an hour (guessed,
not measured) before starting the actual build, doing things like:

|      \_ /usr/bin/perl -w /usr/bin/dh build --with python2 --with python3
|          \_ /usr/bin/make -f debian/rules override_dh_auto_configure
|              \_ /bin/sh -c dpkg-parsechangelog | grep Version | cut -d' ' -f2
|                  \_ /usr/bin/perl /usr/bin/dpkg-parsechangelog
|                  |   \_ /usr/bin/perl /usr/lib/dpkg/parsechangelog/debian -ldebian/changelog --file debia

I think we need to find a shortcut for this. The best idea I got after
short brainstorming is hacking fakeroot to provide a cache of stdout
data from certain commands.

If someone has a better idea or could point to an existing concept or
implementation, please tell me.

Regards,
Eduard.

--- End Message ---
--- Begin Message ---
Niels Thykier:
> Hi,
> 
> In the past 3½ years, several things have been improved and I am
> therefore taking the liberty of closing this bug against general
> (remaining issues as I understand it will be in individual packages).
> 

(with an actual -done this time; sorry for the noise).

> In particular, I think we have identified all major issues, solved most
> of them and triaged/assigned the remaining ones.
> 
> For individual packages, improvements are often a question of:
> 
>  * Migrating to "Rules-Requires-Root: no" where possible.
>  * Avoid calling dpkg-parsechangelog directly and instead use dpkg's
>    makefile snippets.
>  * Replace "comment only" override targets with completely empty
>    override targets.
> 
> (Not necessarily listed in order of "best performance for value" as that
> depends on exactly what the package does)
> 
> On Fri, 24 Jul 2015 12:19:36 +0200 Jakub Wilk <jwilk@debian.org> wrote:
>> * Jonas Smedegaard <dr@jones.dk>, 2015-07-23, 21:40:
>>>> One mistake boost makes is using ":=" instead of plain "=". Contrary 
>>>> to popular belief, the former almost always causes more evaluation of 
>>>> $(shell) stuff, specially when dh is involved.
>>> Could you elaborate on that?
>>
>> dpkg-buildpackage -B will run debian/rules 4 times: once to determine if 
>> build-arch exist, and once for every target: clean, build(-arch), 
>> binary-arch.
>>
>> dh adds even more debian/rules invocations. It runs it once every target 
>> (clean, build(-arch), binary-arch), and once for every override.
>>
>> So your ":=" variable will be evaluated 4 times, or 7+N times if you use 
>> dh.
>>
>> "=" variables will be evaluated only when they are used, which is less 
>> than 4 or 7+N in most cases.
>>
>> -- 
>> Jakub Wilk
>>
>>
> 
> This issue is still prevalent but we now have support for reducing the
> number of calls to debian/rules, which is "opt-in" for packages at the
> moment.
> 
> With "Rules-Requires-Root: no" (recently added to policy), you can
> remove two debian/rules invocation on the dpkg side and one on the dh
> side.  Which brings us down to:
> 
>   * 2 calls from dpkg (one for clean and one for the binary target)
>   * 2+N calls for dh to probe the rules file for override targets plus
>     the N (non-empty) overrides a package might have.
> 
> 
> Guillem Jover wrote:
>> Control: block -1 by 793330
>>
>> Hi!
>>
>> [...]
>>
>> There are multiple culprits that pile up here:
>>
>> 1) The /usr/share/dpkg/architecture.mk and /usr/share/dpkg/buildflags.mk
>>    lazy and caching value initialization is not effective. I had noticed
>>    it but had not yet checked if it was a problem with the makefiles or
>>    in make, etc. It appears is a bad interaction with the foreach, which
>>    defeats the lazy and cached evaluation. I guess I'll try to make the
>>    foreach work, or revert to an unrolled implementation.
>>
> 
> I believe this has been fixed now.
> 
>> 2) debhelper's Dh_Lib.pm does not try to use existing dpkg-architecture
>>    variables from the environment. Those should not be expected to be
>>    present, but when using dpkg-buildpackage they will be present so it
>>    would be an optimization. I'll file a bug report about this.
>>
> 
> This has been fixed now.
> 
>> 3) Slow dpkg-parsechangelog implementation and usage:
>>
> 
> Lintian now flags use of dpkg-parsechangelog in most cases and
> recommends people to migrate to the optimized makefile snippets in dpkg.
> 
> https://lintian.debian.org/tags/debian-rules-parses-dpkg-parsechangelog.html
> 
>>> In the emulated m68k environment, it spends about half an hour (guessed,
>>> not measured) before starting the actual build, doing things like:
>>>
>>> |      \_ /usr/bin/perl -w /usr/bin/dh build --with python2 --with python3
>>> |          \_ /usr/bin/make -f debian/rules override_dh_auto_configure
>>> |              \_ /bin/sh -c dpkg-parsechangelog | grep Version | cut -d' ' -f2
>>> |                  \_ /usr/bin/perl /usr/bin/dpkg-parsechangelog
>>> |                  |   \_ /usr/bin/perl /usr/lib/dpkg/parsechangelog/debian -ldebian/changelog --file debia
>>
>> 3.1) As mentioned in the thread, callers can avoid the other shell
>>      commands and pipes by using -S.
>>
> 
> (Will be handled by the lintian tag)
> 
>> 3.2) debian/rules (or debhelper/cdbs) will still call the program for
>>      different changelog values. But dpkg-buildpackage has to parse the
>>      current and previous entries anyway, so we could preset values for
>>      those in the environment that could opportunistically be used by
>>      debian/rules and debhelper/cdbs. A possible drawback is that
>>      packages might accidentally rely on those variables w/o setting
>>      them beforehand.
>>
> 
> This has not been implemented.  However, debhelper has implemented two
> features to improve performance here:
> 
>  * debhelper now uses the Dpkg module internally instead of forking
>    dpkg-parsechangelog, which reduces a bit of runtime.
> 
>  * debhelper now caches the result from d/changelog so we at most parse
>    that file once per helper needs to know the version ($dh{VERSION}).
>    (Down from "one per binary package built per helper needing
>    $dh{VERSION}")
> 
> In the default sequence, the only (non-error) case of $dh{VERSION}
> appears in dh_makeshlibs in the -VUpstream-Version case (which is the
> default).  Thus only source packages with shared libraries are affected
> (there is considerably amount of those but it is far from "every package").
> 
>> 3.3) dpkg-parsechangelog supports other changelog formats, and those
>>      are implemented by external parsers. This means it needs to scan
>>      the changelog twice, and then parse+output+parse the data from
>>      the parser. I've already implemented an optimization (to be
>>      included in dpkg 1.18.2) when forcing the format to be debian,
>>      that uses a builtin parser, which halves the execution time.
>>      «dpkg-parsechangelog -Fdebian». I guess I can take this further
>>      and use the builtin parser whenever the format is debian.
>>
> 
> To my knowledge, Guillem has implemented an improvement that makes the
> default case faster.
> 
> Thanks,
> ~Niels
> 

--- End Message ---

Reply to: