Re: make -j in Debian packages
On Fri, Jun 30, 2006 at 03:22:48AM +0200, Goswin von Brederlow wrote:
> The same can't be said for upstream makefiles though. Many sources
> don't build with -j option. I'm not sure if debian/rules should
> somehow enforce -j1 in those cases or if only packages that benefit
> from -jX should add support for some DEB_BUILD_OPTION or
> CONCURENCY_LEVEL env var. The latter would always be the safe
> option. The former would have lots of hidden bugs that crop up
> suddenly on rebuilds or security fixes.
I'm strictly in favour of being on the safe side. If maintainers started
adding -j4 to their makefiles now, we would end up with more FTBFS bugs, I guess,
which is unnecessary and an extra burden for the porters and buildds.
If we want to introduce some sort of concurrent builds, we should do it in
a way that keeps the impact as small as possible (opt-in). Start with some
packages that are known to build fine with -jX and choose 1-2 buildds on
some archs to implement this feature as a small test.
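
To make the opt-in idea concrete, here is a minimal sketch of how a build could pick its -jX value: serial unless the builder explicitly asks for more. The `parallel=N` key inside `DEB_BUILD_OPTIONS` and the function name `make_jobs_flag` are assumptions for illustration only, not something agreed on in this thread.

```python
import os


def make_jobs_flag(env=None, default_jobs=1):
    """Return the -jN flag for make; serial unless the builder opts in."""
    env = os.environ if env is None else env
    jobs = default_jobs
    # Assumed format: space- or comma-separated options, e.g. "nostrip parallel=4".
    for option in env.get("DEB_BUILD_OPTIONS", "").replace(",", " ").split():
        key, _, value = option.partition("=")
        if key != "parallel":
            continue
        try:
            requested = int(value)
        except ValueError:
            continue  # ignore malformed values and stay on the safe side
        if requested >= 1:
            jobs = requested
    return f"-j{jobs}"


if __name__ == "__main__":
    print(make_jobs_flag())
```

A package that is not known to be parallel-safe would simply never call this and keep building with -j1.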
> Maybe someone should do a complete archive rebuild with -j1 and -j4 on
> a smp system and compare the amount of failures to get an overview how
> bad it would be.
Are there any m68k SMP machines?
No, just joking. ;)
I guess some amd64 machines are fast enough to get results in a
reasonable time. For further tests it would be nice if Martin M. (tbm)
would do a rebuild on his machine park. He has already done some archive
rebuilds in the past and is experienced with this.
This should show some pitfalls across the different archs.
If everything goes well, the procedure could be implemented on all buildds.
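
Evaluating such a test rebuild could be as simple as the sketch below. It assumes the results of each run are available as a mapping from package name to "built OK"; the function name, the data layout and the package names are all made up for illustration.

```python
def compare_rebuilds(serial, parallel):
    """Compare two rebuild runs given as {package: built_ok} mappings."""
    both = serial.keys() & parallel.keys()
    return {
        "serial_failures": sorted(p for p in both if not serial[p]),
        "parallel_failures": sorted(p for p in both if not parallel[p]),
        # Built with -j1 but failed with -j4: probably not parallel-safe.
        "parallel_only": sorted(p for p in both if serial[p] and not parallel[p]),
    }


# Made-up package names and results, for illustration only.
j1 = {"pkg-a": True, "pkg-b": True, "pkg-c": False}
j4 = {"pkg-a": True, "pkg-b": False, "pkg-c": False}
report = compare_rebuilds(j1, j4)
print(report["parallel_only"])
```

The "parallel_only" list is the interesting one: packages that fail in both runs are ordinary FTBFS bugs and say nothing about -jX.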
> > On the other hand, making builds significantly faster is not
> > something that you can shake a stick at. Typically make -jX is faster
> > even on uniprocessor, and I don't need to tell you why it's much
> > faster on SMP.
> > Too bad, a C++ build where every file takes 1GB memory obviously
> > should not be parallelized. Also, no one but the maintainer knows
> > whether a package is SMP-clean or not. You cannot guess this in an
> > automated way.
> That would point to using an env variable
I doubt that a single env variable would do the trick, as the discussion has
shown. There are some showstoppers for smaller machines, like huge C++
packages - at least when all processes have to run on the local machine.
It is a different matter when such processes are distributed to
other, more powerful machines with cross-compilers.
> > Thus, my counter-proposal:
> > Let's allow maintainers to use make -jX according to their common
> > sense, requiring obeying an env variable to opt out.
> and let the maintainer pass that env variable down a -jX.
> How about this option:
> We write a tool "concurency-helper" that gets a set of requirements of
> the source and outputs a suitable concurrency level for the current build
> host. Requirements could include:
> --package pkg
> Let the tool know what we build. The tool can have overrides in
> place for packages in case special rules apply.
> --max-concurency X || --non-concurent
> Limit the concurrency, possibly to 1, for sources that have problems
> with it. Although such sources probably just shouldn't support this.
> --ram-estimate X [Y]
> Give some indication of RAM usage. If the host has too little RAM
> the concurrency will be tuned down to prevent swapping. A broad
> indication +- a factor of 2 is probably sufficient. The [Y] would
> be to indicate RAM usage for 32bit and 64bit archs separately.
> Given the pointer size, RAM usage can vary between them a lot.
> Indicate that there are lots of small files that greatly benefit
> from interleaving I/O with cpu time. Try to use more concurrency
> than cpus.
> The tool would look at those indicators and the host's resources in
> both RAM and cpus and figure out a suitable concurrency level for the
> package from there.
> What do you think?
Of course this is a more complex approach, but I think it's a step in
the right direction. I would like to see settings for distcc added as well
(like --use-distcc and --distcc-hosts).
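
The core of such a helper could look roughly like the sketch below. It is only an illustration of the decision logic described above: the parameter names, the `io_bound` switch (standing in for the "lots of small files" hint) and the "one job more than cpus" rule are my assumptions, and a real tool would also have to detect the host's cpus and RAM itself.

```python
def concurrency_level(cpus, ram_mb, ram_estimate_mb=None,
                      max_concurrency=None, io_bound=False):
    """Pick a -jN level from host resources and the package's requirements."""
    level = max(1, cpus)
    if io_bound:
        # Assumption: one extra job to overlap I/O with cpu time.
        level += 1
    if ram_estimate_mb:
        # Never run more jobs than fit into RAM, to avoid swapping.
        level = min(level, max(1, ram_mb // ram_estimate_mb))
    if max_concurrency is not None:
        # Cap requested by the maintainer; 1 means "not parallel-safe".
        level = min(level, max(1, max_concurrency))
    return level


if __name__ == "__main__":
    # A 4-cpu host with 2 GB RAM building code that needs ~1 GB per job.
    print(f"-j{concurrency_level(cpus=4, ram_mb=2048, ram_estimate_mb=1024)}")
```

The RAM limit is applied after the I/O bonus on purpose, so that a small machine never gets pushed into swapping by the extra job.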
> > Rationale:
> > Nearly every buildd and nearly every user building the packages on
> > his own will benefit from -j2 , even on non-SMP. Unless it's a
> > piece of heavily-templated code, any modern box will have enough
> > memory to handle it. The maintainer knows whether the code is heavily
> > templated or not.
> Mips, mipsel, arm and m68k won't benefit. The ram requirement just
> leads to poor cache performance or even excessive swapping in
> general. Sources and gcc are growing and growing and the ram of the
> buildds stays the same.
> On the other hand any modern system will build the source fast and the
> buildd will be idle most of the time so -j2 or not hardly matters.
> Wouldn't that indicate a preference to -j1?
Erm. Most modern systems don't really need to speed up the
build process because they are fast enough to keep up easily anyway.
Using parallel make on these systems is just nice to have for a user who
may be rebuilding a package on his/her own system (and of course for the
package maintainer).
The slower archs would greatly benefit from any speed increase that can be
achieved - but of course it should not end in slower builds, which is a
risk because resources are limited on those archs. Therefore I would like
to see the possibility added to use distcc on those archs.
Ciao... // Fon: 0381-2744150
Ingo \X/ SIP: email@example.com
gpg pubkey: http://www.juergensmann.de/ij/public_key.asc