
Re: make -j in Debian packages



On Fri, Jun 30, 2006 at 08:41:33AM +0200, Ingo Juergensmann wrote:
> On Fri, Jun 30, 2006 at 03:22:48AM +0200, Goswin von Brederlow wrote:
> > The same can't be said for upstream makefiles though. Many sources
> > don't build with -j option.

Right, that's just what I said :p  It's upstream and the
maintainer who know whether -j is safe or not -- the end-user or
buildd are not expected to guess this.  What the end-user and the
buildd should get to decide is whether they want a fast or a
low-mem build.

> > I'm not sure if debian/rules should somehow enforce -j1 in those
> > cases

That would be opt-out, and everyone here seems to be against that.

Let's just not confuse the maintainer's opt-foo with the builder's
opt-foo.

> > or if only packages that benefit from -jX should add support for
> > some DEB_BUILD_OPTION or CONCURRENCY_LEVEL env var. The latter
> > would always be the safe option. The former would have lots of
> > hidden bugs that pop up suddenly on rebuilds or security fixes.

Hell yeah.

> I'm strictly in favour of being on the safe side. If maintainers
> would start now to add -j4 to their makefiles, we would end in more
> FTBFS bugs,

It's them who are supposed to have a clue whether their packages can
handle -j or not.  If they are wrong, well, unlike typical SMP
issues, this is actually a bug which is pretty likely to pop up in
the first build.  This means before the upload is actually done, or,
for abysmal maintainers, when the faster autobuilders get the package.


> When we want to introduce some sort of concurrency builds, we
> should do it that way that the impact is as small as possible
> (opt-in). Start with some packages that are known to build fine
> with -jX and choose 1-2 buildds on some archs to implement this
> feature for a small test.

Good idea.

That's why I started with a small but not tiny (1MB of sources)
package of very little importance.  It appears to build fine, and
when I built it on a box with 24MB of memory, the machine didn't
croak.

So, here: take either this test package (kbtin), or, even better,
take the snippet between the marked (####################) lines in
debian/rules and apply it to a bigger package.  I claim that it will
handle both small and big machines well by default, and obey
CONCURRENCY_LEVEL if it's set.  
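
To illustrate the kind of logic meant here, a minimal sketch follows.  It
is not the actual snippet from kbtin's debian/rules; the function name
pick_jobs, its argument order and the per-job memory figure used in the
usage line are assumptions made up for this example.  It defaults to one
job per CPU, caps that by how many jobs fit in RAM, and lets a numeric
CONCURRENCY_LEVEL override everything.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the snippet shipped in kbtin's debian/rules.
#
# pick_jobs CPUS MEM_MB PER_JOB_MB [OVERRIDE]
#   CPUS        number of processors on the build host
#   MEM_MB      total RAM of the build host, in MB
#   PER_JOB_MB  assumed RAM needed by one compile job, in MB
#   OVERRIDE    value of CONCURRENCY_LEVEL, may be empty
# Prints the number of parallel make jobs to use.
pick_jobs() {
    cpus=$1
    mem_mb=$2
    per_job_mb=$3
    override=${4:-}

    # An explicit, numeric CONCURRENCY_LEVEL from the builder always wins.
    case $override in
        ''|*[!0-9]*)
            ;;  # unset or not a plain number: fall through to the default
        *)
            if [ "$override" -ge 1 ]; then
                echo "$override"
                return 0
            fi
            ;;
    esac

    # Default: one job per CPU, capped by how many jobs fit in RAM.
    by_mem=$((mem_mb / per_job_mb))
    jobs=$cpus
    if [ "$by_mem" -lt "$jobs" ]; then
        jobs=$by_mem
    fi
    # Never go below a plain serial build.
    if [ "$jobs" -lt 1 ]; then
        jobs=1
    fi
    echo "$jobs"
}
```

On a Linux host the inputs could be gathered like this (again only a
sketch, with 128MB per job picked arbitrarily):

    pick_jobs "$(getconf _NPROCESSORS_ONLN)" \
        "$(awk '/^MemTotal:/ {print int($2/1024)}' /proc/meminfo)" \
        128 "${CONCURRENCY_LEVEL:-}"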

This is not "intelligent design", there is a clear way to prove my
claims false.

> > Maybe someone should do a complete archive rebuild with -j1 and -j4 on
> > an SMP system and compare the amount of failures to get an overview of
> > how bad it would be.

Ugh, wrong.  A "complete archive rebuild" won't tell us anything if
the change is done in debian/rules, and forcing -j is already known
to be a bad idea for any package that is not SMP-clean.

We're talking about allowing the maintainers to enable their packages
to make use of concurrency if they believe that:
1. their packages are SMP-safe
2. the amount of memory taken won't exceed roughly 128-256MB.

> > > Thus, my counter-proposal:
> > > Let's allow maintainers to use make -jX according to their common
> > > sense, requiring obeying an env variable to opt out.
> > and let the maintainer pass that env variable down a -jX.
> > How about this option:
> > We write a tool "concurency-helper" that gets a set of requirements of
> > the source and outputs a suitable concurrency level for the current
> > build host. Requirements could include:
> > --package pkg
> >    Let the tool know what we build. The tool can have overrides in
> >    place for packages in case special rules apply.
> > --max-concurency X || --non-concurent
> >    Limit the concurrency, possibly to 1, for sources that have problems
> >    with it. Although such sources probably just shouldn't support this.
> > --ram-estimate X [Y]
> >    Give some indication of ram usage. If the host has too little ram
> >    the concurrency will be tuned down to prevent swapping. A broad
> >    indication +- a factor of 2 is probably sufficient. The [Y] would
> >    be to indicate ram usage for 32bit and 64bit archs separately.
> >    Given the pointer size ram usage can vary between them a lot.
> > --more-concurent
> >    Indicate that there are lots of small files that greatly benefit
> >    from interleaving I/O with cpu time. Try to use more concurrency
> >    than cpus.
> > The tool would look at those indicators and the host's resources in
> > both ram and cpus and figure out a suitable concurrency level for the
> > package from there.
> > What do you think?
> 
> Of course this is a more complex approach, but I think it's an approach in
> the right direction. I would like to have settings for distcc to get added
> (like --use-distcc and --distcc-hosts).

A good idea.  My proposal was to keep it simple, at the cost of being
too cowardly when distcc is in use.

If the helper is stored in a common place, this would also avoid
duplicating the code in every package.
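
A rough sketch of how such a helper might turn the quoted options into a
number is below.  No such tool exists; this is only an illustration.  The
option names are spelled as in the proposal above, while --cpus and
--mem-mb are additions made up here so the host's resources can be passed
in explicitly.  It also assumes that --ram-estimate means MB per compile
job and that "more concurrency than cpus" means cpus + 1; the per-package
overrides and the separate 32/64-bit estimate are left out.

```shell
#!/bin/sh
# Hypothetical sketch of the proposed helper; not an existing Debian tool.
# --cpus and --mem-mb are assumptions of this sketch, not part of the
# proposal: they stand in for probing the build host.
concurrency_helper() {
    cpus=1
    mem_mb=0
    max=0          # 0 means "no limit given"
    ram_mb=0       # 0 means "no estimate given"
    more=no

    while [ $# -gt 0 ]; do
        case $1 in
            --cpus)            cpus=$2;   shift ;;
            --mem-mb)          mem_mb=$2; shift ;;
            --package)         shift ;;   # name consumed, overrides omitted
            --max-concurency)  max=$2;    shift ;;
            --non-concurent)   max=1 ;;
            --ram-estimate)    ram_mb=$2; shift ;;  # assumed: MB per job
            --more-concurent)  more=yes ;;
            *) echo "unknown option: $1" >&2; return 2 ;;
        esac
        shift
    done

    level=$cpus
    # Lots of small files: one extra job to overlap I/O with CPU time.
    if [ "$more" = yes ]; then
        level=$((cpus + 1))
    fi
    # Tune down so the estimated RAM use fits into the host's memory.
    if [ "$ram_mb" -gt 0 ] && [ "$mem_mb" -gt 0 ]; then
        fit=$((mem_mb / ram_mb))
        if [ "$fit" -lt "$level" ]; then
            level=$fit
        fi
    fi
    # Honour the upper limit requested by the package.
    if [ "$max" -gt 0 ] && [ "$max" -lt "$level" ]; then
        level=$max
    fi
    # Never go below a plain serial build.
    if [ "$level" -lt 1 ]; then
        level=1
    fi
    echo "$level"
}
```

A debian/rules file could then pass the printed number straight to
make -j, which is the "common place" benefit: the policy lives in one
script instead of being copied into every package.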

> > > Rationale:
> > > Nearly every buildd and nearly every user building the packages on
> > > his own will benefit from -j2 [2], even on non-SMP.  Unless it's a
> > > piece of heavily-templated code, any modern box will have enough
> > > memory to handle it.  The maintainer knows whether the code is heavily
> > > templated or not.
> > Mips, mipsel, arm and m68k won't benefit. The ram requirement just
> > leads to poor cache performance or even excessive swapping in
> > general. Sources and gcc are growing and growing and the ram of the
> > buildds stays the same.
> > On the other hand any modern system will build the source fast and the
> > buildd will be idle most of the time so -j2 or not hardly matters.
> > Wouldn't that indicate a preference to -j1?
> 
> Erm. Most modern systems don't necessarily need to speed up the
> build process because they are fast enough to keep up easily anyway.
> Using parallel makes on these systems is just nice to have for a user who
> eventually is rebuilding a package on his/her own system (and of course the
> package maintainer itself).
> The slower archs would greatly benefit from any speed increase that can be
> achieved - but of course it shouldn't end in slower builds because
> resources are limited on those archs. Therefore I would like to see the
> possibility added to use distcc on those archs.

Heh.  Ingo actually arguing my way -- cool!  I completely forgot that
distcc is a form of concurrency, a form that is not limited to a
single SMP box.


Anyway, "gem" was using concurrency in a way that not everyone was
happy with.  What I would want is having a common snippet that can be
put in debian/rules to speed up builds without harming small systems.

Meep?
-- 
1KB		// Microsoft corollary to Hanlon's razor:
		//	Never attribute to stupidity what can be
		//	adequately explained by malice.


