Bug#663762: debian-policy: default for DEB_BUILD_OPTIONS=parallel=N



Hi Jakub, Russ,

On Sat, Mar 17, 2012 at 04:55:19PM -0700, Russ Allbery wrote:
> Jakub Wilk <jwilk@debian.org> writes:
> 
> > How should packages behave if there is no explicit "parallel=N" in
> > DEB_BUILD_OPTIONS? I saw two different approaches:
> 
> > 1) Behave (roughly) like if parallel=1 was set.
> 
> > 2) Be clever and try to guess the "right" level of parallelism, e.g. by
> > using "getconf _NPROCESSORS_ONLN" or parsing /proc/cpuinfo (ugh!).
> 
> > Could the Policy clarify which approach is correct? Thanks!
> 
> I think status quo is generally "do whatever upstream does by default,"
> which is slightly different than either of those.  It may be (2) if
> upstream is that smart, but it's different than (1) in that we don't
> actively suppress parallelization if upstream enables something by
> default.
> 
> My conservative inclination would be towards (1).

Speaking as someone who is upstream for a number of pieces of software
that are both packaged for Debian and built by non-Debian users in a
more 'conventional' manner -- and as someone who for quite a long time
now has been defaulting any non-trivial package that would benefit from
it to use _NPROCESSORS_ONLN (or something similar as appropriate for the
package in question) -- I was actually very interested in being the sort
of 'smart' upstream that Russ talks about, and looked at making this the
default for unpackaged builds some time ago as well ...

But it turns out, that's not actually such a smart thing for an upstream
build system to do.  The main reason that I abandoned it, despite wanting
all my users to benefit from fast build times, without wasting the multi-
core systems that most of them now own, is that it actually becomes quite
difficult to then let those users override that build system's choice of
parallel level.  There is no standard equivalent in autoconf to the D_B_O
parallel=N option.  So the only way to do that is to invent some other
non-standard thing.  Which in the simple case of ./configure && make -jN
seemed kind of backwards in terms of both convenience and common knowledge.

For packages though it's a different problem.  Having to explicitly set
D_B_O is not convenient (especially if you are throwing the package off
to a remote build machine somewhere), and a good value to set it to for
any random package you download may not be obvious either.
Especially for packages that may now be using the "simplified" helpers,
which could permit them building in parallel if that was specified, even
if the package was not actually confirmed to be safe for that.

So my inclination is that it should be the *packagers* who make a best
guess at this (much the same way we expect them to provide a sane default
configuration when the package is installed) -- and that it be done *in*
the packages, rather than encouraging upstreams to do this instead.
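As a rough sketch of that precedence (an explicit parallel=N always wins,
then the packager's own best guess, then the number of CPUs), something
like the following could work.  It is only an illustration: the function
names and the packager_default parameter are made up here, and a real
package would normally do this in debian/rules rather than in Python.

```python
import os


def parallel_from_options(options):
    """Return N from a 'parallel=N' word in a DEB_BUILD_OPTIONS-style
    string, or None if no usable value is present."""
    for word in options.split():
        if word.startswith("parallel="):
            value = word[len("parallel="):]
            try:
                n = int(value)
            except ValueError:
                continue  # ignore a malformed value, keep looking
            if n > 0:
                return n
    return None


def choose_jobs(options, packager_default=None):
    """Pick a job count: explicit user setting first, then the
    packager's (hypothetical) default, then the CPU count."""
    explicit = parallel_from_options(options)
    if explicit is not None:
        return explicit
    if packager_default is not None:
        return packager_default
    # os.cpu_count() may return None if the count cannot be determined.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(choose_jobs(os.environ.get("DEB_BUILD_OPTIONS", "")))
```

The point of the ordering is that whatever the package guesses, the user
can still override it with the one standard, well-known option.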


There are some packages (such as the toolchain set) which also take into
account things like available memory when deciding how deeply parallel
they should run.  Familiarity with the package is the key to providing
good defaults for casual users.

So while I'd probably agree that it would be a mistake for policy to
strongly mandate that all packages should do this -- I'd also strongly
object to policy outlawing it.  The future is filled with multi-core
processors, and clinging to a single-threaded past will just make
people wonder why they wasted their money on them.

My feeling is that it is reasonable to expect the project buildds to
explicitly set parallel=N to a value that is suitable for them, and
that it is reasonable to believe that when a 'home' user wants to build
a package from source, they want that build to finish as quickly as
possible so that they can get on with the reason they were building it
in the first place as soon as possible.  It's not like running make -j4
on a 4 core machine will actually bring it to its knees and make it
unusable for other simultaneous tasks -- any more than make -j1 does on
a single core machine.


One thing policy might usefully clarify though is the idea that the
parallel=N setting is an *upper* bound.  And that packages which know
they are going to consume a lot of memory or other resources with each
thread are free to use any number of threads *up to* N, but aren't
strictly constrained to only using either 1 or N, with nothing between
if they also have some other heuristic that may guide their choice.
I expect only a fairly small number of packages might need that extra
freedom though.
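To make the "upper bound" reading concrete, here is a minimal sketch of
a package capping its job count by available memory while never going
above the requested N.  The function name and the per-job memory figure
are assumptions for illustration only; a real package would pick its own
heuristic and its own way of measuring available memory.

```python
def bounded_jobs(requested, mem_available_mib, mem_per_job_mib):
    """Return a job count between 1 and 'requested' (the parallel=N
    value), limited by how many jobs fit in the available memory.

    mem_per_job_mib is a hypothetical per-job estimate that only the
    packager, knowing the package, could reasonably supply.
    """
    by_memory = mem_available_mib // mem_per_job_mib
    # Never exceed the requested N, and never drop below one job.
    return max(1, min(requested, by_memory))


if __name__ == "__main__":
    # parallel=8 requested, 4 GiB available, roughly 1 GiB per job
    print(bounded_jobs(8, 4096, 1024))
```

With those example numbers the package would run 4 jobs rather than 8,
which is still within what the user allowed.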


Anyhow, thanks for bringing this question up.  We discussed it at some
length on #d-d several months back and never really got to a consensus.
People were fairly evenly split over whether N=1 or N=N should be the default,
and about the only thing we were all in perfect agreement about was: "if
people don't like 'my' preferred default, they can set D_B_O parallel=
for themselves explicitly to override it!".  Which works equally well
whichever side of the debate you happened to be on :)


Not actively suppressing it if upstream does it isn't very different
from not prohibiting DDs from using it wisely (then cluebatting the
ones that don't on a case by case basis).  Except the latter means that
we *can* override it with a standard well-known option, which wouldn't
be true if this was instead delegated to the upstream build system to
make the 'smart' part of the decision.


Best,
Ron




