Re: make -j in Debian packages
On Thu, Jul 06, 2006 at 11:58:28PM +0200, Adam Borowski wrote:
> > > program X consists of a number of C files; it seems like compiling
> > > every file takes around 24MB,
> > Like I said, there are just too many variables. Also, I wouldn't be
> > interested in figuring out how much RAM the build takes if I were to
> > maintain a package like, say, X or so.
> No one forces you to do so. No one even said a word about making
> concurrency mandatory. It's just a way to make builds faster on
> machines that are beefy enough (i.e., virtually all).
Again, I don't think the important thing is to improve speed on machines
that are already fast, but to improve it on those that aren't.
When a *user* wants to recompile a package on his own machine, he *may*
want the extra speed as well, but I don't consider that a major problem.
Speeding up the build process on the buildds, however, would benefit the
whole project.
> > Before you've proven that this is indeed possible, I don't think
> > there's much point in this whole exercise; otherwise there
> > *is* going to be a problem with you overloading build machines, and
> > *you will* get bugs filed about that (from me, at the very least).
> Here, you got a piece of working code, is that not a good enough
> proof? Tell me how exactly a build using 4 processes of ~24MB
> each would overload a build machine which has 512+MB of RAM. And builds
> which do require 1+GB of memory, well, simply are not going to use
> any concurrency.
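In debian/rules terms I assume the guard you have in mind boils down to
something roughly like this (my own sketch, not your actual code; the
~24 MB per compiler process is your estimate and will vary wildly between
packages, and /proc/meminfo is obviously Linux-specific):

# Sketch: cap the number of parallel jobs by total RAM, assuming ~24 MB
# per compiler process, and never use more jobs than there are CPUs.
MEM_MB  := $(shell awk '/^MemTotal:/ { print int($$2 / 1024) }' /proc/meminfo)
NCPUS   := $(shell getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
NUMJOBS := $(shell j=$$(( $(MEM_MB) / 24 )); \
            [ $$j -gt $(NCPUS) ] && j=$(NCPUS); \
            [ $$j -lt 1 ] && j=1; echo $$j)
# ... and the build target would then call $(MAKE) -j$(NUMJOBS).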
Somehow I get the impression you're overestimating the average buildd.
The following query lists the average RAM (in MB) per arch for the buildds
registered on buildd.net:
buildd=# select arch, count(*) as num, round(avg(ram)) as "avg RAM" from
status group by arch order by arch;
arch | num | avg RAM
----------------+-----+---------
alpha | 5 | 1178
amd64 | 3 | 2219
arm | 9 | 56
armeb | 3 | 32
hppa | 2 | 896
hurd-i386 | 1 | 248
i386 | 3 | 512
ia64 | 2 | 4096
kfreebsd-amd64 | 1 | 512
kfreebsd-i386 | 2 | 512
m32r | 1 | 128
m68k | 24 | 148
mips | 11 | 135
mipsel | 5 | 96
powerpc | 4 | 336
s390 | 4 | 256
sparc | 7 | 969
(17 rows)
When you look at arches such as arm, armeb and mipsel, you'll see that your
assumption of 512 MB of RAM for a buildd is a bit off from reality. Even
the s390 buildds have only 256 MB of RAM. Alpha, amd64 and ia64 are the
exceptions, I think, especially ia64.
Additionally, you have to consider that some buildds are also accessible to
DDs, e.g. crest.d.o, so on those boxes DDs may well be compiling their own
packages in parallel with the buildd, giving 3-5 concurrent compilers in
total.
> And note that the system I propose actually _limits_ concurrency in
> some cases. The whole thread started with gem packages choking the
> m68k build.
In fact, gem built just fine on that particular buildd. I know, because it
was one of my m68k buildds; I merely wondered about the parallel makes and
whether they are allowed, and asked Wouter about it, but otherwise the
build went through without problems.
> It's a big package, and the maintainer rightfully
> thought that it is completely useless on small arches. The
If he had thought that, he would have excluded those arches from the
architecture list.
And it's not even a big package. If you look at
http://buildd.net/cgi/ptracker.cgi?unstable_pkg=gem&searchtype=m68k you can
see that even armeb built it in a reasonable time (the armeb buildds have
just 32 MB, see above).
> optimization he employed was to use -j4 -- something that can't hurt
> a machine on arches where the package is usable. Too bad, the
> maintainer's fault was that he didn't protect the -j4 from arches
> which can't handle it. And handling this case, exceptionally rare in
> normal use, is what we can fix.
> Even you, in the initial post of this thread, proposed a way to
> enable concurrency in some cases, so this can't be such a bad idea :p
Well, Wouter asked whether the use of -jX is mentioned in policy or
similar and, if it isn't, how such packages should be handled.
Nobody disputes that concurrency *can* improve build times; the big
question is how to implement it in a sensible way.
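Just to make the question concrete, and purely as a strawman: if the
opt-in were expressed through DEB_BUILD_OPTIONS (say a hypothetical
"parallel=n" keyword, which is not something policy defines today), the
debian/rules side could be as small as this:

# Strawman only: enable -j solely when whoever runs the build passes a
# (hypothetical) parallel=n keyword in DEB_BUILD_OPTIONS.
ifneq (,$(filter parallel=%,$(DEB_BUILD_OPTIONS)))
NUMJOBS = $(patsubst parallel=%,%,$(filter parallel=%,$(DEB_BUILD_OPTIONS)))
MAKEFLAGS += -j$(NUMJOBS)
endif

The default would stay at -j1, and the knob would sit with whoever runs
the build rather than with the package itself.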
Arguing the same points over and over again won't get us there. I'm still
in favour of my proposal to use the existing infrastructure (for lack of
something better) and to work out the pros and cons of the different
approaches.
Once there are results, the project can decide which method to use, and
whether it wants to use one at all.
Any takers? My machines are quite busy at the moment and I'm in the process
of moving, so I'm not a good candidate myself, but I'd be willing to assist
in one way or another if I can.
--
Ciao... // Fon: 0381-2744150
Ingo \X/ SIP: 2744150@sipgate.de
gpg pubkey: http://www.juergensmann.de/ij/public_key.asc