

On Tue, Feb 28, 2023 at 11:10:24PM +0100, Philipp Kern wrote:
> On 28.02.23 20:34, Steve Langasek wrote:
> > But it's not practical to do CI -Werror builds; when we do
> > out-of-archive rebuilds for all architectures, it's a significant
> > commitment of resources and each rebuild takes about a month to
> > complete (on the slowest archs).  And to be able to effectively
> > analyze build results to identify Werror-related failures with high
> > signal would require two parallel builds, one with and without the
> > flag, built against the same baseline.
> That you are so resource constrained here surprises me a little. I can see
> that for Debian, but I'm surprised that Ubuntu is affected as well.
> Especially as you'd think that this could also be done within virtualization
> - the evaluation here is mostly around running the compiler and checking its
> errors, not so much about running tests accurately on real hardware.

All Ubuntu builds are virtualized.  For most architectures that's
OpenStack VMs on real hardware of the appropriate architecture; riscv64
builds currently run in emulated VMs on x86 hardware.

I have graphs handy, so I can say that most Ubuntu architectures
complete a full rebuild much more quickly than Steve indicated.  The
last time we did this, we started four parallel full rebuilds on most
architectures, and two on riscv64.  After these were started, the build
queues cleared on amd64 in about three days, ppc64el in five,
arm64+armhf (which share builders) in about seven, and s390x in about
nine.  That's including other normal activity on the same builders at
the same time.  Some of these (especially s390x) would have been much
faster, except that unreliabilities in the inter-build VM reset
mechanism caused failures and meant we weren't using anything like our
full build farm capacity.  That's usually not much of a problem in
practice, but full rebuilds tend to involve dispatching lots of small
builds, which stresses that mechanism more than usual.  We were also
running this over the end-of-year holidays, when not many people were
around to babysit things.

There are definitely various ways we can improve this further, which
aren't especially on-topic for debian-devel, but nevertheless this means
that on everything except riscv64 Ubuntu can do a single full rebuild in
a couple of days (with a bit of fuzz for the small number of multi-day
builds in the archive - this is just considering how long the build
queues take to drain).
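
As a rough back-of-the-envelope check on those figures, here is a
minimal sketch that divides the observed queue drain times by the number
of parallel rebuilds.  It assumes drain time scales linearly with load,
which is a simplification: it ignores the other activity on the
builders, the VM reset failures, and the handful of multi-day builds.
The names are illustrative only.

```python
# Days for the build queues to drain, as reported above, paired with the
# number of full rebuilds running in parallel at the time.
observed = {
    "amd64": (3, 4),
    "ppc64el": (5, 4),
    "arm64+armhf": (7, 4),
    "s390x": (9, 4),
}


def single_rebuild_days(days, parallel_rebuilds):
    """Naive estimate, assuming drain time scales linearly with load."""
    return days / parallel_rebuilds


estimates = {
    arch: single_rebuild_days(days, n) for arch, (days, n) in observed.items()
}

for arch, est in estimates.items():
    print(f"{arch}: roughly {est:.2f} days for one full rebuild")
```

Under that assumption every architecture listed lands somewhere between
under a day and a little over two days, which is in line with the
"couple of days" figure.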

The very long pole in the tent, though, is the emulated builds on
riscv64, which did indeed take rather more than a month to clear their
build queues last time, even though riscv64 was only running two full
rebuilds rather than four.  I don't think we're going to be able to get
real hardware with the hypervisor extension particularly soon, but we
may be able to throw some more x86 hardware at it soonish to mitigate
the problem.

Colin Watson (he/him)                              [cjwatson@debian.org]
