Re: Buildd backlog and testing transition.
Charles Plessy wrote:
> On Fri, Feb 29, 2008 at 10:40:57AM +0100, Marc 'HE' Brockschmidt wrote:
> >
> > Due to kernel problems, the mips* buildds haven't been very reliable in
> > the past few weeks, creating a loooong backlog of packages that need to
> > be built. As there seems to be a workaround for the kernel bug, this
> > should start getting better from the weekend on. As a maintainer: Just
> > wait.
>
> Dear Marc,
>
> it is good news to read that there is a solution being found. However, I
> am a bit confused because previous messages were suggesting that the
> problem was disk speed, not downtime.
The problem is a combination of:
1) Not enough RAM (only 512 MB) in some machines, which causes an
increasing number of package builds to use swap, and some of them
to eventually fail to build because of a timeout.
2) Slow on-board PIO IDE, which the firmware can boot from.
3) A kernel-imposed limit of 1 GB of RAM when PCI DMA devices (like a
SATA disk controller) are used.
4) A kernel bug in the cache coherency management which hits PIO IDE,
and causes instability since kernel 2.6.18. Up to then, the problem
was mostly papered over by an excessive amount of cache flushing in
the kernel code. This problem went unnoticed upstream since PIO IDE
is these days only used on very small/cheap systems, where a
different code path is used.
Each of those points costs a chunk of performance and makes the
buildds less reliable. The current state is:
4) I tracked this bug down (which was very hard) and wrote a kernel
patch which is awaiting upstream review. The code is hairy enough
that I don't know yet if it is a proper solution or only a workaround.
That said, it works fine on my machine (which is the same model as
the buildd hardware).
3) This was supposedly fixed in kernel 2.6.22+, and works fine on the
successor model of the hardware. It still fails on the buildd
hardware, however, so the current choice is between 1 GB of RAM with
fast I/O or more RAM with slow I/O. I am working on fixing this bug.
2) The obvious solution is to add SATA disks to the buildds; this is
currently in the works.
1) Upgrades to 1-2 GB RAM are also being worked on (or are already
done).
I expect a properly running machine of this type to be capable of
building ~5% of the unstable archive per day. IOW, the current backlog
should be handled soon.
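As a rough back-of-the-envelope sketch of that estimate: only the ~5%
per day figure comes from the text above. The function name, the
archive size, the backlog size and the upload rate are made-up
placeholders for illustration, not measured figures.

```python
import math


def days_to_clear(backlog, archive_size, daily_fraction=0.05, new_per_day=0):
    """Estimate the whole days one buildd needs to work off a backlog.

    backlog        -- packages currently waiting to be built (hypothetical)
    archive_size   -- source packages in unstable (hypothetical)
    daily_fraction -- share of the archive built per day (~5% per the text)
    new_per_day    -- fresh uploads arriving per day (hypothetical)
    """
    built_per_day = archive_size * daily_fraction
    net_per_day = built_per_day - new_per_day
    if net_per_day <= 0:
        # Uploads arrive at least as fast as they are built:
        # the backlog never shrinks.
        return math.inf
    return math.ceil(backlog / net_per_day)


if __name__ == "__main__":
    # Placeholder numbers only, chosen to show how the estimate behaves.
    print(days_to_clear(backlog=3000, archive_size=20000))
    print(days_to_clear(backlog=3000, archive_size=20000, new_per_day=500))
```

The second call shows why the upload rate matters: the backlog only
shrinks by the difference between what is built and what arrives.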
Thiemo