
Re: Buildd backlog and testing transition.



Charles Plessy wrote:
> On Fri, Feb 29, 2008 at 10:40:57AM +0100, Marc 'HE' Brockschmidt wrote:
> > 
> > Due to kernel problems, the mips* buildds haven't been very reliable in
> > the past few weeks, creating a loooong backlog of packages that need to
> > be built. As there seems to be a workaround for the kernel bug, this
> > should start getting better from the weekend on. As a maintainer: Just
> > wait.
> 
> Dear Marc,
> 
> it is good news to read that there is a solution being found. However, I
> am a bit confused because previous messages were suggesting that the
> problem was disk speed, not downtime.

The problem is a combination of
1) Not enough RAM (only 512 MB) in some machines, which causes an
   increasing number of package builds to use swap, and some of them
   to eventually fail to build because of a timeout.
2) Slow on-board PIO IDE, which the firmware can boot from.
3) A kernel-imposed limit of 1 GB of RAM when PCI DMA devices (like a
   SATA disk controller) are used.
4) A kernel bug in the cache coherency management which hits PIO IDE,
   and causes instability since kernel 2.6.18. Up to then, the problem
   was mostly papered over by an excessive amount of cache flushing in
   the kernel code. This problem went unnoticed upstream since PIO IDE
   is these days only used on very small/cheap systems, where a
   different code path is used.

Each of those points costs a chunk of performance and makes the
buildds less reliable. The current state is:

4) I tracked this bug down (which was very hard) and wrote a kernel
   patch which is awaiting upstream review. The code is hairy enough
   that I don't know yet whether it is a proper solution or only a
   workaround. That said, it works fine on my machine (which is the
   same model as the buildd hardware).

3) This was supposedly fixed in kernel 2.6.22+, and works fine on the
   successor model of the hardware. It still fails on the buildd
   hardware, however, so the current choice is between 1 GB of RAM
   with fast I/O and more RAM with slow I/O. I am working on fixing
   this bug.

2) The obvious solution is to add SATA disks to the buildds; this is
   currently in the works.

1) Upgrades to 1-2 GB RAM are also being worked on (or are already
   done).

For a properly running machine of this type, I expect it to be capable
of building ~5% of the unstable archive per day. IOW, the current
backlog should be handled soon.
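
As a back-of-the-envelope illustration of what the ~5% figure means, here
is a small sketch. Only the 5% per day comes from the estimate above; the
backlog size and the rate of new uploads are made-up placeholders, not
measured numbers, and days_to_clear is a hypothetical helper name.

```python
import math


def days_to_clear(backlog_pct, build_pct_per_day=5, inflow_pct_per_day=0):
    """Estimate the days a buildd needs to work off a backlog.

    All arguments are percentages of the unstable archive.
    build_pct_per_day defaults to the ~5% estimate; the other two
    values are assumptions supplied by the caller.
    """
    # Net progress per day: what gets built minus what newly arrives.
    net = build_pct_per_day - inflow_pct_per_day
    if net <= 0:
        raise ValueError("backlog never shrinks: inflow >= build rate")
    # Round up, since a partially used day still counts as a day.
    return math.ceil(backlog_pct / net)


if __name__ == "__main__":
    # Hypothetical scenario: 20% of the archive is waiting,
    # and 1% of the archive is newly uploaded each day.
    print(days_to_clear(20, inflow_pct_per_day=1))
```

The point is only that the backlog shrinks at the build rate minus the
upload rate, so a machine running at its full rate clears it in days
rather than weeks unless uploads are unusually heavy.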


Thiemo

