Redesigning the autopkgtest controller/workers CI system for Ubuntu and Debian
Antonio: Developer of http://ci.debian.net/ on the Debian side
Evan, Andy, Vincent: Ubuntu Continuous Integration team
Jean-Baptiste, Martin: Ubuntu QA team, autopkgtest maintainers (tests
I've recently talked to you about redesigning our system to run
autopkgtests. Contacting you via PM as most of you aren't on
autopkgtest-devel@ , but if you are interested in the topic please
consider subscribing. It's very low-traffic.
Status quo in Ubuntu
We have used a set of scripts around britney for getting a request
like "please test this set of packages in -proposed" from britney to
Jenkins. Jenkins then starts these jobs on a set of manually
configured jenkins slaves (worker machines), which run the tests and
send the log files back to Jenkins. The britney scripts then copy
those around, munge them into a common history, and do the go/no-go
decision.
This involves the maintenance of two Jenkins instances, which is quite
involved (it falls over very often, needs a lot of complicated
configuration, manual intervention, has lots of dependencies, eternal
trouble with plugins, etc.). As it stands, this is a rather
significant single point of failure.
It also involves rsyncing state files between multiple hosts (which
also fails at times), and generally doesn't scale easily (as adding a
new runner or a new architecture involves reconfiguring jobs).
For actually running the tests on x86 we currently use the
lp:auto-package-testing scripts, which run QEMU, and autopkgtest
inside with the adt-virt-null runner. This is unfortunately rather
simplistic, isn't able to run test cases with "breaks-testbed" and
doesn't provide a minimal environment. It also means that the actual
VMs are nontrivial to set up and use, as you have to run adt-run
inside them and care about copying the logs back and forth, etc.
For displaying test results and logs/artifacts to developers we just
use the Jenkins UI. This is reasonably functional, although a bit too
complex and sometimes confusing for quickly finding the actual logs
you are interested in. But by now people should have figured it out,
so this is not really urgent.
Status quo in Debian
http://ci.debian.net is still fairly young, and at the moment pretty
much a one-man show from Antonio. At this point I want to say a big
thank you to Antonio for his great work there! It's really nice to get
regular autopkgtest runs into Debian itself, and thus get more
attention to them from the developers (e. g. it's shown in DDPO now). It
is also a goal in Debian to eventually gate unstable → testing with
Antonio, correct me if I'm wrong, but the current debci setup is
rather simple: Everything runs on just one machine, just for amd64,
and this uses adt-virt-schroot for everything. So this doesn't scale,
and also doesn't extend to other architectures.
debci has a lean homegrown UI which makes it straightforward to get to
logs of individual package versions. It also provides machine-readable
JSON files for everything. It'll need to be extended to
cover multiple architectures at least, but that doesn't seem too
Suggested new architecture
So this obviously needs some cleanup and robustification. We want to
be able to scale effortlessly in both the numbers of workers we have
(even dynamically add and remove them according to the current load),
as well as support new architectures to run tests on. We also want to
robustify the communication between britney, the autopkgtest
controller, and the test execution workers.
So our CI team proposed to build upon the "new world technologies"
OpenStack's swift (for distributed/redundant network data storage),
and RabbitMQ (for distributed task queue management). Both of these
technologies were new to me, but I played around with them a bit and I
am convinced that these are much more robust, leaner, and easier to
use than what we have now.
rabbitmq-server is delightfully simple to set up (apt-get install,
that's it), and I recently figured out how to locally set up swift.
rabbitmq would replace the state file rsyncing and Jenkins' job
control, as well as having to run jenkins-slave on the workers (which
is quite heavy in terms of dependencies). Workers would store the
logs/artifacts into swift, and the web UI would read and present these
So we could have the following new architecture:
* One host for the "controller" which runs rabbitmq-server. This can
be (but doesn't need to be) the same server that britney is already
britney sends a "test test mypkg_1.1" request to the
autopkgtest_amd64/autopkgtest_i386/autopkgtest_armhf etc. rabbit
* A dynamic set of worker nodes which do the test execution. They
read from the autopkgtest_* queue which they are capable of
processing, run the test, and store the results in swift. This
should have a predictable directory hierarchy, like
/autopkgtest/trusty/armhf/foopkg/1.2.3-4/, so that we can avoid
having to send back a result pointer.
* A swift installation, providing sufficient storage space and
redundancy. We already have one for CI/QA in Ubuntu, and we'll need
to set up one for Debian (that's the only bit that actually
requires some thought and knowledge).
* Each time britney runs, it checks whether there's a result for the
package it requested a test for in swift. That's much better than
reading a "results" rabbit queue, as it is resilient against race
failures/interruptions in britney (i. e. you can read test results
not just once) and generally plays better with the stateless
architecture of britney.
* We extend debci (i. e. http://ci.debian.net) for multiple
architectures and perhaps other missing things, and move to that as
a developer-facing frontend for showing artifacts and results. For
the Ubuntu CI dashboard etc. we could just read the files from
swift directly, or read the aggregated debci .json files.
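
To make the queue naming and the "no result pointer" idea concrete, here
is a minimal sketch in Python. All function names are hypothetical, a
plain set of object names stands in for a swift container listing, and
the "log" file name is an assumption; none of this is britney, debci, or
swift API.

```python
# Hypothetical helpers for illustration only; the names and the
# object-listing interface are assumptions, not existing APIs.

def queue_name(arch):
    """Per-architecture request queue, following the autopkgtest_<arch> pattern."""
    return "autopkgtest_" + arch


def result_prefix(release, arch, package, version):
    """Predictable result location, e.g. /autopkgtest/trusty/armhf/foopkg/1.2.3-4/"""
    return "/autopkgtest/{}/{}/{}/{}/".format(release, arch, package, version)


def has_result(stored_names, release, arch, package, version):
    """True if any stored object lives under the expected prefix."""
    prefix = result_prefix(release, arch, package, version)
    return any(name.startswith(prefix) for name in stored_names)


def still_pending(requests, stored_names):
    """Requests without a result yet; can be re-evaluated on every britney run."""
    return [r for r in requests if not has_result(stored_names, *r)]


# A set of object names stands in for a swift container listing;
# the "log" object name is an assumption.
stored = {"/autopkgtest/trusty/armhf/foopkg/1.2.3-4/log"}
requests = [
    ("trusty", "armhf", "foopkg", "1.2.3-4"),
    ("trusty", "amd64", "foopkg", "1.2.3-4"),
]
pending = still_pending(requests, stored)
```

Because the lookup depends only on the request itself, it can be repeated
after an interrupted run without losing anything, which is the property
the results-in-swift approach is meant to give.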
I have some throwaway scripts and 3 containers (swift, adt
controller, adt slave) to evaluate how rabbitmq and swift work and how
to use them from Python to glue all the components together. This is
just for learning, but it shows that these APIs are quite pleasant and
simple, and at the same time robust.
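
For readers who want a feel for the worker side, here is a rough,
self-contained sketch. It is not one of the throwaway scripts mentioned
above: Python's in-process queue.Queue stands in for a rabbitmq queue, a
dict stands in for swift, and run_test, work, and the "log"/"exitcode"
object names are all hypothetical.

```python
import queue


def run_test(package, version):
    """Placeholder for actually running the package's autopkgtest (assumption)."""
    return {"exitcode": 0, "log": "ran tests for {} {}\n".format(package, version)}


def work(requests, store, release, arch, runner=run_test):
    """Drain one architecture queue, storing each result under a predictable key."""
    handled = 0
    while True:
        try:
            package, version = requests.get_nowait()
        except queue.Empty:
            return handled
        result = runner(package, version)
        prefix = "/autopkgtest/{}/{}/{}/{}/".format(release, arch, package, version)
        store[prefix + "log"] = result["log"]
        store[prefix + "exitcode"] = str(result["exitcode"])
        # Mark the request as done only after the result has been stored.
        requests.task_done()
        handled += 1


amd64_requests = queue.Queue()  # stands in for the autopkgtest_amd64 queue
amd64_requests.put(("mypkg", "1.1"))
store = {}                      # stands in for swift
handled = work(amd64_requests, store, "trusty", "amd64")
```

With a real broker the same ordering (store first, acknowledge second)
would presumably let an unfinished request be handed to another worker
if this one dies halfway, but that is a design assumption to verify, not
something this sketch demonstrates.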
I'd like to know from all of you what you think about this redesign,
whether you think it's sound, whether you already thought about/worked
on this problem, and what is missing from this.
I'm quite happy to work on the implementation (now that the basic
building blocks are all there this actually shouldn't take long), but
I'd really like us to get to a common agreement on how to design this for
Debian and Ubuntu.
If you think it's helpful, we can also organize a Google Hangout and
talk face to face sometime soon?
Thanks in advance!
britney is the gatekeeper for packages in -proposed; it checks that a
package builds everywhere, is installable, and that the package's and its
reverse dependencies' autopkgtests are all fine, and then promotes the
package to the release. For Debian, replace -proposed with "unstable"
and release with "testing".
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)