How DACA works (was: Re: [daca] http://qa.debian.org/daca is 404)
Hi,
On Sunday 12 June 2011 11:38:11 Christoph Berg wrote:
> Re: Paul Wise 2011-06-12:
> > http://qa.debian.org/~geissert/daca/
> >
> > Perhaps it should be moved into quantz:/srv/qa.debian.org/?
Yes. However, I think it would be better if a 'daca' UNIX group existed so that
people don't necessarily need to be part of 'qa' to work on DACA
(oh, and a 'daca-worker' user so that the tools don't run with some DD's
privileges).
>
> Definitely. Does anyone besides Raphael know how DACA works? Reading
> the web page, there's hardware somewhere powering it.
The best documentation so far is the code itself, but here's an excerpt of an
irc conversation a while ago in #debian-qa that explains some parts of DACA.
Note the current design is very limited and doesn't scale. I already have
plans for a better design that is scalable, but I need time to (at least)
document and implement it.
<raphael> phil: for source packages there are two ways it operates (depends on
whether it is the local mode, or distributed mode.) On the former mode, it
generates a list of paths to .dsc files out of a Sources file and then goes one
by one adding a lock, unpacking the source package, running the tool, storing
result, unlocking; and again and again
<phil> Storing it where?
<raphael> on the distributed mode the master does basically the same as in the
local mode except that it doesn't run the tool at all. It sends an HTTP query
to a CGI on the worker host, and over and over until it reaches the max limit
of jobs it should send. Then it sleeps a bit and queries to remote host in the
hope that the checking is done
<raphael> the distributed worker only receives the dsc's file name and guesses
the path to it and then unpacks, checks, stores result, and that's it
<raphael> phil: the results are stored as a file in a directory
<raphael> the same directory is used for locking the checks
<weasel> are the results large/huge? or could they be transferred after the
job finishes?
<raphael> say you have a dir called results/ and when locking it adds a
$dsc_name.lock symlink to $PID, and another one to just $dsc_name. When
storing the result it unlinks $dsc_name, replaces it, and finally it unlinks
$dsc_name.lock
<raphael> weasel: cppcheck's and checkbashisms' are very small (in most cases
1KB)
<raphael> but in the case of clang the output is more verbose
<weasel> how does it compare to say build logs? way larger/about the same?
<raphael> weasel: it depends on what format we want to store the results. But
I'd say that worst case is just like a build log
<weasel> ok, so it's still usually smallish
<phil> raphael, weasel: For added fun it needs unstable chroots for the tools,
I presume?
<raphael> phil: if it is going to build something, yes. Tools like cppcheck
and checkbashisms only need a mirror at hand
...
<raphael> weasel: atm the master only sends .dsc file names (and the CGI only
stores the request.) With a few tweaks I could make it run tools other than
cppcheck
<raphael> (the jobs are started on the worker by inoticoming)
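To make the local-mode loop and the locking scheme from the log above more
concrete, here is a minimal Python sketch. It is an illustration, not DACA's
actual code: the function names, the run_tool callable (standing in for
"unpack the source package and run the tool") and the choice to skip entries
that are already locked are all assumptions. The log is read here as meaning
that both $dsc_name.lock and $dsc_name are symlinks pointing at the PID.

```python
import os
from pathlib import Path


def lock(results_dir: Path, dsc_name: str) -> bool:
    """Take the lock for one .dsc; return False if it is already locked."""
    pid = str(os.getpid())
    try:
        # Creating a symlink fails if the name exists, so this acts as the lock.
        os.symlink(pid, results_dir / f"{dsc_name}.lock")
    except FileExistsError:
        return False
    placeholder = results_dir / dsc_name
    # Assumption: an older result or stale placeholder is simply replaced.
    if placeholder.is_symlink() or placeholder.exists():
        placeholder.unlink()
    os.symlink(pid, placeholder)
    return True


def store_result(results_dir: Path, dsc_name: str, output: str) -> None:
    """Replace the placeholder with the real result, then release the lock."""
    result = results_dir / dsc_name
    result.unlink()              # drop the placeholder symlink
    result.write_text(output)    # store the tool's output as a plain file
    (results_dir / f"{dsc_name}.lock").unlink()


def check_all(dsc_paths, results_dir: Path, run_tool) -> None:
    """Go through the .dsc paths one by one: lock, check, store, unlock."""
    for dsc in dsc_paths:
        name = Path(dsc).name
        if not lock(results_dir, name):
            continue  # assumption: someone else is already checking it
        store_result(results_dir, name, run_tool(dsc))
```

Using one directory for both results and locks means a listing of results/
shows at a glance what is finished (plain files) and what is in progress
(symlinks).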
Now, the current distributed model was designed with ravel's limitations in
mind: it can only serve static pages, no dynamic stuff. The worker requires a
CGI-capable httpd.
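As an illustration of the master side in distributed mode, the following
Python sketch sends .dsc file names to a worker until a job limit is reached,
then sleeps and polls. It is only a sketch under stated assumptions: the CGI
URL, the "dsc" query parameter, and the is_done check are hypothetical, since
how the master actually detects a finished check is not described above.

```python
import time
import urllib.parse
import urllib.request


def http_submit(cgi_url: str, dsc_name: str) -> None:
    """Send one .dsc file name to the worker's CGI (URL and parameter are hypothetical)."""
    query = urllib.parse.urlencode({"dsc": dsc_name})
    with urllib.request.urlopen(f"{cgi_url}?{query}") as resp:
        resp.read()


def run_master(dsc_names, submit, is_done, max_jobs, poll_interval=60.0,
               sleep=time.sleep):
    """Keep at most max_jobs checks outstanding on the worker; return finished names."""
    pending = list(dsc_names)
    in_flight = set()
    finished = []
    while pending or in_flight:
        # Send requests until the job limit is reached.
        while pending and len(in_flight) < max_jobs:
            name = pending.pop(0)
            submit(name)
            in_flight.add(name)
        # Sleep a bit, then ask whether the outstanding checks are done.
        sleep(poll_interval)
        for name in sorted(in_flight):
            if is_done(name):
                in_flight.discard(name)
                finished.append(name)
    return finished
```

A caller could pass submit=lambda name: http_submit(worker_cgi_url, name),
where worker_cgi_url is whatever address the worker's CGI is reachable at.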
The website side of the story (ugly and hackish):
Once the results have accumulated on ravel, a cronjob on quantz uses wget -N
to download everything from ravel's results dirs, which are available via
~geissert/...
Once in quantz, a makefile is executed to generate the web version of the
reports and then the static index.html pages for every tool by calling each
dir's index.php via HTTP.
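The last step (turning each tool's index.php into a static index.html by
requesting it over HTTP) could look roughly like the Python sketch below. The
real setup uses a makefile; the directory layout, base_url and the function
names here are assumptions made for the example.

```python
import urllib.request
from pathlib import Path


def http_fetch(url: str) -> bytes:
    """Fetch a URL and return the response body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def render_static_indexes(site_root: Path, base_url: str, fetch=http_fetch):
    """For every tool dir that has an index.php, save its rendered output as index.html."""
    written = []
    for tool_dir in sorted(p for p in site_root.iterdir() if p.is_dir()):
        if not (tool_dir / "index.php").is_file():
            continue  # not a tool directory with a dynamic index
        html = fetch(f"{base_url}/{tool_dir.name}/index.php")
        target = tool_dir / "index.html"
        target.write_bytes(html)
        written.append(target)
    return written
```

Requesting the page over HTTP lets the web server run the PHP, so the saved
index.html is exactly what a visitor would have been served.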
That's as far as it gets right now (and everything should be rewritten from
scratch.) DACA, in its current incarnation, is based on a series of scripts I
quickly wrote to run a few of the tools and see what the results were.
HTH.
Cheers,
--
Raphael Geissert - Debian Developer
www.debian.org - get.debian.net