How DACA works (was: Re: [daca] http://qa.debian.org/daca is 404)

To: Stefano Zacchiroli <zack@debian.org>
Cc: debian-qa@lists.debian.org
Subject: How DACA works (was: Re: [daca] http://qa.debian.org/daca is 404)
From: Raphael Geissert <geissert@debian.org>
Date: Fri, 24 Jun 2011 16:22:53 -0500
Message-id: <201106241622.55322.geissert@debian.org>
In-reply-to: <20110612163811.GB25192@msgid.df7cb.de>
References: <20110612121715.GA20723@upsilon.cc> <BANLkTin20+X8mNu7_KHuDCAQtecXDv-QGw@mail.gmail.com> <20110612163811.GB25192@msgid.df7cb.de>

Hi,

On Sunday 12 June 2011 11:38:11 Christoph Berg wrote:
> Re: Paul Wise 2011-06-12:
> > http://qa.debian.org/~geissert/daca/
> > 
> > Perhaps it should be moved into quantz:/srv/qa.debian.org/?

Yes. However, I think it would be better if a 'daca' UNIX group existed so that 
people don't necessarily need to be part of 'qa' to work on DACA.
(Oh, and a 'daca-worker' user, so that the tools don't run with some DD's 
privileges.)

> 
> Definitely. Does anyone besides Raphael know how DACA works? Reading
> the web page, there's hardware somewhere powering it.

The best documentation so far is the code itself, but here's an excerpt of an 
IRC conversation from a while ago in #debian-qa that explains some parts of 
DACA. Note that the current design is very limited and doesn't scale. I 
already have plans for a better, scalable design, but I need time to (at 
least) document and implement it.

<raphael> phil: for source packages there are two ways it operates (depends on 
whether it is the local mode, or distributed mode.) On the former mode, it 
generates a list of paths to .dsc files out of a Sources file and then goes one 
by one adding a lock, unpacking the source package, running the tool, storing 
result, unlocking; and again and again
<phil> Storing it where?
<raphael> on the distributed mode the master does basically the same as in the 
local mode except that it doesn't run the tool at all. It sends an HTTP query 
to a CGI on the worker host, and over and over until it reaches the max limit 
of jobs it should send. Then it sleeps a bit and queries to remote host in the 
hope that the checking is done
<raphael> the distributed worker only receives the dsc's file name and guesses 
the path to it and then unpacks, checks, stores result, and that's it
<raphael> phil: the results are stored as a file in a directory
<raphael> the same directory is used for locking the checks
<weasel> are the results large/huge?  or could they be transferred after the 
job finishes?
<raphael> say you have a dir called results/ and when locking it adds a 
$dsc_name.lock symlink to $PID, and another one to just $dsc_name. When 
storing the result it unlinks $dsc_name, replaces it, and finally it unlinks 
$dsc_name.lock
<raphael> weasel: cppcheck's and checkbashisms' are very small (in most cases 
1KB)
<raphael> but in the case of clang the output is more verbose
<weasel> how does it compare to say build logs?  way larger/about the same?
<raphael> weasel: it depends on what format we want to store the results. But 
I'd say that worst case is just like a build log
<weasel> ok, so it's still usually smallish
<phil> raphael, weasel: For added fun it needs unstable chroots for the tools, 
I presume?
<raphael> phil: if it is going to build something, yes. Tools like cppcheck 
and checkbashisms only need a mirror at hand
...
<raphael> weasel: atm the master only sends .dsc file names (and the CGI only 
stores the request.) With a few tweaks I could make it run tools other than 
cppcheck
<raphael> (the jobs are started on the worker by inoticoming)
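
The local mode and the results-directory locking described in the excerpt 
above can be sketched roughly as follows. This is only an illustration, not 
DACA's actual code: the function names, the `check` callable (standing in for 
"unpack the source package and run the tool") and the choice to skip packages 
that already have a result are all assumptions. It relies on the Directory 
field and the file lists found in a Debian Sources index, and on symlink 
creation being atomic.

```python
import os


def dsc_paths_from_sources(text):
    """List mirror-relative .dsc paths from the text of a Sources index."""
    paths = []
    for stanza in text.strip().split("\n\n"):
        directory = dsc = None
        for line in stanza.splitlines():
            if line.startswith("Directory:"):
                directory = line.split(":", 1)[1].strip()
            elif line.startswith(" ") and line.strip().endswith(".dsc"):
                # Continuation line of a file list: "<hash> <size> <name>".
                dsc = line.split()[-1]
        if directory and dsc:
            paths.append(f"{directory}/{dsc}")
    return paths


def try_lock(results_dir, dsc_name):
    """Take the lock for one .dsc; return False if it cannot be taken."""
    lock = os.path.join(results_dir, dsc_name + ".lock")
    result = os.path.join(results_dir, dsc_name)
    try:
        # $dsc_name.lock -> $PID; fails if another process holds the lock.
        os.symlink(str(os.getpid()), lock)
    except FileExistsError:
        return False
    if os.path.lexists(result):
        # Assumption: an existing entry means "already checked"; skip it.
        os.unlink(lock)
        return False
    # Placeholder: $dsc_name -> $PID until the real result replaces it.
    os.symlink(str(os.getpid()), result)
    return True


def store_result(results_dir, dsc_name, output):
    """Replace the placeholder with the result, then drop the lock."""
    result = os.path.join(results_dir, dsc_name)
    os.unlink(result)  # remove the placeholder symlink first
    with open(result, "w") as fh:
        fh.write(output)
    os.unlink(os.path.join(results_dir, dsc_name + ".lock"))


def run_local(dsc_paths, results_dir, check):
    """Go through the list one by one: lock, check, store, unlock."""
    for path in dsc_paths:
        name = os.path.basename(path)
        if try_lock(results_dir, name):
            store_result(results_dir, name, check(path))
```

Because the lock is a symlink whose target is the PID, a stale lock can be 
inspected with readlink to see which process left it behind.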

Now, the current distributed model was designed with ravel's limitations in 
mind: it can only serve static pages, no dynamic stuff. The worker requires a 
CGI-capable httpd.
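
For illustration, the distributed mode could look roughly like the sketch 
below. It is not the real implementation: the `dsc` query parameter, the 
`submit` and `is_done` callables, the default limits and the assumption that 
every package lives in the `main` component are all hypothetical. The path 
guess follows the usual Debian pool layout (first letter of the source name, 
or the first four characters for `lib*` packages).

```python
import time
import urllib.parse
import urllib.request


def guess_pool_path(dsc_name, component="main"):
    """Worker side: guess the pool path from nothing but the .dsc name."""
    source = dsc_name.split("_", 1)[0]
    prefix = source[:4] if source.startswith("lib") else source[:1]
    return f"pool/{component}/{prefix}/{source}/{dsc_name}"


def submit_http(cgi_url, dsc_name):
    """Master side: hand one .dsc file name to the worker's CGI."""
    # "dsc" is a made-up parameter name, used only for this sketch.
    query = urllib.parse.urlencode({"dsc": dsc_name})
    with urllib.request.urlopen(f"{cgi_url}?{query}") as resp:
        return resp.status == 200


def run_master(dsc_names, submit, is_done, max_jobs=10,
               poll_seconds=60, sleep=time.sleep):
    """Send jobs up to max_jobs, sleep, then poll for finished checks."""
    queue = list(dsc_names)
    pending = set()
    finished = []
    while queue or pending:
        # Keep sending requests until the job limit is reached.
        while queue and len(pending) < max_jobs:
            name = queue.pop(0)
            submit(name)
            pending.add(name)
        # Sleep a bit, then ask whether any of the checks are done.
        sleep(poll_seconds)
        done = {name for name in pending if is_done(name)}
        finished.extend(sorted(done))
        pending -= done
    return finished
```

Passing `submit` and `is_done` in as callables keeps the loop independent of 
how the worker is reached, e.g. 
`run_master(names, lambda n: submit_http(url, n), is_done)`.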

The website side of the story (ugly and hackish):
Once the results have accumulated on ravel, a cron job on quantz uses wget -N 
to download everything from ravel's results dirs, which are available via 
~geissert/...
Once on quantz, a makefile is executed to generate the web version of the 
reports, and then the static index.html page for every tool is generated by 
requesting each dir's index.php via HTTP.
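
The last step, turning each dynamic index.php into a static index.html, could 
be sketched as below. This only illustrates the idea; the real thing is a 
makefile, and the function names, the `base_url` parameter and the directory 
layout used here are assumptions.

```python
import os
import urllib.request


def fetch_http(url):
    """Return the body served for url."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def freeze_indexes(web_root, base_url, tools, fetch=fetch_http):
    """Save the output of <tool>/index.php as a static <tool>/index.html."""
    written = []
    for tool in tools:
        html = fetch(f"{base_url}/{tool}/index.php")
        tool_dir = os.path.join(web_root, tool)
        os.makedirs(tool_dir, exist_ok=True)
        target = os.path.join(tool_dir, "index.html")
        # Write to a temporary name and rename, so that visitors never
        # see a half-written page.
        tmp = target + ".tmp"
        with open(tmp, "wb") as fh:
            fh.write(html)
        os.replace(tmp, target)
        written.append(target)
    return written
```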

That's as far as it gets right now (and everything should be rewritten from 
scratch.) DACA, in its current incarnation, is based on a series of scripts I 
quickly wrote to run a few of the tools and see what the results were.

HTH.

Cheers,
-- 
Raphael Geissert - Debian Developer
www.debian.org - get.debian.net
