
Permanent lab: keep or drop it?



Hi,

While restructuring frontend/lintian (to implement an efficient
full-archive checker), I began to fully understand the use and intention
of the permanent lab.

The original idea behind it is that all information from a package is
gathered by the 'collection' scripts, put into the (permanent) lab, and
the checks are then run over it. The next time, there is no need to
unpack the package: the lab already has all the info.
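
To sketch that two-phase flow (purely illustrative Python; the lab
layout and the collector/check interfaces below are made up, not the
actual collection scripts):

    # Illustrative only: invented lab location and interfaces.
    import os

    LAB = "/var/lib/lintian/lab"        # assumed lab location

    def collect(package, unpack_dir, collectors):
        # phase 1, done once: every collection script stores its data in the lab
        pkg_dir = os.path.join(LAB, package)
        os.makedirs(pkg_dir, exist_ok=True)
        for name, collector in collectors.items():
            with open(os.path.join(pkg_dir, name), "w") as f:
                f.write(collector(unpack_dir))

    def check(package, checks):
        # phase 2, and every later run: checks only read from the lab
        pkg_dir = os.path.join(LAB, package)
        for run_check in checks:
            run_check(pkg_dir)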

An important assumption behind this is that usually only the checks are
improved, and not the collection scripts. With our current release
frequency of at most once a month, this assumption usually does not hold.

Currently, quite a few check scripts also require a full unpack to be
available. A lab built on these assumptions is also somewhat prone to
errors and bugs, which might lead to subtle differences.

Therefore, I want to propose dropping the permanent lab idea and having
a cache instead. The cache stores lintian's results, keyed by the md5sum
of the .dsc or .deb plus the lintian version, so that a repeated query
doesn't trigger a re-run. This way it is quite straightforward to
implement good and efficient checking of the whole archive: the md5sums
are already in the Packages and Sources files, so cache lookup is cheap.
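
In made-up Python (the cache directory, version string and the
run_checks callable are invented names, not existing lintian code), the
idea would look roughly like this:

    # Minimal sketch of the proposed cache; all names are illustrative.
    import json
    import os

    LINTIAN_VERSION = "1.23.6"          # assumed version string
    CACHE_DIR = "/var/cache/lintian"    # assumed cache location

    def cached_check(md5sum, package_path, run_checks):
        # Key on the md5sum of the .dsc/.deb plus the lintian version, so a
        # new lintian release automatically invalidates all old results.
        key = "%s_%s" % (md5sum, LINTIAN_VERSION)
        cache_file = os.path.join(CACHE_DIR, key + ".json")
        if os.path.exists(cache_file):
            with open(cache_file) as f:
                return json.load(f)     # repeated query: no re-run needed
        result = run_checks(package_path)
        os.makedirs(CACHE_DIR, exist_ok=True)
        with open(cache_file, "w") as f:
            json.dump(result, f)
        return result

A full-archive run would then read the md5sums out of Packages/Sources
and only invoke run_checks for entries missing from the cache.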

What do you think of it? The permanent lab as it currently is doesn't
serve its purpose anymore, and I don't think it's valuable to restore
that: I don't really think it's very useful to have this two-level
access (though it can be implemented, of course), it's tricky to
maintain, and it degrades performance for one-time tests (files that are
needed are copied and only then checked, etc.).

And, most importantly: it makes changing frontend/lintian in its current
design quite hard if you want to keep maintaining all that permanent lab
stuff...

--Jeroen

PS: to really improve lintian's speed, we should consider having lintian
check a .deb or source archive truly on the fly: read the (embedded or
not) tar, record the needed info for every file, run the tests on that,
etc. (or have checks register hooks that are called for every file
encountered). Then you skip the unpack stage entirely: extracting the
files to a temporary filesystem is the major thing making things slow...
But this is version 2 stuff :)
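
To make that a bit more concrete, a rough sketch in Python (lintian
itself is Perl; the hook registry and the example check are made up, and
it assumes the data tarball has already been located inside the .deb):

    # Illustrative only: stream the tar and call per-file hooks, no unpack.
    import tarfile

    file_hooks = []                     # checks register their hooks here

    def register_hook(func):
        file_hooks.append(func)
        return func

    @register_hook
    def report_setuid(member, fileobj):
        # example check: flag setuid files without extracting anything
        if member.mode & 0o4000:
            print("setuid file: %s" % member.name)

    def scan_tarball(path):
        # walk the tar member by member; nothing is written to disk
        with tarfile.open(path, "r|*") as tar:
            for member in tar:
                fileobj = tar.extractfile(member) if member.isfile() else None
                for hook in file_hooks:
                    hook(member, fileobj)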

-- 
Jeroen van Wolffelaar
Jeroen@wolffelaar.nl (also for Jabber & MSN; ICQ: 33944357)
http://Jeroen.A-Eskwadraat.nl


