Re: Lintian as a static analysis framework
On 2011-07-08 18:08, Russ Allbery wrote:
> Niels Thykier <niels@thykier.net> writes:
>
>> Occasionally I have found myself considering writing a tool to check
>> for something that would require (e.g.) a Packages file[1], only to stop
>> because I would need to (re-)write a package extractor, a dsc or d/control
>> parser, two of Lintian's collections and then my analysis code on top of
>> that.
>> Of course, with my knowledge of Lintian's internals I could still cobble
>> together some code that would allow me to re-use (the parts of) Lintian
>> (I need), but as implied, it would not be pretty.
>
> Yes, it would be really nice to provide a good API to that sort of thing
> and make it available beyond Lintian. There's a lot of generic work
> that's much more broadly useful.
>
:)
>> A(nother) personal thorn in my side would be #262783; in my ideal world
>> this ought to be fairly simple, but in reality it is very complex. Mostly
>> because the frontend actually glues the Lab together with the
>> collections and the checks. The frontend also peeks deep into the
>> internal parts of the lab structure (to test for and mark collections as
>> finished). To me this is a sign that our backend needs more work.
>
> Currently, Lintian assumes that all relevant information is gathered into
> the lab before doing anything, and then the check scripts generally assume
> they can access the lab directly. Essentially, the lab becomes the "ABI"
> exposed to the checks. One of the goals of Lintian::Collect when I
> started writing that framework was to have it serve as the indirection
> layer used to access the lab. In other words, only Lintian::Collect and
> the parts of Lintian that generate the lab would know about the structure
> of the lab, and all check scripts would only access Lintian::Collect
> interfaces. That would allow us to swap out any lab structure without
> changing any of the checks, or support multiple different lab structures
> (such as an unbuilt source package tree).
>
So perhaps it is time to have a look at how the Lintian::Collect
migration is going. I manually went through each of our checks to
estimate how we are doing. The result: 9 are completely converted to
Lintian::Collect, 13 still use debfiles, control and/or unpacked, and
the remaining 17 access more than that.
I decided to list the checks that only use debfiles, control and
unpacked separately, because use of these directories is rather common.
For the remaining checks I listed which parts of the Lab (that I noticed)
they currently access directly.
Does not access anything from the lab directly (9):
version-substvars, symlinks, standards-version, java, huge-usr-share,
filename-length, duplicate-files, description, circular-deps
Uses (only) the debfiles, control and/or unpacked dirs (13):
watch-file, shared-libs[3], rules[1], nmu, manpages,
info-files, files, etcfiles, debian-source-dir, debhelper,
debconf, control-file, conffiles
Uses other parts of the lab (17):
scripts (unpacked, control-scripts)
po-debconf (diffstat, debfiles)
patch-systems[1] (diffstat, unpacked, debfiles)
ocaml (ar-info)
menus[2][3] (debfiles, doc-base, control)
menu-format (menu, unpacked)
- As unpacked is present, there is little point in using
menu-files.
md5sums (control, md5sums)
- Lintian::Collect already offers md5sums, but as I recall it
is for the package control file, not the one we generate.
init.d (init.d, control)
fields (fields)
debian-readme (README.Debian)
deb-format (*-errors, deb)
cruft (*-errors, diffstat, debfiles, unpacked)
- Technically, debfiles is redundant if unpacked is present.
copyright-file (source, copyright)
- The source part can be replaced with a look in $group for
the source or the package in question.
control-files (control-index)
changes-file (changes)
- Accesses files outside the lab (that is, unlike the changes file
itself, they are not copied or symlinked into the lab)
changelog-file[3] (NEWS.Debian, changelog)
binaries (strings)
[1] Uses other parts of the lab to check that it has been run in the
correct dir (e.g. something like "-d 'fields' or die").
- these are not listed or considered a part of the usage.
[2] Uses (requires) checks/common_data.pm
[3] Looks like it has its own symlink/path resolver
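As a rough illustration of how the three groups are drawn, here is a
small sketch (in Python rather than Perl, purely for brevity). The
function names and group labels are made up for this mail, and SAMPLE
only holds a handful of the checks listed above. The middle group has no
entry in SAMPLE because I did not record which of the three directories
each of those checks uses.

```python
# Illustrative only: none of these names exist in Lintian.
# The three directories that get their own group in the overview.
COMMON_DIRS = {"debfiles", "control", "unpacked"}


def classify(parts):
    """Return the migration group for a check, given the lab parts it
    accesses directly."""
    parts = set(parts)
    if not parts:
        return "converted"          # only goes through Lintian::Collect
    if parts <= COMMON_DIRS:
        return "common-dirs-only"   # debfiles, control and/or unpacked
    return "other-lab-parts"        # anything beyond those three


def tally(usage):
    """Count how many checks fall into each group."""
    counts = {"converted": 0, "common-dirs-only": 0, "other-lab-parts": 0}
    for parts in usage.values():
        counts[classify(parts)] += 1
    return counts


# A few entries copied from the lists above (not the full set of 39).
SAMPLE = {
    "symlinks": [],
    "java": [],
    "ocaml": ["ar-info"],
    "menu-format": ["menu", "unpacked"],
    "md5sums": ["control", "md5sums"],
    "cruft": ["*-errors", "diffstat", "debfiles", "unpacked"],
}

if __name__ == "__main__":
    print(tally(SAMPLE))
```

Note that a check such as md5sums lands in the last group even though it
uses control, because it also touches a part outside the three common
directories.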
A second benefit of doing a complete Lintian::Collect migration is that
our checks will no longer need the "chdir"; it also makes the given parts
available for cross-package checks.
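To make the indirection idea a bit more concrete, here is a toy sketch
(again in Python, not Perl). Everything in it is hypothetical: DirLab,
MemoryLab, Collect and the single-file "fields" layout are invented for
illustration and do not match the real lab or the real Lintian::Collect
API. The point is only that a check which talks to the accessor never
needs to know where the data lives, never needs to chdir, and can be
handed two packages at once.

```python
import os


class DirLab:
    """Hypothetical lab backend: one directory per package on disk."""

    def __init__(self, root):
        self.root = root

    def read(self, package, part):
        # Build a full path from the lab root, so callers never have to
        # chdir into the package's lab directory.
        path = os.path.join(self.root, package, part)
        with open(path, encoding="utf-8") as fh:
            return fh.read()


class MemoryLab:
    """Hypothetical alternative backend that keeps everything in a dict."""

    def __init__(self, data):
        self.data = data

    def read(self, package, part):
        return self.data[package][part]


class Collect:
    """Hypothetical indirection layer: the only object a check talks to."""

    def __init__(self, lab, package):
        self._lab = lab
        self._package = package

    def field(self, name):
        """Return the value of a control field, or None if it is absent."""
        text = self._lab.read(self._package, "fields")
        for line in text.splitlines():
            key, _, value = line.partition(":")
            if key.strip().lower() == name.lower():
                return value.strip()
        return None


def check_same_version(a, b):
    """Toy cross-package check: do two packages carry the same version?"""
    return a.field("Version") == b.field("Version")
```

A check written against Collect works unchanged whether it is given a
DirLab or a MemoryLab, which is the property we would want in order to
swap out the lab structure later.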
>> On the other hand, I know it can be difficult to settle and maintain a
>> public API. As far as I can tell, we have currently only officially
>> committed ourselves to the frontends and the profiles + the (names of)
>> the tags, checks and the collections. Everything else we can basically
>> break and smash as much as we like as long as we fix it before we hit
>> release (which I admit is nice from a developer's point of view).
>
> Yes. But I think the Lintian::* packages have also been fairly stable.
> We may not be able to promise an ABI to everyone going forward
> indefinitely, but we've not had many ABI breaks in the modules we've moved
> into Lintian::*.
>
That may be true, but I am not certain that all parts of the API in the
Lintian modules are equally useful, intuitive or pleasant to work with.
I also hoped to do something about this.
>> That being said, I could see us commit to a liblintian-perl[2] that
>> would for starters provide things like Lab, Lab::Package, the
>> collections and Lintian::Collect{,::Binary,::Source,::Changes} (possibly
>> moving Lab + Lab::Package under the Lintian:: prefix first).
>
> Yeah, we should move everything into our namespace. And yes, this was one
> of the things that I was hoping to do. We could even just move everything
> in lib into it and finish fixing the namespace (there isn't that much left
> in the wrong namespace), and just be clear in the module documentation
> which pieces are considered stable and which pieces may change a lot.
>
Perhaps
>> As we mature other things (or people request it) we could migrate more
>> and more of lib/ into liblintian-perl.
>
>> As for the test suite code, I would mature it a bit more and then
>> refactor it into an external package to avoid circular
>> build-dependencies between Lintian and javatools (javahelper) if possible.
>
> That makes sense to me.
>
>> Obviously, these changes would very likely fall under the "Hard
>> projects" listed on [3], but I think we would do ourselves and the
>> Debian project a favour in the long run by doing this.
>
> Definitely.
>
> One of these days, I'll get some more free time and will be able to start
> helping again.... :)
>
I am taking you up on that offer!
~Niels