Lintian test suite proposal

To: felix.lechner@lease-up.com
Cc: debian-lint-maint@lists.debian.org
Subject: Lintian test suite proposal
From: Niels Thykier <niels@thykier.net>
Date: Fri, 10 Aug 2018 07:28:00 +0000
Message-id: <[🔎] aa07c5a1-ba23-f930-ad37-6d26f6e5d1d5@thykier.net>
Hi Felix,

Context for lint-maint: Felix Lechner sent me a proposed change for the
test suite and asked me for write an email with the feedback.  I thought
it was useful to include lint-maint because it affects all contributors
of lintian (if implemented).  The proposal is quoted below interleaved
with my comments and is to be an update/edit to t/tests/README.


Thanks for working on improving Lintian.  I should once again repeat
that my work on lintian has dwindled and I may not be the right
contributor any longer to decide which way we are going with the test
suite (IOW, take my review with a grain of salt).


As I read it, there are two orthogonal but interesting ideas.

 1. The tags.d layout

 2. The Abbreviated-Tags proposal pigging backed on top of tags.d.

As I read it, these proposals should be feasible implement separately
(i.e. Abbreviated-Tags could be retrofitted into the existing framework
without adding tags.d).  This is mostly relevant if one of the proposals
turn out to be controversial and the other is not - in that case, we can
move forward with the uncontroversial one.


Personally, I feel I understand Abbreviated-Tags good enough that I can
say I would support it and I would recommend it was made the default for
new tests (modulo a few minor remarks).
  On the tags.d layout, I can mostly see where you would like to go but
it is not clear to me when the test runner will accept vs. fail a test.


> New tags.d layout
> -----------------

General remarks about the text (assuming it is for t/tests/README):

 * I think we should avoid the "New" / "Old" in this document if it is
   an addition to t/tests/README.  The problem with using "new" for
   everything is that in a year it will no longer be new and at some
   point it could even be superseded by a different format/layout as we
   improve ourselves.

 * The text sometimes stops being a README document and starts being
   sales-pitch and then jumps back to being a README a bit later only
   to become a TODO list in the end.  I have highlighted a few of these
   but I stopped when I realised it was a general problem.
   - Note there is nothing wrong with the sales-pitch/rationale for the
     proposals.  The "issue" is that I was told to read this as a
     t/tests/README change and some of the text need tweaking to fit
     there.

> 
> In addition to the established format described in the remainder of
> this document, our test suite now supports a new directory-based
> format for expected tags.
> 
> The new format no longer compares the entire lintian output. Instead,
> it uses separate files for each lintian check. The files are located
> in a directory called tags.d.
>> Each file is named after a lintian check. It contains all tags issued
> by a particular check. The tags can be in any order. Tags from checks
> are ignored and will no longer interfere.

The last sentence is a bit unclear to me.  I suspect you mean that tags
are ignored if they do not belong to one of the checks in tags.d?

Assuming there is tags.d/X file and it contains the tag "Y".  Will the
test fail if the check X emits the tag "Z" in addition to "Y"?

> This decoupling has several
> advantages. For example, one can now enable '--pedantic' without
> having to deal with any noise from other checks. Instead, one can
> conveniently test just the tags expected from a particular check.
> 
> The tags.d directory also contains a file named 'options' which gives
> the user a way to restore the old behavior. 

How come you are going for a new "options" file for these options
instead of adding them to the per-test desc file?

> When set, 'Exhaustive-Set:
> yes" will cause any extra tags in the output to fail the test. Since
> checks generally have little correlation, this option will not be
> useful in many cases. For most tests, the default setting of
> 'Exhaustive-Set: no' will be a great relief.

I could do with some examples of when the test succeeds and fails given
a given tags.d folder/content, different Exhaustive-Set values and
different lintian output.

> The lintian option
> '--pedantic' might even become the default setting for the test suite
> going forward.
> 

The last sentence (about making --pedantic default in tests) makes sense
in the proposal as it is part of the reason why this proposal might be
useful.  However, I do not think it would belong in t/tests/README.

> For more flexibility, there is another option 'Abbreviated-Tags: yes'.
> It allows expected tags to be listed in a shorter and possibly
> more convenient format. First, the letter code at the beginning of
> each line is dropped and reconstructed automatically. Tags are only
> ever issued with one of these letters, so the chance of an internal
> error is small. Instead of worrying about a tag's severity and
> certainty, you can focus on which tags are being issued and why.
> 

Note that AFAICT, the Abbreviated-Tags could be added to the existing
framework without the new layout (after resolving the feature
interaction with e.g. "Sort: yes").

> An example for the abbreviated format is:
> 
> - File: tags.d/changelog-file.tags
> 
> : epoch-changed-but-upstream-version-did-not-go-backwards 3 >= 0.0.1
> : no-upstream-changelog
> : epoch-change-without-comment (none) -> 2
> 
> Please note the colons at the beginning of each line. They are necessary
> to distinguish plain lintian tags from other suites, as in:
> 
> - File: tags.d/watch-file.tags
> 
> source: debian-watch-does-not-check-gpg-signature
> 

Would it make sense to also let the test runner reconstruct the
"source:"-bit (etc.) such that the abbreviated format will be:

"""
<tag1>
<tag2> <extra>
<tag3>
"""

At least, it seems more natural to me that if the test runner is going
to assist us, it might as well assist us with all the "boring" parts of
the output.

> The changes in the new format work together to dissociate checks, tags
> and tests. One goal is to make everything easier to refactor in the
> future.
> 
> There are many other benefits, as well: With the new format, we
> automatically confirm for each check that no tags other than those
> specified were issued. That obsoletes the option Test-Against.
> 

Test-Against is an interesting relic.  Test-Against has 3 purposes:

 1) What you describe, where it is now completely and utterly obsolete
    because the test would fail before Test-Against is validated.
    (unless you consider it a "stop gap" to avoid people blindly shoving
     a false-positive regression in /tags)

 2) For calculating t/COVERAGE which you mention later

 3) For selecting all tests that are relevant to a given tag.

I believe the proposal covers 1+2.  But I do not see 3 mentioned in the
proposal.  I am fine with dropping Test-Against if item 3 is no longer a
useful feature but I figured we ought to be explicit about it.

> When a check is expected to run without issuing tags, please use a
> file with one or more empty lines. (Git won't commit files that are
> totally empty.)

Note: We have plenty of completely empty files committed in git; what
git refuses to track are empty directories.

> Since we now know the names of all checks in a test,
> this also obsoletes the option Test-For.
> 

True, but this could also be retrofitted to the current solution with
Test-For being optional when it is 1:1 with the tags file.  That would
remove the vast majority of Test-For entries.

Also, I would recommend avoid using first person ("we" or "I") in the
README.  It is fine for an informal description or a proposal.

> Neither Test-For or Test-Against is needed going forward. They are two
> fewer things to worry about. Both fields were previously used to
> calculate the COVERAGE. For tests in the new format, the contribution
> to coverage is calculated from other data. With fewer opportunities
> for human error (for example by forgetting to update Test-For), the
> COVERAGE information will be at least as accurate as before. The
> coverage may also be wider due to the automatic Test-Against, as
> explained above. In any event, the COVERAGE information continues to
> be useful and up-to-date.
> 

I like this aspect of the proposal aiming to reduce redundant
information.  :)

> Now you may wonder, does it have any drawbacks? The answer is
> yes.

A bit to informal / speech-like for the README.

> Since unexpected tags are generally ignored, it can be hard to
> tell what a test could also be good for. It may be best practice to
> use only narrow tests, but you can get additional recommendations for
> wider unit testing by running 'runtests' in verbose mode. It will tell
> you which other checks emit tags, so you can add them if you like.
> 
> As a special rule, 'runtests' will expect absolutely zero lintian
> output when no tag files are present. In that case, the option
> 'Exhaustive-Set' has no effect. This eliminates the absurd and
> unwanted possibility that no testing takes place. The resulting error
> messages also make it easier to write unit tests for particular
> features, one test at a time.
> 
> The features described presently only work with regular lintian
> output. For XML or colon-separated output, please continue to use the
> regular, uniform 'tags' files.
> 
> There are few other changes: The older format used a script
> 'test_calibration' to adjust expected tags for corner cases (mainly
> for release architectures). Such scripts now operate on a per-check
> basis. They have names like tags.d/${check}.calibrate. Please note
> that the input and output formats are the same as for the expected
> tags. If you use abbreviated tags in your test, please make sure your
> calibration script issues them, too. The scripts get only a filtered
> lintian output that relates to their particular check.
> 
> A few tests examine lintian's debug or non-tag output. Those lines
> are now found in tags.d/messages. You can use a calibration script
> with them, but there is no abbreviated format.
> 

How is lintian's message output matched with tags.d/messages?  Are both
sorted first or ...?

> Since the expected tags can now be listed in any order, the Sort
> option is probably obsolete, unless a calibration script depends on
> it.
> 

I think the Abbreviated-Tags proposal will overshadow the Sort option
for tags.  As such, Sort will generally only be useful for
"Abbreviated-Tags: no" tests (but I would not be surprised if most of
them were also "Sort: no" tests):

I would not keep Sort around for calibration scripts; it is trivial to
invoke sort in such a script.

> The majority of tests were converted automatically. Tags previously
> listed in one large 'tags' file are now separated into several smaller
> files that group tags emitted by the same check. For tags found in
> Test-Against, we made sure the check they belong is represented in the
> tags.d directory. Sometimes, that meant adding a blank file. (For all
> the automation, it should be noted that all tags that were listed in
> Test-For were being tested; Test-Against cannot be similarly
> validated.) Some tests used special features that will require more
> work. They were not converted.
> 
> Going forward, there are many things you can do to make our test suite
> better. First of all, it is always a good idea to write more
> tests. While we already have many, one can rarely have enough. (In the
> future, we may group them somehow to make them easier to navigate.)
> For converted tests, please embrace the unit testing paradigm and help
> with the following:
> 
> 1. Remove 'Exhaustive-Set: yes' from tags.d/options. (The default is
> no.)
> 
> 2. Delete any tag files similar to tags.d/{$check}.tags that have only
> nuisance tags. Those are tags that appeared from unrelated checks you
> were not actually interested in testing. (At some point, lintian may
> also speed things up by skipping checks not present.) Please remove
> any file your test does not mean to validate. If two tag files appear
> useful separately, please consider duplicating the test, and place one
> file with each.
> 
> 3. If you have the patience to add missing tags, please enable
> '--pedantic'.
> 
> Now we have unit testing! Thank you!

Once again, thanks for working on improving lintian.  As I mentioned, I
am happy to support the Abbreviated-Tags feature with only a minor
remark on the syntax (possibly retrofitted into the existing layout as
well).
  For tags.d, I need some more examples to understand how the parts
interact.

Thanks,
~Niels
Reply to:
Follow-Ups:
- Re: Lintian test suite proposal
  - From: Felix Lechner <felix.lechner@lease-up.com>
Prev by Date: lintian_2.5.96_amd64.changes ACCEPTED into unstable
Next by Date: Bug#905843: lintian recognized "allow-stderr" as a non-standard autopkgtest restriction
Previous by thread: lintian_2.5.96_amd64.changes ACCEPTED into unstable
Next by thread: Re: Lintian test suite proposal
Index(es):
- Date
- Thread