Lintian tag classification (was: Bits from the Lintian maintainers)

To: Jordà Polo <jorda@ettin.org>
Cc: debian-lint-maint@lists.debian.org
Subject: Lintian tag classification (was: Bits from the Lintian maintainers)
From: Russ Allbery <rra@debian.org>
Date: Fri, 28 Mar 2008 16:36:24 -0700
Message-id: <87zlsiwhdz.fsf_-_@windlord.stanford.edu>
In-reply-to: <20080323133005.GA12459@hinnom> ("Jordà Polo"'s message of "Sun\, 23 Mar 2008 14\:30\:05 +0100")
References: <87iqzppzhc.fsf@windlord.stanford.edu> <20080323133005.GA12459@hinnom>

Jordà Polo <jorda@ettin.org> writes:

> I'm thinking of working on it as suggested by Marc Brockschmidt's
> proposal[1] for Google's Summer of Code.

I think that would be fantastic!  I'm sorry about the delay in responding;
I've been really busy with my day job for the past couple of weeks.

> The goals are clear and I don't think there is a lot of room for
> creativity, but I still would like to know your thoughts about how it
> should be implemented.
>
> Basically, my initial idea is to make it possible to use a
> comma-separated list of keywords in Type:, instead of using «error»,
> «warning» or «info» only. (Keywords may include a namespace as in
> «severity::error» or «certainty::wild-guess», depending on how the final
> classification looks like.)
>
> This way it would be easy to include more information later if needed
> (such as «origin::policy», etc.). But does it make sense, or you think
> this breaks the purpose of Type: and new headers must be created for
> each category?

I think that you will find it easier to do the work incrementally if you
keep Type as-is and add a new header containing this information.  We can
then run Lintian in either mode and compare the results for a while before
retiring the old classification system and rebasing it on the new tags.
Otherwise, it becomes a single flag-day change, and those are always
harder to pull off.
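
To make the "run in either mode and compare" idea concrete, here is a
minimal sketch of such a comparison.  It assumes each run has already been
reduced to a mapping from tag name to its E/W/I code; the function name and
the tag names in the usage note are hypothetical, not anything Lintian
provides.

```python
def compare_runs(old: dict, new: dict) -> dict:
    """Return the tags whose code differs between the two modes.

    `old` and `new` map a tag name to the code it was reported with.
    A tag that is missing from one run is reported with None on that side.
    """
    diffs = {}
    for tag in sorted(old.keys() | new.keys()):
        before, after = old.get(tag), new.get(tag)
        if before != after:
            diffs[tag] = (before, after)
    return diffs
```

Calling compare_runs({"tag-a": "E", "tag-b": "W"}, {"tag-a": "E", "tag-b": "I"})
would report only tag-b, which is the sort of list one would want to review
before retiring the old Type field.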

Apart from that, this is more general than I had been planning but still
seems fundamentally sound.  Given that we already have a field syntax where
it's very easy to add more fields, I think it's worth considering whether
keywords are needed at all, or whether we should just add more fields.  In
other words, rather than having:

    Type: severity::error, certainty::wild-guess, origin::policy

you could have:

    Severity: error
    Certainty: wild-guess
    Origin: policy

and use more of the existing parsing infrastructure without having to
handle the keywords separately.
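
As a rough illustration of the difference, the sketch below parses both
forms.  It is Python rather than Lintian's own Perl, it ignores
continuation lines and multiple stanzas, and both function names are
hypothetical.  The point is only that the keyword form needs a second
parsing step that the field form does not.

```python
def parse_fields(stanza: str) -> dict:
    """Parse 'Name: value' lines into a dict keyed by lowercased field name."""
    fields = {}
    for line in stanza.splitlines():
        if not line.strip():
            continue
        # Split at the first colon only, so '::' inside the value survives.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a field line: {line!r}")
        fields[name.strip().lower()] = value.strip()
    return fields


def parse_keywords(value: str) -> dict:
    """Split 'ns::word, ns::word' into a dict (the extra step keywords need)."""
    result = {}
    for item in value.split(","):
        namespace, sep, keyword = item.strip().partition("::")
        if not sep:
            raise ValueError(f"keyword without namespace: {item!r}")
        result[namespace] = keyword
    return result


keyword_style = parse_fields(
    "Type: severity::error, certainty::wild-guess, origin::policy"
)
field_style = parse_fields(
    "Severity: error\nCertainty: wild-guess\nOrigin: policy"
)
```

Here field_style is usable as soon as the generic field parser returns,
while keyword_style["type"] still has to go through parse_keywords() to
yield the same information.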

Taking a step back from the specific details, I see this work as having
three basic steps.  This is just my view on it, though, and you may want
to do it differently.

* Design and implement a new output format based on this classification
  rather than the existing E/W/I classification, along with the code to
  read the new classification information from the checks/*.desc files and
  maintain the internal data structures.  There should be new options to
  specify the severity, certainty, and origins that one is interested in
  seeing and the output should be filtered accordingly.  Lintian's exit
  status may also require some attention.

* Implement a mapping from the new classifications to the old E/W/I so
  that we can retire the existing Type while maintaining a
  backward-compatible interface.

* Go through each tag and classify it according to the new system.  This
  is the part that requires the most tedious effort, but the work can be
  shared and can be done incrementally as long as the new code can handle
  tags that have not yet been classified in the new system.
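
The first two steps, plus the fallback for unclassified tags, could be
sketched roughly as follows.  Everything here is an assumption made for
illustration: the level names and their ordering (only "error" and
"wild-guess" appear in this thread), the demotion rule, and the function
names.  None of it is a decided design.

```python
# Hypothetical orderings, lowest to highest; the real levels are undecided.
SEVERITY_ORDER = ["info", "warning", "error"]
CERTAINTY_ORDER = ["wild-guess", "possible", "certain"]
OLD_CODES = {"error": "E", "warning": "W", "info": "I"}


def is_classified(tag: dict) -> bool:
    """True if the tag already carries the new classification fields."""
    return "severity" in tag and "certainty" in tag


def legacy_code(tag: dict) -> str:
    """Map a tag to the old E/W/I code."""
    if not is_classified(tag):
        # Not yet classified in the new system: fall back to the old Type.
        return OLD_CODES[tag["type"]]
    code = OLD_CODES[tag["severity"]]
    # Hypothetical rule: an error we are only guessing at becomes a warning.
    if code == "E" and tag["certainty"] == "wild-guess":
        return "W"
    return code


def is_selected(tag: dict, min_severity: str = "info",
                min_certainty: str = "wild-guess") -> bool:
    """Decide whether a tag passes the requested severity/certainty filter."""
    if not is_classified(tag):
        return True  # never hide tags that have not been classified yet
    severe_enough = (SEVERITY_ORDER.index(tag["severity"])
                     >= SEVERITY_ORDER.index(min_severity))
    certain_enough = (CERTAINTY_ORDER.index(tag["certainty"])
                      >= CERTAINTY_ORDER.index(min_certainty))
    return severe_enough and certain_enough
```

Because legacy_code() and is_selected() both tolerate tags that only have
the old Type, the classification in the third step can proceed one tag at
a time without breaking either output mode.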

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>
