
Re: GSoC status: classification, output format and more



Jordà Polo <jorda@ettin.org> writes:
> On Sun, Jul 20, 2008 at 09:13:25PM -0700, Russ Allbery wrote:

>> The plus is that the basic format uses the same terms that people are
>> already familiar with, even though we also have support for tuning the
>> output for things like ftp-master.  The drawback is that we're not
>> pushing people towards the new, granular way of thinking about tag
>> severity.  But I'm not sure that's necessary.
>
> Oh, I thought the idea was to change E/W/I too. I'm OK with it, but I
> still think alternative outputs could be interesting.

Agreed.  Basically, I'm good either way on a solution here, or we can even
implement a few different ones and let people experiment or vote or
something.

> And while I agree there is no easy way to represent the new information
> on the command line interface, I'm sure we can find more subtle ways to
> display it on lintian.d.o.

Yes.

> Btw, I didn't say much about Source:, but that's because I was thinking
> of reusing Ref: which already has the relevant information. Though some
> standardization wouldn't hurt: using the document ID as defined by
> doc-base instead of its title (and optional debian- prefixes?), removing
> the word "section", and making it a comma-separated list. So instead of:
>
>   Ref: policy 3.9.1
>   Ref: menu manual 3.7
>   Ref: Perl policy 4.4.2
>   Ref: Debian doc-base Manual sections 2.3.2.1 and 2.3.2.2
>   Ref: debconf-devel(7)
>
> the standardized entries would be:
>
>   Ref: policy 3.9.1
>   Ref: menu 3.7
>   Ref: perl-policy 4.4.2
>   Ref: doc-base 2.3.2.1, doc-base 2.3.2.2
>   Ref: debconf-devel(7)

Yes, standardization would be excellent here, as well as adding more
keywords to the translator that turns them into nice descriptions for the
web and for -i output.
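
To illustrate why the standardized form is easier for a translator to
consume, here is a minimal Python sketch of a parser for it.  This is not
Lintian code (Lintian is written in Perl); the function name parse_ref, the
tuple layout, and the manual-page pattern are all assumptions made for the
example.

```python
import re

# Assumed shape of a manual-page reference such as "debconf-devel(7)".
MANPAGE_RE = re.compile(r"^([A-Za-z0-9._+-]+)\((\d[a-z]*)\)$")


def parse_ref(value):
    """Split a standardized Ref value into (kind, name, section) tuples.

    Hypothetical helper: kind is "manpage", "doc", or "other".
    """
    refs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        man = MANPAGE_RE.match(item)
        if man:
            refs.append(("manpage", man.group(1), man.group(2)))
            continue
        parts = item.split()
        if len(parts) == 2:
            # "<doc-base document ID> <section number>"
            refs.append(("doc", parts[0], parts[1]))
        else:
            # Anything else is passed through untouched.
            refs.append(("other", item, None))
    return refs
```

With the old free-form values ("Debian doc-base Manual sections 2.3.2.1 and
2.3.2.2") every entry would fall into the "other" branch, which is the
argument for standardizing.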

The one thing this doesn't give us is distinguishing between the "sources"
of the various tags that don't have meaningful Ref values.  There are a
few different cases even if the tag isn't based on some external source.
"The resulting package would be broken" vs. "request of relevant
maintainer" vs. "generally accepted best practice" come to mind.  But we
could handle this through keywords in Ref.

> Makes sense, I have changed it already. I also updated the script to get
> some numbers[1], and with all the tags that have been classified so far
> (~62%), the "accuracy" of this mapping is ~95%.
>
>  1. http://ettin.org/tmp/lintian/transtats.out

Given the high accuracy, it might be nice to put a summary at the end of
this output listing the tags where the classifications don't match and the
new and old classifications.  I'm betting most of them are just bugs in
the current Lintian.
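
Something like the following Python sketch is what I have in mind for the
summary.  It is only an illustration: the function name, the input layout,
and the idea of keying the mapping on a (severity, certainty) pair are
assumptions, and the real script may be organized quite differently.

```python
def mismatch_summary(tags, mapping):
    """List tags whose mapped code differs from the current one.

    tags:    iterable of (name, old_code, severity, certainty) tuples
    mapping: dict from (severity, certainty) to an E/W/I code
    Both layouts are hypothetical, chosen only for this example.
    """
    lines = []
    for name, old_code, severity, certainty in sorted(tags):
        new_code = mapping.get((severity, certainty))
        # Skip combinations the mapping does not cover, and matches.
        if new_code is None or new_code == old_code:
            continue
        lines.append(
            f"{name}: {old_code} -> {new_code} ({severity}/{certainty})"
        )
    return lines
```

Printing the returned lines after the existing statistics would give a short
list to review by hand.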

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>

