
Bug#711207: spelling-error-in-binary is overly eager: teH/the



On Wed, Jun 05, 2013 at 09:27:00AM -0700, Russ Allbery wrote:
> Ryan Kavanagh <rak@debian.org> writes:
> > Lintian is overly eager with spelling-error-in-binary. In one
> > package, the string
> >   I9\$ teH
>
> I wonder if we should require the "word" be a minimum length to trigger
> that tag.  Maybe four or five characters?

That would work, although it would miss legitimate misspellings of words
like "the", etc., if done across the board. A more complicated approach,
which may or may not be worth the additional effort, would be to check
a three-character word if and only if it appears in a string with at
least two recognised words of length >= 4, e.g., neither of
    I9\$ teH
    I9\$ teH I7%53 teH %753192
would get matched because the context is gibberish, but
    I am going to hte fair
would, since the context is English text (detected by the words "going"
and "fair").
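
To make the idea concrete, here is a rough sketch in Python (not
Lintian's actual Perl code). The names KNOWN_WORDS, CORRECTIONS and
spelling_errors are hypothetical, and the tiny word lists merely stand
in for a real dictionary and correction table:

```python
import re

# Hypothetical stand-ins for a real dictionary and correction table.
KNOWN_WORDS = {"going", "fair", "with", "from", "this", "that"}
CORRECTIONS = {"teh": "the", "hte": "the", "recieve": "receive"}


def spelling_errors(text, short_len=3, context_len=4, context_needed=2):
    """Return (typo, correction) pairs found in text.

    Typos of at most short_len characters are reported only when the
    string also contains at least context_needed recognised words of
    length >= context_len; longer typos are always reported.
    """
    # Split on anything that is not an ASCII letter and ignore case.
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", text)]

    # Count recognised words long enough to count as English context.
    context = sum(
        1 for w in words if len(w) >= context_len and w in KNOWN_WORDS
    )
    has_context = context >= context_needed

    errors = []
    for w in words:
        if w not in CORRECTIONS:
            continue
        if len(w) <= short_len and not has_context:
            continue  # short "word" in gibberish: likely binary noise
        errors.append((w, CORRECTIONS[w]))
    return errors
```

With these assumptions, the two gibberish strings above yield no
matches, while "I am going to hte fair" reports hte -> the.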

This might be overkill though, and your solution of just ignoring words
of length less than 4 would probably be sufficient.

Best wishes,
Ryan

-- 
|_)|_/	Ryan Kavanagh		| Debian Developer
| \| \	http://ryanak.ca/	| GPG Key 4A11C97A

