
Bug#514951: [checks/binaries] check the output of strings for typos and mistakes



Russ Allbery wrote:

> Raphael Geissert writes:
> 
>> Attached mbox contains patches adding a collection script to gather the
>> output of the 'strings' command on the ELF binaries of a package, and
>> spell checks its output. The collection script is also needed to
>> implement the check for statically linking to zlib, among others.
> 
> You have to run strings -a if you're going to implement that, at which
> point I think chances are pretty high that you're going to get false
> positives from the spell checking part.

Not true; I've already tried without -a and successfully matched the zlib
version string.
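
As a rough illustration of that kind of match, here is a minimal Python
sketch that scans already-collected strings output. The regular expression
and the function name are assumptions made for this example, based on the
copyright strings that zlib embeds; they are not the code from the patch.

```python
import re

# Assumed pattern: zlib embeds copyright strings such as
# " deflate 1.2.3 Copyright 1995-2005 Jean-loup Gailly ".
ZLIB_RE = re.compile(r"\b(?:in|de)flate (\d+(?:\.\d+)+) Copyright")


def find_zlib_versions(strings_output):
    """Return the sorted, de-duplicated zlib versions found in the text."""
    return sorted({m.group(1) for m in ZLIB_RE.finditer(strings_output)})


# Made-up sample of what the collected output might contain.
sample = "\n".join([
    "/lib/ld-linux.so.2",
    " deflate 1.2.3 Copyright 1995-2005 Jean-loup Gailly ",
    " inflate 1.2.3 Copyright 1995-2005 Mark Adler ",
])
print(find_zlib_versions(sample))
```

Whether those strings show up without -a depends on which section of the
binary they end up in, which is the point under discussion here.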

> 
> I don't really like trying to spell-check the strings inside binaries.
> It's an interesting idea, but I think it's highly prone to false
> positives.  

I checked many packages and didn't find any false positives. In any case, it
could be implemented as an experimental check.
By the way, pusling mentioned on IRC that we should take care to tell the
maintainer how to fix the mistakes correctly without fuzzing the
translations. All that is needed for this is to also fix the mistakes in the
msgids of the .po files.

> We're currently assuming that everything we're spell-checking 
> is English text.  Binaries may contain different hyphenation or
> non-English words that happen to match one of our corrections.
> 

Again, I don't think we would encounter any of those cases. And remember
that the spell checking function does strip non-ASCII characters, so
foo-bar would end up being foobar, which won't match.

A bit OT: with the current spell checking method, multi-word cases will never
match because every word of the text is extracted and compared individually.
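
To make that concrete, here is a minimal Python sketch of a per-word check.
The corrections table, the function name and the exact stripping rule are
hypothetical and only approximate the behaviour described above; this is not
lintian's actual code.

```python
import re

# Hypothetical corrections table, for illustration only.
CORRECTIONS = {
    "recieve": "receive",
    "all though": "although",  # multi-word key: never reachable below
}


def spelling_check(text):
    """Return (word, correction) pairs, looking up one word at a time."""
    hits = []
    for word in text.lower().split():
        # Assumed stripping rule: drop everything that is not a letter,
        # so "foo-bar," becomes "foobar".
        word = re.sub(r"[^a-z]", "", word)
        if word in CORRECTIONS:
            hits.append((word, CORRECTIONS[word]))
    return hits


print(spelling_check("Failed to recieve foo-bar, all though it was sent."))
```

Only "recieve" is reported: "all" and "though" are looked up separately, so
the two-word entry can never match, and "foo-bar" is compared as "foobar".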

>> The elf-index file was originally being generated so that it could be
>> possible to iterate over the elf files without having to look for them
>> in the file-info index. Maybe it should die if nobody finds a use for
>> it.
> 
> This seems useful to me; I'll leave it in now in case we need it.  I'm
> applying the collection script, modified to add -a and exclude anything in
> /usr/lib/debug.  I've not applied the spelling check part.
> 

Ok; please re-consider adding the spell checking part and removing -a from
strings.

Cheers,
-- 
Raphael Geissert - Debian Maintainer
www.debian.org - get.debian.net




