Spell checking of POD documentation (was: [SCM] Debian package checker branch, master, updated. 2.5.12-13-g7b947e6)
On 2013-04-20 14:20, Jakub Wilk wrote:
> * Niels Thykier <niels@thykier.net>, 2013-04-20, 12:13:
>> -uses I<pure> logic to determine if dependencies are satisifies (i.e. it
>> +uses I<pure> logic to determine if dependencies are satisfies (i.e. it
>
> s/satisfies/satisfied/
>
I was thinking of adding a test to find misspelled words in our POD
documentation. Apparently we^W I need it to keep our documentation
somewhat readable.
Attached is the script I used to find all of the mistakes in our
codebase that were corrected in commit 0af16e4 (using aspell/aspell-en as
the underlying checker). It needed quite a bit of whitelisting to get
the test to pass - apparently it trips on quite a few argument names for
various subs.
Do any of you have any useful tips on setting up and managing these
spell checking tests?
~Niels
#!/usr/bin/perl
use strict;
use warnings;
use Test::More;
use Test::Lintian;
eval 'use Test::Spelling';
plan skip_all => "Pod spell checking requires Test::Spelling" if $@;
my @GOOD_WORDS = qw(
Allbery Barratt Braakman Brockschmidt Geissert Lichtenheld Niels Russ
Thykier
lintian Lintian Lintian's dpkg libapt debian Debian DEBIAN
PTS QA qa uploader uploaders UPLOADER Uploaders changelog changelogs
desc COND CURVALUE subdirectory subdirectories udeb deb dsc nlist olist
KEYN BASEDIR METADATA OO TODO dir exitcode nohang substvar substvars
listref metadata blockingly checksum checksums Nativeness
src nativeness Indep debfiles diffstat gz env classpath conffiles objdump
tasksel filename Pre pre hardlink hardlinking hardlinks PROC dirs PROFNAME
CHECKNAMES COLLMAP ERRHANDLER LPKG unpacker worklist BASEPATH stderr stdout
stdin ascii html issuedtags subclasses showdescription printables overridable
processables msg ORed SIGKILLs SIGTERM wildcard wildcards ar whitelist blacklist
API amd armhf cpu linux whitelisted blacklisted shaX sha rstrip lstrip parsers
customisation ALGO CLOC CMD DEBFILE DEST DSCFILE FOH NOCLOSE PARENTDIR PGP
STARTLINE STR UTF bitmask cp debconf rw proccessable severities AND'ing
superset YYYY dirname operm username Whitespaces whitespace Whitespace
udebs multiword recognised eqv testsuite methodx multi multiarch relationA
relationB Multi natively unordered
);
# md is md5 butchered by aspell
push(@GOOD_WORDS, 'md');
# 'soft'ly which was parsed as soft'ly.
push(@GOOD_WORDS, q{soft'ly});
# This is wrong in general, but it happens to be a package name that
# we use as an example.
push(@GOOD_WORDS, 'alot');
add_stopwords(@GOOD_WORDS);
chdir($ENV{'LINTIAN_ROOT'})
    or die("fatal error: could not chdir to $ENV{'LINTIAN_ROOT'}: $!");
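# Pick up the check scripts in checks/, skipping any name that ends in
# 'c' (such as the *.desc files).
my @CHECKS = glob('checks/*[!.]*[!c]');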
my @DIRS = qw(collection doc/tutorial frontend lib private reporting t/scripts t/helpers);
all_pod_files_spelling_ok(@CHECKS, @DIRS, 't/runtests');
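
One possible way to keep the whitelist manageable is to store it as
commented text rather than one large qw() list. The following is only a
sketch under that assumption: parse_stopwords() is a hypothetical helper
that is not part of the attached script, and the word list shown is a
small sample taken from @GOOD_WORDS above. It uses core Perl only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the attached script): turn lines of
# whitespace-separated words into a flat list, treating '#' as the start
# of a comment that runs to the end of the line.
sub parse_stopwords {
    my (@lines) = @_;
    my @words;
    for my $line (@lines) {
        $line =~ s/#.*//;                 # drop comments
        push(@words, split(' ', $line));  # split on whitespace, skip blanks
    }
    return @words;
}

# Sample list; in practice this could live in a separate file.
my $list = <<'EOT';
# People
Allbery Thykier
md      # md5 as tokenised by aspell
alot    # package name used as an example
EOT

my @words = parse_stopwords(split(/\n/, $list));
print scalar(@words), " stopwords loaded\n";
```

In the test itself the resulting list would then be passed to
add_stopwords(@words) in place of the hard-coded @GOOD_WORDS.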