Bug#514495: [lib/Spelling.pm] check the spelling of large texts in a more efficient way
Raphael Geissert <atomo64@gmail.com> writes:
> Commit message explains it:
>> When spell checking large texts determine what's more convenient: to
>> check every word in the text for spelling mistakes or to look for known
>> spelling mistakes in the text.
>>
>> This should speed up checking large texts, with the only, minor,
>> consequence being that only the first match of a spelling mistake is found
>> and warned about; but since the line numbers are not printed it is not a
>> big deal.
>>
>> Additionally move some regular expressions and other operations so that
>> they are performed once for all the text, instead of doing it once on every
>> word.
Have you benchmarked this? My intuition says that if this makes any
difference at all, it will be a performance *degradation*. You're now
walking the entire text for every typo we know about instead of doing an
O(1) hash table lookup for each word. It's converting an O(n) check into
an O(n^2) check.
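
To make the comparison concrete, here is a minimal sketch of the two
strategies. It is written in Python for brevity and is not the code in
lib/Spelling.pm; the CORRECTIONS table and both function names are made up
for illustration, and the real corrections list is much larger.

```python
import re

# Hypothetical corrections table (assumption, for illustration only).
CORRECTIONS = {
    "consecuence": "consequence",
    "degredation": "degradation",
    "recieve": "receive",
}


def check_per_word(text, corrections=CORRECTIONS):
    """One hash lookup per word: roughly O(n) for n words."""
    found = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in corrections:
            found.append(word)
    return found


def check_per_typo(text, corrections=CORRECTIONS):
    """One scan of the whole text per known typo: roughly O(n * m)
    for n words of text and m known typos."""
    lowered = text.lower()
    found = []
    for typo in corrections:
        # Only whether the typo occurs is recorded, so repeated
        # occurrences of the same typo are reported once.
        if re.search(r"\b" + re.escape(typo) + r"\b", lowered):
            found.append(typo)
    return found
```

The second function also shows the behaviour change mentioned in the commit
message: a typo that appears several times is reported only once. Which
version is faster in practice depends on constant factors as well as on the
sizes of the text and the list, so timing both on a large text (for example
with Python's timeit module) is the only way to settle it.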
--
Russ Allbery (rra@debian.org) <http://www.eyrie.org/~eagle/>