
Bug#514495: [lib/Spelling.pm] check the spelling of large texts in a more efficient way



Raphael Geissert <atomo64@gmail.com> writes:

> Commit message explains it:
>>     When spell checking large texts determine what's more convenient: to
>> look up every word in the text for spelling mistakes or to look for known
>> spelling mistakes in the text.
>>
>>     This should speed up checking large texts, with the only, minor,
>> consequence being that only the first match of a spelling mistake is found
>> and warned about; but since the line numbers are not printed it is not a
>> big deal.
>>
>>     Additionally move some regular expressions and other operations so that
>> they are performed once for all the text, instead of doing it once on every
>> word.

Have you benchmarked this?  My intuition says that if this makes any
difference at all, it will be a performance *degradation*.  You're now
walking the entire text for every typo we know about instead of doing an
O(1) hash table lookup for each word.  It's converting an O(n) check into
an O(n^2) check.
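
To make the comparison concrete, here is a minimal sketch of the two
strategies.  It is Python rather than the Perl in lib/Spelling.pm, and the
typo table and function names are hypothetical, not Lintian's actual data
or code; it only illustrates the shape of the work each approach does.

```python
import re

# Hypothetical typo table; the real list lives in Lintian's data files.
CORRECTIONS = {"teh": "the", "recieve": "receive", "occured": "occurred"}


def check_per_word(text, corrections=CORRECTIONS):
    """One hash lookup per word: work grows with the number of words."""
    found = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in corrections:
            found.append((word, corrections[word]))
    return found


def check_per_typo(text, corrections=CORRECTIONS):
    """One scan of the text per known typo: work grows with
    (text length) * (number of known typos)."""
    lowered = text.lower()
    found = []
    for typo, fix in corrections.items():
        # Only the first occurrence of each typo is noticed.
        if re.search(r"\b" + re.escape(typo) + r"\b", lowered):
            found.append((typo, fix))
    return found
```

Timing both functions (for example with Python's timeit module) on a large
text and a realistically sized typo list would be one way to settle the
question rather than relying on intuition.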

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>


