Re: check_typos.pl script

Jens Seidel wrote:
On Sat, Oct 08, 2005 at 02:25:05AM +0200, Helmut Wollmersdorfer wrote:

I know, but since I already got many false positives with a distance of 1,
I never tried larger distances.

I use distance=1 mostly.

Long words may indeed contain multiple
typos, so it may be a good idea to use a maximum distance of
length(word)/10. German and a few other languages especially would benefit.

But the number of false positives will be very high. E.g. with distance=2 each 2-letter word matches against _all_ other 2-letter words.
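
To make the point about thresholds concrete, here is a minimal sketch. It is written in Python rather than Perl and is not taken from check_typos.pl. It assumes a plain Levenshtein distance; the helper name max_distance and the floor of 1 are my own assumptions, combining the "distance=1 mostly" default with the length(word)/10 idea.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute or keep
        prev = cur
    return prev[-1]


def max_distance(word: str) -> int:
    # Assumption: length(word)/10 rounded down, but never below 1,
    # so short words still get the usual distance=1 check.
    return max(1, len(word) // 10)


def is_candidate(word: str, known: str) -> bool:
    # Flag `word` as a possible typo of `known` if it differs from it
    # but stays within the length-dependent threshold.
    d = levenshtein(word, known)
    return 0 < d <= max_distance(known)


if __name__ == "__main__":
    print(levenshtein("an", "to"))   # 2: both letters substituted
    print(is_candidate("an", "to"))  # False with the scaled threshold
```

Two substitutions turn any 2-letter word into any other, which is why a fixed distance=2 makes all of them match each other. With the scaled threshold only words of 20 letters or more are compared at distance 2.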

I don't know these algorithms,

Why reinvent the wheel?
If I want to solve a problem, my first step is to ask Google or CPAN.

But I will definitely
test it (the code looks much cleaner, I'm a C/C++/Fortran77 coder not a
Perl hacker :-))

I come from (mainframe) assembler and Prolog. Perl has been my favorite for two years now. IMHO it's worth digging deeper into Perl.

Where is it available?

On my workstation, my laptop, and a USB stick, unfortunately in different versions and in ugly condition. I will send it by mail.

You refer to d-i, right? There are also many other English documents
which are not of a very high quality -:).

I know. But low-quality docs need only a spellchecker for sufficient 'bug hunting'. With high-quality docs I caught myself losing concentration and motivation after reading for some hours without finding an error.

Also, not every document uses
'file system' instead of 'file-system' or 'filesystem'.

Just an example.
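
For that kind of inconsistency, a rough sketch of how the variants could be counted (again Python, not part of any existing script; the function name count_variants is hypothetical):

```python
import re
from collections import Counter


def count_variants(text: str, first: str, second: str) -> Counter:
    # Match the two parts joined by a space, a hyphen, or nothing,
    # e.g. 'file system', 'file-system', 'filesystem'.
    pattern = re.compile(
        rf"\b{re.escape(first)}[ -]?{re.escape(second)}\b",
        re.IGNORECASE,
    )
    # Count each spelling in lower case so 'Filesystem' and
    # 'filesystem' are reported together.
    return Counter(m.group(0).lower() for m in pattern.finditer(text))


if __name__ == "__main__":
    sample = ("The file system is mounted. Check the filesystem "
              "and the file-system type.")
    for spelling, n in count_variants(sample, "file", "system").items():
        print(spelling, n)
```

A document that reports more than one spelling is a candidate for review. Plurals such as 'file systems' are not matched by this pattern.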

Helmut Wollmersdorfer
