Re: [RFR] python-levenshtein Description [ITP]



Hello,

On Mon, Mar 31, 2008 at 02:32:56PM +0100, Justin B Rye wrote:
> 
> So for a Levenshtein distance of 67:
> 
> Description: extension for computing string similarities and edit distances
>  The Levenshtein module computes Levenshtein distances, similarity ratios,
>  generalized medians and set medians of ASCII or Unicode strings. Because
>  it's implemented in C, it's much faster than the corresponding Python
>  library functions and methods.

Wow, I can't recognize it!

Thanks a lot for all these cleanups.

I wonder whether "ASCII or Unicode" is correct (I think it should work with
any encoding). Unicode strings have a different type in Python, so the
information that it supports both is interesting (there are other Python
extensions that do not support this feature).

Would "Unicode and non-Unicode strings" be OK?
(or "normal and Unicode strings")


I also liked the idea of indicating in what areas this can be used, so
mentioning spell checkers and fuzzy matching would be nice, and it would
also help people find the package when searching.


I propose to add the following two paragraphs:

 The Levenshtein distance is the minimum number of single-character
 insertions, deletions, or substitutions needed to transform one string
 into the other.

 It is useful in applications that need to determine how similar two
 strings are, such as spell checkers or fuzzy matching of gettext messages.
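
To make the definition concrete, here is a minimal pure-Python sketch of the
distance. It is only an illustration of the definition, not the code of the
C extension, and the function name levenshtein() is my own choice for this
example rather than the module's API:

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to transform a into b."""
    # previous[j] is the distance between the prefix of a processed
    # so far (initially empty) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # Turning the first i characters of a into "" takes i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

A spell checker could, for instance, rank candidate words by this distance
to the misspelled word and suggest the closest ones.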

> JBR	with qualifications in linguistics, experience as a Debian
> 	sysadmin, and probably no clue about this particular package

It seems you do have a clue. Thanks for your reviews (Christian's also).

Best Regards,
-- 
Nekral

