Re: [RFR] python-levenshtein Description [ITP]



My gateway server's idea of an April Fools' Day prank is a hard
drive failure.  It's a good thing I was building a replacement.

Nicolas François wrote:
> Justin B Rye wrote:
>> Description: extension for computing string similarities and edit distances
>>  The Levenshtein module computes Levenshtein distances, similarity ratios,
>>  generalized medians and set medians of ASCII or Unicode strings. Because
>>  it's implemented in C, it's much faster than the corresponding Python
>>  library functions and methods.
[...]
> I wonder if "ASCII or Unicode" is correct (I think it should work with any
> encoding). Unicode strings have a different type in Python, so the
> information that it supports both is interesting (there are other Python
> extensions that do not support this feature).
> 
> Would "Unicode and non-Unicode strings" be OK?
> (or "normal and Unicode strings")

I can't advise on the issue of which description is the best fit for
the way it works in Python, but I would suggest keeping the "or"
instead of another (potentially confusing) "and".

>  The Levenshtein distance is the minimum number of insertion, deletion, or
>  substitution of single characters to transform one string into the other.

Minimum number, but still plural!  Make it:

   The Levenshtein distance is the minimum number of single-character
   insertions, deletions, and substitutions needed to transform one
   string into another.
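
For anyone who wants to check that wording against the arithmetic,
here is a minimal pure-Python sketch of the textbook dynamic-programming
way of counting those edits. It is only an illustration: the function
name `levenshtein` is my own, and this is not a claim about how the
package's C code works or what its API looks like.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to transform a into b.

    Illustrative sketch only; the name and signature are hypothetical.
    """
    # previous[j] is the distance between the prefix of a processed so
    # far (initially empty) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # Turning the first i characters of a into "" takes i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

With "kitten" and "sitting" the count is 3: two substitutions
(k to s, e to i) and one insertion (the final g), which is why the
plural nouns read more naturally in the description.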

>  It is useful in applications that need to determine how similar two
>  strings are, such as spell checkers or fuzzy matching of gettext messages.

"Spell checkers" are (software) applications; "fuzzy matching of
gettext messages" isn't, so it's unbalanced.  How about just:

   It is useful for spell checking, or fuzzy matching of gettext messages.

(I'm astonished to find that "spellcheckers" is wrong - Google says
"Did you mean: spell checkers?" - even though they aren't checkers
for spells, they're things that perform spellchecking.  Oh well...)
-- 
JBR	with qualifications in linguistics, experience as a Debian
	sysadmin, and probably no clue about this particular package
