
Re: Bug#1019435: ITP: rapidfuzz -- rapid fuzzy string matching



On Fri, Sep 09, 2022 at 10:00:49AM +0100, Julian Gilbey wrote:
> Package: wnpp
> Severity: wishlist
> Owner: Julian Gilbey <jdg@debian.org>
> X-Debbugs-Cc: debian-devel@lists.debian.org, debian-python@lists.debian.org
> 
> * Package name    : rapidfuzz
>   Version         : 2.6.1
>   Upstream Author : Max Bachmann <pypi@maxbachmann.de>
> * URL             : https://github.com/maxbachmann/RapidFuzz
> * License         : MIT
>   Programming Lang: Python
>   Description     : rapid fuzzy string matching
> 
> RapidFuzz is a fast string matching library for Python and C++, which
> uses the string similarity calculations from
> [FuzzyWuzzy](https://github.com/seatgeek/fuzzywuzzy).  However, there
> are a couple of aspects that set RapidFuzz apart from FuzzyWuzzy:
> 1) It is MIT licensed, so it can be used under whichever license you choose for your project, whereas using FuzzyWuzzy forces you to adopt the GPL
> 2) It provides many string_metrics like hamming or jaro_winkler, which are not included in FuzzyWuzzy
> 3) It is mostly written in C++ and, on top of this, comes with many algorithmic improvements that make string matching even faster while still providing the same results. For detailed benchmarks, check the [documentation](https://maxbachmann.github.io/RapidFuzz/fuzz.html)
> 4) It fixes multiple bugs in the `partial_ratio` implementation
> 
> This is a dependency of the latest upstream release of
> python3-textdistance.
> 
> There are also two C++ libraries contained within this package,
> managed as separate GitHub subrepositories.  One is rapidfuzz-cpp, and
> I am not sure whether to bundle this as a single package or whether to
> package this independently.  The other is taskflow
> (https://github.com/taskflow/taskflow), a C++ header-only package; I
> think this should probably be packaged separately.
> 
> This package will be maintained within the Python Packaging Team.

I should have added: this package depends on Cython >= 3.0.0a7, so
this cannot be packaged until we have the new version of Cython
available.  The same applies to the JaroWinkler package.

   Julian

