Bug#927362: ITP: blingfire -- lightning fast Finite State machine and REgular expression manipulation library
Package: wnpp
Severity: wishlist
Owner: Mo Zhou <lumin@debian.org>
* Package name    : blingfire
  Version         : git-HEAD
  Upstream Author : Microsoft
* URL             : https://github.com/Microsoft/BlingFire
* License         : MIT
  Programming Lang: C++, Python, Perl, Batch, etc
  Description     : lightning fast Finite State machine and REgular expression manipulation library
Blingfire provides more than a fast natural language tokenizer. Judging
from the benchmarking data, it seems to tokenize much faster than SpaCy
or NLTK. Unlike NLTK or SpaCy, Blingfire seemingly works without
downloaded blobs. This tool might be useful to Enrico[1] as well, and
would possibly make him happy[2].
I'll first give it a try and put it in DUPR, and then decide after code
inspection whether it should really enter the archive.
[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=925294
[2] If we don't think too much about the upstream name.