
Bug#896712: ITP: golang-github-dataence-porter2 -- Native Go English Porter2 stemmer



Good evening Aaron,

Reviewing the performance claims, I don't see any data validating the 660% speedup.  Also, having looked at libdivsufsort and SACA-K, I think the code could use some more comparisons.  It does look neat, though.  Also, why is this going through the bug list?

I am not a mod, so don't worry too much about me.

Package: wnpp
Severity: wishlist
Owner: "Aaron M. Ucko" <ucko@debian.org>

* Package name    : golang-github-dataence-porter2(-dev)
   Version         : 0.0~git20150829.0.56e4718
   Upstream Author : Jian Zhen / Dataence, LLC
* URL             : https://github.com/dataence/porter2
* License         : Apache 2.0
   Programming Lang: Go
   Description     : Native Go English Porter2 stemmer

Porter2 implements the English Porter2 stemmer.  It is written
completely using finite state machines to do suffix comparison, rather
than string-based or tree-based approaches.  As a result, it is
660% faster than a string comparison-based approach.
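
To make the contrast in the description concrete, here is a minimal
sketch of the two styles of suffix matching.  It is not taken from the
upstream package: the function names, the demoSuffixes variable, and
the four-suffix set are all hypothetical and chosen only for
illustration.  The string-based version tries each candidate suffix in
turn, while the state-machine version reads the word once from its last
byte backwards and stops as soon as no suffix can match.

```go
package main

import (
	"fmt"
	"strings"
)

// demoSuffixes is a small illustrative suffix set (an assumption for
// this sketch, not the full table a real Porter2 stemmer would use).
var demoSuffixes = []string{"ed", "edly", "ing", "ingly"}

// longestSuffixString is the string-comparison style: it tests every
// candidate suffix with strings.HasSuffix and keeps the longest match.
func longestSuffixString(word string, suffixes []string) string {
	best := ""
	for _, s := range suffixes {
		if len(s) > len(best) && strings.HasSuffix(word, s) {
			best = s
		}
	}
	return best
}

// longestSuffixFSM is the state-machine style for the same four
// suffixes.  It walks the word from the last byte towards the front and
// returns "" as soon as the bytes seen cannot end in any known suffix.
func longestSuffixFSM(word string) string {
	const (
		start   = iota
		sawD    // "...d"
		sawG    // "...g"
		sawNG   // "...ng"
		sawY    // "...y"
		sawLY   // "...ly"
		sawDLY  // "...dly"
		sawGLY  // "...gly"
		sawNGLY // "...ngly"
	)
	state := start
	for i := len(word) - 1; i >= 0; i-- {
		c := word[i]
		switch state {
		case start:
			switch c {
			case 'd':
				state = sawD
			case 'g':
				state = sawG
			case 'y':
				state = sawY
			default:
				return ""
			}
		case sawD:
			if c == 'e' {
				return "ed"
			}
			return ""
		case sawG:
			if c != 'n' {
				return ""
			}
			state = sawNG
		case sawNG:
			if c == 'i' {
				return "ing"
			}
			return ""
		case sawY:
			if c != 'l' {
				return ""
			}
			state = sawLY
		case sawLY:
			switch c {
			case 'd':
				state = sawDLY
			case 'g':
				state = sawGLY
			default:
				return ""
			}
		case sawDLY:
			if c == 'e' {
				return "edly"
			}
			return ""
		case sawGLY:
			if c != 'n' {
				return ""
			}
			state = sawNGLY
		case sawNGLY:
			if c == 'i' {
				return "ingly"
			}
			return ""
		}
	}
	// The word ended before any suffix was completed.
	return ""
}

func main() {
	for _, w := range []string{"jumped", "repeatedly", "hopping", "amazingly", "cat"} {
		fmt.Printf("%-12s string=%q fsm=%q\n",
			w, longestSuffixString(w, demoSuffixes), longestSuffixFSM(w))
	}
}
```

Both functions return the same answer for this suffix set; whether the
state-machine form is actually faster, and by how much, would have to
be measured with a benchmark rather than assumed from this sketch.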

I intend to package this library under the auspices of debian-med for
the sake of recent ncbi-entrez-direct releases.  However, if somebody
from the Go team wants to take this package over, they're certainly
welcome to it. ;-)


