
Bug#375441: ITP: python-stemmer -- Python bindings for Snowball stemming algorithms



Package: wnpp
Severity: wishlist
Owner: Franz Pletz <fpletz@franz-pletz.org>

* Package name    : python-stemmer
  Version         : 1.0.1
  Upstream Author : Richard Boulton <richard@tartarus.org>
* URL             : http://snowball.tartarus.org/
* License         : BSD, MIT
  Programming Lang: C, Python
  Description     : Python bindings for Snowball stemming algorithms

PyStemmer provides access to efficient algorithms for calculating a
"stemmed" form of a word.  This is a form with most of the common
morphological endings removed, hopefully representing a common
linguistic base form.  This is most useful in building search engines
and information retrieval software; for example, a search with stemming
enabled should be able to find a document containing "cycling" given the
query "cycles".
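
To make the search example concrete, here is a minimal sketch of a
stemmed index lookup.  It does not use PyStemmer or libstemmer: the
toy_stem() function, its suffix list, and the build_index() and search()
helpers are all hypothetical, and the suffix stripping is far cruder
than the Snowball algorithms.  It only illustrates why indexing and
querying by stem lets "cycles" find a document containing "cycling".

```python
# Toy suffix stripper for illustration only.
# This is NOT the Snowball or Porter algorithm.
SUFFIXES = ("ing", "es", "s", "e")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(documents):
    """Map each stem to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for token in text.split():
            index.setdefault(toy_stem(token), set()).add(doc_id)
    return index


def search(index, query):
    """Look up a single-word query by its stem."""
    return index.get(toy_stem(query), set())


documents = {
    "doc1": "cycling in the rain",
    "doc2": "walking to work",
}
index = build_index(documents)

# Both "cycling" and "cycles" reduce to the same toy stem,
# so the query matches doc1 even though the surface forms differ.
print(sorted(search(index, "cycles")))
```

A real stemmer handles many more endings and exceptions, but the
indexing pattern stays the same: stem at index time, stem again at
query time, and compare the stems.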

PyStemmer provides algorithms for several (mainly European) languages,
by wrapping the libstemmer library from the Snowball project in a Python
module.

It also provides access to the classic Porter stemming algorithm for
English: although this has been superseded by an improved algorithm, the
original algorithm may be of interest to information retrieval
researchers wishing to reproduce results of earlier experiments.


