Bug#962799: ITP: snowball-data -- test data for Snowball stemming algorithms

Package: wnpp
Severity: wishlist
Owner: Dmitry Shachnev <mitya57@debian.org>

* Package name    : snowball-data
  Version         : 0+20191003-1
  Upstream Author : Snowball developers <snowball-discuss@lists.tartarus.org>
* URL             : https://github.com/snowballstem/snowball-data
* License         : BSD-3-clause, GPL-3+, CC-BY-SA-3.0, CC-BY-SA-4.0
  Programming Lang: none
  Description     : test data for Snowball stemming algorithms

Snowball provides access to efficient algorithms for calculating a
"stemmed" form of a word.  This is a form with most of the common
morphological endings removed, hopefully representing a common
linguistic base form.  This is most useful in building search engines
and information retrieval software; for example, a search with stemming
enabled should be able to find a document containing "cycling" given the
query "cycles".
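
To make the "cycling"/"cycles" example concrete, here is a minimal Python
sketch of stem-based matching. It uses a toy suffix stripper that is NOT the
Snowball English algorithm; the names toy_stem, matches and SUFFIXES are
hypothetical and exist only for this illustration.

```python
# Toy suffix stripper for illustration only; NOT the Snowball English algorithm.
SUFFIXES = ("ing", "es", "s")  # hypothetical suffix list, checked in this order


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query, document_words):
    """Return the document words whose stem equals the stem of the query."""
    target = toy_stem(query)
    return [w for w in document_words if toy_stem(w) == target]


print(matches("cycles", ["cycling", "bicycle"]))  # prints ['cycling']
```

Both "cycles" and "cycling" reduce to "cycl" here, so the query finds the
document word. A real stemmer handles far more endings and exceptions than
this sketch does.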

Snowball provides algorithms for several (mainly European) languages.
It also provides access to the classic Porter stemming algorithm for
English: although this has been superseded by an improved algorithm, the
original algorithm may be of interest to information retrieval
researchers wishing to reproduce results of earlier experiments.

This package contains the test data used by the Snowball test suite.
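
As a rough idea of how such test data can be used, the sketch below checks a
stemmer against parallel lists of input words and expected stems. This is an
assumption about the general approach, not a description of the actual
Snowball test suite; check_stemmer and the sample words are hypothetical.

```python
def check_stemmer(stem, words, expected):
    """Return (word, got, wanted) triples where stem(word) != wanted."""
    if len(words) != len(expected):
        raise ValueError("word list and expected-stem list differ in length")
    failures = []
    for word, wanted in zip(words, expected):
        got = stem(word)
        if got != wanted:
            failures.append((word, got, wanted))
    return failures


# Hypothetical sample data, not taken from the package.
words = ["cycling", "cycles"]
expected = ["cycl", "cycl"]

# A stand-in "stemmer" that just truncates to four characters.
print(check_stemmer(lambda w: w[:4], words, expected))  # prints []
```

An empty list means the stemmer agreed with every expected stem; any
mismatch is reported with the input word, the actual result and the wanted
result.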

In March I discussed this with the snowball maintainer on IRC, and he said
that the test data should be a separate source package.

Dmitry Shachnev
