
Bug#988829: ITP: r-cran-tokenizers -- GNU R fast, consistent tokenization of natural language text

Package: wnpp
Severity: wishlist
Owner: Andreas Tille <tille@debian.org>

* Package name    : r-cran-tokenizers
  Version         : 0.2.1
  Upstream Author : Lincoln Mullen
* URL             : https://cran.r-project.org/package=tokenizers
* License         : MIT
  Programming Lang: GNU R
  Description     : GNU R fast, consistent tokenization of natural language text
 Convert natural language text into tokens. The package includes
 tokenizers for shingled n-grams, skip n-grams, words, word stems,
 sentences, paragraphs, characters, shingled characters, lines, tweets,
 the Penn Treebank scheme, and regular expressions, as well as
 functions for counting characters, words, and sentences, and a
 function for splitting longer texts into separate documents, each
 with the same number of words. The tokenizers share a consistent
 interface, and the package is built on the 'stringi' and 'Rcpp'
 packages for fast yet correct tokenization of 'UTF-8' text.
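
Example (illustration only): a minimal sketch of the tokenizer
interface described above. The function names follow the upstream CRAN
documentation for 'tokenizers'; the sample text and argument values
are assumptions chosen for this example, not part of the package
description.

   library(tokenizers)

   text <- "Debian is a universal operating system. It ships many packages."

   # Every tokenizer takes a character vector and returns a list of tokens.
   tokenize_words(text)
   tokenize_sentences(text)
   tokenize_ngrams(text, n = 2)      # shingled n-grams

   # The counting helpers share the same vectorised interface.
   count_words(text)

   # chunk_text() splits longer texts into documents of equal word count.
   chunk_text(text, chunk_size = 5)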

Remark: This package is maintained by the Debian R Packages Maintainers at
   https://salsa.debian.org/r-pkg-team/r-cran-tokenizers

