--- Begin Message ---
- To: Debian Bug Tracking System <submit@bugs.debian.org>
- Subject: kytea: bad package description
- From: Justin B Rye <justin.byam.rye@gmail.com>
- Date: Tue, 17 Sep 2013 16:52:09 +0100
- Message-id: <20130917155209.GA21687@xibalba.demon.co.uk>
Package: kytea
Version: 0.4.6+dfsg-2
Severity: wishlist
Tags: patch
This package has a terrible package description. (I'm still only
calling this a wishlist bug, though.)
> Package: kytea
[...]
> Description: morphological analysis system with pointwise predictors
Not bad, but the implementation details aren't worth putting in the
synopsis. What it does need to establish in the synopsis is that this
is for morphological analysis in the linguistic sense, rather than the
biological or marketing senses - see:
http://en.wikipedia.org/wiki/Morphological_analysis
The easy way to do that is to insert the word "text".
> KyTea is morphological analysis system based on pointwise predictors.
          ^
Missing article: KyTea is *a* morphological analysis system.
Implementation details are okay here, but talking about "pointwise
predictors" is excruciatingly esoteric. "Pointwise classifier" is
already specialist jargon; and at a stretch you can talk about
pointwise classifier-based analyses making "pointwise predictions"
(though it's rather weird to talk about morphological judgements as
"predictions"). But you'll notice if you google the expression that
almost the entire top page of results is versions of this package
description! The home page has some clearer explanations that avoid
the phrase, and I recommend using some of that text.
> It separetes sentences into words, tagging and predict pronunciations.
          ^                          -------            ^
Spelling: "separAtes".
"Tagging" on its own can be unclear; I would suggest "tagging parts of
speech".
"Predict" should be "predicting", or preferably (as on the home page)
"estimating". It's only a prediction if the word has never been
spoken before!
> The pronunciation of KyTea is same as cutie.
                               ^
Missing article, and awkward phrasing - you could simply say when you
first mention KyTea that it's "pronounced 'cutie'". In fact, you
could start by explaining the name - it would at least serve to
distract slightly from the fact that somebody has made a rather bad
prediction of how "kytea" would be pronounced. The mention of Kyoto
would also be useful for another reason: the description so far gives
no hint of the fact that KyTea is specifically designed to handle
(unromanised, unspaced) Japanese script. That may seem obvious to
you, but if somebody is looking for a tool to do POS-tagging on the
collected works of Shakespeare then it's the job of this package
description to let them know that KyTea isn't what they're after.
> .
> This package contains predictor and training tool.
Likewise for the other packages in the set:
> Package: libkytea0
[...]
> Description: library of KyTea
Standardise these synopses.
> KyTea is morphological analysis system based on pointwise predictors.
> It separetes sentences into words, tagging and predict pronunciations.
> The pronunciation of KyTea is same as cutie.
As above.
> .
> This package contains shared libraries of KyTea.
Missing article. I would also make it "for" rather than "of".
> Package: libkytea-dev
[...]
> Description: library of KyTea : development files
As above. Colons in English never have a preceding space, but I'd
recommend using a dash anyway.
> KyTea is morphological analysis system based on pointwise predictors.
> It separetes sentences into words, tagging and predict pronunciations.
> The pronunciation of KyTea is same as cutie.
> .
> This package contains development files of KyTea.
All as above.
My recommended replacement text:
| Package: libkytea0
[...]
| Description: text morphological analysis system - libraries
| The Kyoto Text Analysis toolkit (KyTea, pronounced "cutie") is a general
| morphological analysis system with a focus on Japanese, Chinese, and
| other languages requiring word or morpheme segmentation. It uses a
| pointwise classifier-based approach to split sentences into words,
| tagging parts of speech and estimating pronunciations.
| .
| This package contains the shared libraries for KyTea.
|
| Package: kytea
[...]
| Description: text morphological analysis system - binaries
| The Kyoto Text Analysis toolkit (KyTea, pronounced "cutie") is a general
| morphological analysis system with a focus on Japanese, Chinese, and
| other languages requiring word or morpheme segmentation. It uses a
| pointwise classifier-based approach to split sentences into words,
| tagging parts of speech and estimating pronunciations.
| .
| This package contains the predictor and training tool for KyTea.
|
| Package: libkytea-dev
[...]
| Description: text morphological analysis system - development libraries
| The Kyoto Text Analysis toolkit (KyTea, pronounced "cutie") is a general
| morphological analysis system with a focus on Japanese, Chinese, and
| other languages requiring word or morpheme segmentation. It uses a
| pointwise classifier-based approach to split sentences into words,
| tagging parts of speech and estimating pronunciations.
| .
| This package contains the development files for KyTea.
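The wording points above can be checked mechanically. What follows is only a minimal Python sketch: the function names and the two checks (a space before a colon, and a missing word "text") are assumptions made for illustration, not part of any existing Debian tool.

```python
import re

# The three synopses recommended in this report, keyed by package name.
SYNOPSES = {
    "libkytea0": "text morphological analysis system - libraries",
    "kytea": "text morphological analysis system - binaries",
    "libkytea-dev": "text morphological analysis system - development libraries",
}


def synopsis_problems(synopsis):
    """Return the wording problems found in one synopsis line.

    Only two illustrative checks are made (both hypothetical):
    a space before a colon, and the absence of the word "text".
    """
    problems = []
    if re.search(r"\s:", synopsis):
        problems.append("space before colon")
    if "text" not in synopsis.split():
        problems.append('no "text" to mark the linguistic sense')
    return problems


def shared_prefix(synopses):
    """Return the part before ' - ' if all synopses share it, else None."""
    heads = {s.partition(" - ")[0] for s in synopses}
    return heads.pop() if len(heads) == 1 else None


if __name__ == "__main__":
    for package, synopsis in SYNOPSES.items():
        print(package, synopsis_problems(synopsis))
    print(shared_prefix(SYNOPSES.values()))
```

Run against the old synopsis "library of KyTea : development files", synopsis_problems() reports both problems, and shared_prefix() returns None for the three original synopses because they have no common stem.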
-- System Information:
Debian Release: jessie/sid
APT prefers testing
APT policy: (990, 'testing'), (50, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 3.11-trunk-686-pae (SMP w/1 CPU core)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages kytea depends on:
ii libc6 2.17-92+b1
ii libgcc1 1:4.8.1-2
ii libkytea0 0.4.6+dfsg-2
ii libstdc++6 4.8.1-2
kytea recommends no packages.
kytea suggests no packages.
-- no debconf information
--
JBR with qualifications in linguistics, experience as a Debian
sysadmin, and probably no clue about this particular package
diff -ru kytea-0.4.6+dfsg.pristine/debian/control kytea-0.4.6+dfsg/debian/control
--- kytea-0.4.6+dfsg.pristine/debian/control 2013-09-15 22:32:06.000000000 +0100
+++ kytea-0.4.6+dfsg/debian/control 2013-09-17 13:18:08.946414588 +0100
@@ -10,30 +10,36 @@
 Section: libs
 Architecture: any
 Depends: ${shlibs:Depends}, ${misc:Depends}
-Description: library of KyTea
- KyTea is morphological analysis system based on pointwise predictors.
- It separetes sentences into words, tagging and predict pronunciations.
- The pronunciation of KyTea is same as cutie.
+Description: text morphological analysis system - libraries
+ The Kyoto Text Analysis toolkit (KyTea, pronounced "cutie") is a general
+ morphological analysis system with a focus on Japanese, Chinese, and
+ other languages requiring word or morpheme segmentation. It uses a
+ pointwise classifier-based approach to split sentences into words,
+ tagging parts of speech and estimating pronunciations.
  .
- This package contains shared libraries of KyTea.
+ This package contains the shared libraries for KyTea.
 
 Package: kytea
 Architecture: any
 Depends: ${shlibs:Depends}, ${misc:Depends}
-Description: morphological analysis system with pointwise predictors
- KyTea is morphological analysis system based on pointwise predictors.
- It separetes sentences into words, tagging and predict pronunciations.
- The pronunciation of KyTea is same as cutie.
+Description: text morphological analysis system - binaries
+ The Kyoto Text Analysis toolkit (KyTea, pronounced "cutie") is a general
+ morphological analysis system with a focus on Japanese, Chinese, and
+ other languages requiring word or morpheme segmentation. It uses a
+ pointwise classifier-based approach to split sentences into words,
+ tagging parts of speech and estimating pronunciations.
  .
- This package contains predictor and training tool.
+ This package contains the predictor and training tool for KyTea.
 
 Package: libkytea-dev
 Section: libdevel
 Architecture: any
 Depends: libkytea0 (= ${binary:Version}), ${misc:Depends}
-Description: library of KyTea : development files
- KyTea is morphological analysis system based on pointwise predictors.
- It separetes sentences into words, tagging and predict pronunciations.
- The pronunciation of KyTea is same as cutie.
+Description: text morphological analysis system - development libraries
+ The Kyoto Text Analysis toolkit (KyTea, pronounced "cutie") is a general
+ morphological analysis system with a focus on Japanese, Chinese, and
+ other languages requiring word or morpheme segmentation. It uses a
+ pointwise classifier-based approach to split sentences into words,
+ tagging parts of speech and estimating pronunciations.
  .
- This package contains development files of KyTea.
+ This package contains the development files for KyTea.
--- End Message ---