Re: [RFR] English debconf templates uim-ajax-ime and uim-social-ime

To: debian-l10n-english@lists.debian.org
Cc: debian@vdr.jp
Subject: Re: [RFR] English debconf templates uim-ajax-ime and uim-social-ime
From: Justin B Rye <jbr@edlug.org.uk>
Date: Mon, 20 Jun 2011 10:53:08 +0100
Message-id: <20110620095308.GA27498@xibalba.demon.co.uk>
Mail-followup-to: debian-l10n-english@lists.debian.org, debian@vdr.jp
In-reply-to: <20110620031721.GA23174@lilith.infoblue.home>
References: <20110614044352.GA24554@lilith.infoblue.home> <20110614083600.GA28848@xibalba.demon.co.uk> <20110614125522.GA2139@xibalba.demon.co.uk> <20110620031721.GA23174@lilith.infoblue.home>

debian@vdr.jp wrote:
> Upstream says "uim" is not "UIM".  So, I did s/UIM/uim/g.

But note that they use "Uim" at the beginning of a sentence.

>> +# but what does the "Pinyin [...] Hangul [...]" bit actually mean?
> 
> Do you mean "Why do Pinyin and Hangul have explanation such as Chinese/Korean
> input method? Anthy, SKK, Canna and T-Code/TUT-Code are just proper noun"?
> Do I have to add explanation for all input methods?
> Or remove Pinyin and Hangul explanations?

Go back a step.  You also need to explain that:
 (a) the list is a list of input methods;
 (b) "Pinyin (Chinese input method)" is short for "which is an input
     method for Chinese" (and likewise for Korean); and
 (c) the default "explanation" for items that don't have one is
     "(which is an input method for Japanese)".
None of this is obvious to readers who don't already know it!  After
all, pinyin, hangul, and IPA weren't invented as "input methods";
they're scripts.  Yes, Pinyin is *also* the name of an input method,
but it's not self-evident that this is what you're saying.

Trying to make this clearer, I'd suggest something like:

  Uim is an input method module library which supports various scripts and can
  act as a front end for a range of input methods, including Anthy, Canna,
  SKK, or T-Code/TUT-Code (for Japanese), Pinyin (for Chinese), Byeoru (for
  Korean), and X-SAMPA (for the International Phonetic Alphabet). Most of its
  functions are implemented in Scheme, so it's very simple and flexible.

(I've also sorted Anthy, Canna, SKK, and T-Code/TUT-Code into
alphabetical order; mentioned uim-byeoru instead of uim-hangul; and
used the name X-SAMPA because that's the system being used as a
shorthand for IPA input.  Calling it "IPA" is like calling all the
Japanese ones "Kanji".)

>>  Package: uim-gtk2.0
> 	:
>> - This package contains a input method module on GTK+2.0.
>> + This package contains an IM-module for GTK+2.0.
>> +# WHAT IS THAT AND WHY SHOULD I INSTALL IT?
> 
> -----------------------------------
> This package contains an IM-module for GTK+2.0.
> You can use uim on GTK+2.0 applications.
> -----------------------------------

Or just
   This package contains an IM-module to support the use of uim on GTK+2.0
   applications.

>>  Package: uim-fep
> 	:
>> - This package is a FEP (Front End Processor) on curses.
>> + This package provides a curses front end for UIM.
>> +# WHAT IS THAT AND WHY SHOULD I INSTALL IT?
> 
> -----------------------------------
> This package provides a curses front end for uim.
> You can use uim on console.
> -----------------------------------

The problem here is that all the available vocabulary for explaining
this sort of thing is ambiguous (like "front end") or technical jargon
(like "curses") or both (like "console").

I'm assuming that when you say "console" you don't mean that it
specifically needs to be launched in a VT login of its own (like
startx); presumably it just provides a text user interface that can be
run within an x-terminal-emulator.  But it's still hard to visualise
how it works.  For instance, if I want to use this front end to type
in IPA transcriptions over an SSH connection, where should I install
uim-fep?

Perhaps:
   This package contains a curses Front End Processor to support the use of
   uim in a text terminal.

(Is this in fact an IM-module?)

>>  Package: uim-anthy
> 	:
>> - This package contains Anthy plugin for uim.
>> + This package contains an Anthy plugin for UIM.
>> +# WHAT IS THAT AND WHY SHOULD I INSTALL IT?
> 
> -----------------------------------
> This package contains an Anthy plugin for uim.
> You can use Japanese input method Anthy on uim.
> -----------------------------------

Since it "Depends: anthy", which has its own package description, I
wouldn't expect it to give all the details about working via hiragana.

Perhaps:
   This package contains a plugin for uim to support the use of the Japanese
   input method Anthy.
Then likewise for
   input method Canna.
   input method SKK.
   input method PRIME.

(There's no Debian binary package called skk, but it should be clear
enough.)

>>  Package: uim-m17nlib
> 	:
>> - This package contains m17nlib plugin for uim.
>> + This package contains an m17nlib plugin for UIM.
>> +# WHAT IS THAT AND WHY SHOULD I INSTALL IT?
> 
> -----------------------------------
> This package contains an m17nlib plugin for uim.
> You can input text supported by m17nlib on uim.
> -----------------------------------

The tricky part with this one is spotting that the crucial dependency
is libm17n-0 - once I've found that (and counted the letters in
"ultilingualizatio") things get much clearer.

Perhaps:
   This package contains a plugin for uim to support the use of the
   general-purpose input method M17n (for "Multilingualization").

(I've called it "M17n" because the library's official name is "m17n",
not "m17nlib"; because other packages such as scim-m17n use the name
"M17n" to refer to the input method; and because I'm assuming all
names of input methods get capitalised.)
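
The letter-counting behind the "m17n" abbreviation can be checked
mechanically.  This is only an illustrative sketch; the function name
"numeronym" is made up for the example and is not part of m17n or uim.

```python
def numeronym(word):
    """Abbreviate a word as first letter + number of inner letters + last letter."""
    if len(word) < 3:
        return word  # too short to abbreviate
    return f"{word[0]}{len(word) - 2}{word[-1]}"

print(numeronym("multilingualization"))  # m17n
```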

>>  Package: uim-byeoru
> 	:
>> + This package provides UIM support for byeoru hangul.
>> +# "byeoru" means "inkstone", which is no help
>> +# WHAT IS THAT AND WHY SHOULD I INSTALL IT?
> 
> -----------------------------------
> This package provides uim support for byeoru hangul.
> You can use Korean input method byeoru on uim.
> -----------------------------------

There are no dependencies to chase for further information, but
googling tells me that "the Byeoru Hangul input suite covers most of
the major input methods such as 2-beol and 3-beol variants".  So what
*is* the difference between this and uim-hangul?  And what exactly is
an input *suite*?

Aha - a FreeBSD port in 2005 mentions that
# The 'byeoru' module (developed by Jae-hyeon Park) is far more
# superior to default hangul2/hangul3 module. And it was merged to 0.5
# version of uim, and will serve as the default hangul input module in
# the future.
Is that future now the present?  If so, perhaps it should be:
   This package contains a plugin for uim to support the use of the Byeoru input
   module for hangul (uim's default IM for Korean).

>>  Package: uim-hangul
> 	:
>> + This package provides UIM support for multiple Hangul input styles: 2-beol,
>> + 3-beol, and Romaja.
>> +# whatever that is; at least it's obvious who'd use it
> 
> 2-beol, 3-beol and Romaja are Hangul input styles,
> but I do not know what their's details are.
> I think it is enough description...

It's not clear what an "input style" is, or why three of them would be
grouped together as a single "input module".  Besides, uim-byeoru
apparently supports exactly the same "styles" (in one "suite"), and is
now apparently preferred.  Perhaps it should just be:
   This package contains a plugin for uim to support the use of the old Hangul
   input module for Korean.

>>  Package: uim-latin
> 	:
>> - This package contains Latin and Germanic languages input style for uim.
>> + This package provides UIM support for languages written in Latin scripts.
>> +# composing diacritics?  Does it cover Icelandic?  Rumanian?  Czech?
> 
> I think uim-latin supports composing diacritics
> and covers Icelandic from uim-latin's configulation.
> But it does not cover Rumanian and Czech from it, I think.

My suspicion was that this stuff about Latin and Germanic languages
was the usual languages-versus-scripts confusion, and that uim-latin
supports more or less every Latin-derived character with a Unicode
code point.  Looking at /usr/share/uim/latin.scm that seems to be true.

> There is no document for uim-latin (ELatin).
> 
> http://code.google.com/p/uim/wiki/WhatsUIM

It might at least be useful to mention that it *is* "ELatin", and that
the E stands for Emacs!

I gather that it provides composing rules for characters unique to
Icelandic, Hungarian, Vietnamese... it really seems to cover the lot.
So my first rough guess seems to have been accurate - though maybe we
could expand on it:

   This package contains a plugin for uim to support the use of the (Emacs)
   Latin input method, which provides composing sequences for accented and
   otherwise modified Roman-alphabet letters.

>>  Package: uim-pinyin
> 	:
>> - This package contains Pinyin input method(Simplified Chinese, Traditional
>> - Chinese and Unicode) for uim.
>> + This package provides UIM support for pinyin input (for Simplified or
>> + Traditional Chinese).
>> +# the Unicode reference seems like a level error
> 
> uim-pinyin has three input methods: py (Simplified Chinese),
> pyunihan (Unicode) and pinyin-big5 (Traditional Chinese).

Now I'm even more confused.  How is Unicode an input method, unless
maybe you've got a keyboard with 100,000 keys?  Don't all of these
work by *inputting* pinyin, and *outputting* Chinese characters in one 
of three encodings?
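
If that reading is right, there is one kind of input and three output
encodings.  A minimal Python sketch of the idea follows.  It assumes
(as a guess, not something confirmed from the uim sources) that py
targets GB2312, pinyin-big5 targets Big5, and pyunihan targets Unicode;
the one-entry lookup table is a hypothetical stand-in for a real pinyin
dictionary.

```python
# Hypothetical one-entry pinyin table; a real input method offers many candidates.
CANDIDATES = {"zhong": "中"}

# Assumed encodings for the three modules (a guess, not taken from uim itself).
ENCODINGS = {"py": "gb2312", "pinyin-big5": "big5", "pyunihan": "utf-8"}

def convert(pinyin, module):
    """Look up one pinyin syllable and encode the result for the given module."""
    return CANDIDATES[pinyin].encode(ENCODINGS[module])

for module in ENCODINGS:
    # Same keystrokes every time; only the bytes produced differ.
    print(module, convert("zhong", module).hex())
```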

I can't suggest a revised version yet.

> -----------------------------------
> This package provides uim support for pinyin input: py (Simplified Chinese),
> pyunihan (Unicode) and pinyin-big5 (Traditional Chinese).
> -----------------------------------
> 
>>  Package: uim-tcode
> 	:
>> + This package provides UIM support for T-Code/TUT-Code/Try-Code input (for
>> + Japanese).
>> +# for kanji, though how it works is a mystery...
> 
> Here is the unofficial T-Code information page in English.
> http://www.ki.nu/~makoto/tcode/

So it maps each pair of alphanumeric characters to a common kanji
character, and you use it by... memorising the whole table?  Ouch.
Does anyone know why it's called by all these names?  Can we use
"T-Code" as a cover term for all of them?

> -----------------------------------
> This package provides uim support for T-Code/TUT-Code/Try-Code input.
> T-Code/TUT-Code/Try-Code is a kind of direct input method of Japanese
> with two strokes - see http://openlab.jp/tcode/ (Japanese).
> -----------------------------------

Perhaps:
  This package provides uim support for T-Code/TUT-Code/Try-Code input, a
  Japanese input method mapping pairs of alphanumeric codes to individual
  kanji - see http://openlab.jp/tcode/ (in Japanese).

(I don't object to the references being in Japanese as long as they're
labelled as such.)
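
On that description, the pair-to-kanji mapping amounts to a fixed table
lookup with no separate conversion step.  A minimal Python sketch of
that reading follows; the key pairs and the kanji assigned to them are
invented placeholders, not real T-Code assignments.

```python
# Placeholder table: these pairings are made up for illustration only.
TABLE = {"jf": "日", "kd": "本", "ls": "語"}

def decode(keystrokes):
    """Turn a string of keystrokes into text, two strokes per character."""
    if len(keystrokes) % 2:
        raise ValueError("two-stroke input needs an even number of keystrokes")
    pairs = (keystrokes[i:i + 2] for i in range(0, len(keystrokes), 2))
    # Pairs missing from the table are passed through unchanged.
    return "".join(TABLE.get(pair, pair) for pair in pairs)

print(decode("jfkdls"))  # 日本語
```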

>>  Package: uim-viqr
> 	:
>> - This package contains Vietnamese Quoted-Readable input style for uim.
>> + This package provides UIM support for Vietnamese Quoted-Readable input.
>> +# whatever that is; at least it's obvious who'd use it
> 
> -----------------------------------
> This package provides uim support for VIQR (Vietnamese Quoted-Readable)
> input. VIQR is a mnemonic encoding of Vietnamese characters into US ASCII
> for use on 7-bit systems - see RFC1456.
> -----------------------------------

Oh!  So it's more like X-SAMPA (or the exact opposite of ELatin).  And
this description is good just as it is (or am I just getting tired?).
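
To make "mnemonic encoding into US ASCII" concrete, here is a rough
Python sketch of decoding VIQR back into Unicode.  It covers only the
basic marks (breve, circumflex, horn, the five tone marks, and dd) and
leaves out the escaping rules in RFC 1456, so it is an illustration
under those assumptions rather than a conforming implementation.  The
function name is made up for the example.

```python
import unicodedata

# VIQR modifier -> (base letters it may follow, Unicode combining character)
MODIFIERS = {
    "(": ("aA", "\u0306"),      # breve
    "^": ("aAeEoO", "\u0302"),  # circumflex
    "+": ("oOuU", "\u031b"),    # horn
}
# VIQR tone mark -> Unicode combining character
TONES = {"'": "\u0301", "`": "\u0300", "?": "\u0309", "~": "\u0303", ".": "\u0323"}
VOWELS = set("aeiouyAEIOUY")

def viqr_to_unicode(text):
    """Decode a subset of VIQR; escapes are not handled in this sketch."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ("dd", "DD"):
            out.append("đ" if pair == "dd" else "Đ")
            i += 2
            continue
        ch = text[i]
        out.append(ch)
        i += 1
        if ch in VOWELS:
            # At most one modifier, then at most one tone mark.
            if i < len(text) and text[i] in MODIFIERS and ch in MODIFIERS[text[i]][0]:
                out.append(MODIFIERS[text[i]][1])
                i += 1
            if i < len(text) and text[i] in TONES:
                out.append(TONES[text[i]])
                i += 1
    # Compose base letters and combining marks into precomposed characters.
    return unicodedata.normalize("NFC", "".join(out))

print(viqr_to_unicode("Vie^.t Nam"))  # Việt Nam
```

Note that because escapes are ignored, a full stop directly after a
vowel would be misread as a tone mark here.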

>>  Package: uim-ipa-x-sampa
> 	:
>> - This package contains International Phonetic Alphabet (X-SAMPA) input style
>> - for uim.
>> + This package provides UIM support for International Phonetic Alphabet input
>> + using X-SAMPA.
>> +# well, everybody knows what that is, right?
> 
> ----------------------------------
> This package provides uim support for IPA (International Phonetic Alphabet)
> input using X-SAMPA (Extended Speech Assessment Methods Phonetic Alphabet)
> see http://www.phon.ucl.ac.uk/home/sampa/x-sampa.htm
> -----------------------------------

The expansion of X-SAMPA isn't very useful (it isn't obvious that it's
talking about an "Extended version of the SAMPA encoding", and anyone
who cares can look at ucl.ac.uk); more important is the fact that it's
seven-bit clean.  I would suggest:

  This package provides uim support for the International Phonetic Alphabet,
  using the 7-bit extended-SAMPA system - see
  http://www.phon.ucl.ac.uk/home/sampa/x-sampa.htm

>>  Package: uim-yahoo-jp
> 	:
>> - This package contains Yahoo-JP (Web API) input style for uim.
>> - See: http://developer.yahoo.co.jp/webapi/jlp/jim/v1/conversion.html
>> + This package provides UIM support for Japanese input via the Yahoo-JP web
>> + API - see http://developer.yahoo.co.jp/webapi/jlp/jim/v1/conversion.html
>> +# seeing that page won't help if I can't read Japanese
>> +# I see no "https" here - doesn't this need a warning too?
>> +# it's completely unclear what a web Input Method Environment is
> 
> There is no English page for Yahoo-JP (Web API).
> It does not support "https", so I add warning.
> 
> -----------------------------------
> This package provides uim support for Japanese input via the Yahoo-JP web
> API - see http://developer.yahoo.co.jp/webapi/jlp/jim/v1/conversion.html
> Note that all requests to the Yahoo-JP server go over the Internet
> unencrypted.

Okay - phew!  Thanks for going through all that.
-- 
JBR	with qualifications in linguistics, experience as a Debian
	sysadmin, and probably no clue about this particular package
