Re: I/O for different encodings

To: debian-i18n@lists.debian.org
Cc: "Itai Zukerman" <zukerman@math-hat.com>, debian-devel@debian.or.jp
Subject: Re: I/O for different encodings
From: Taketoshi Sano <sano@debian.org>
Date: 10 Nov 2000 10:35:06 +0900
Message-id: <y5azoj8txrp.fsf@kgh12351.nifty.ne.jp>
In-reply-to: <E13tzkw-0000yK-00@rakefet> (Shaul Karl's message of "Fri, 10 Nov 2000 00:01:10 +0200")
References: <877l6d2jml.fsf@matt.w80.math-hat.com> <E13tzkw-0000yK-00@rakefet>

Hi.

In <E13tzkw-0000yK-00@rakefet>, on Fri, 10 Nov 2000 00:01:10 +0200,
 Shaul Karl <shaulka@bezeqint.net> wrote:

> > I'm working on a piece of software that will parse textual data (a
> > list of words), conduct some statistical analyses, and spit out more
> > textual data.  I'd like to support multiple languages, maybe even
> > multibyte encodings.  Can someone please point me towards some
> > resources, in particular how to handle text input and output in a
> > language-independent way?  As you can probably guess, I'm new to i18n.

> Not sure but I believe that everything is in the process of convergence to 
> Unicode (UTF8). Therefore, if I would have written such a program I would make 
> it to use this encoding.
> As for resources, there is a Unicode HOWTO on the LDP and many other resources 
> on the net.

I think support for UTF-8 is a minimum (essential) requirement for
i18n, especially where multibyte encodings are concerned.  Some
software claims Unicode support but does not handle multibyte
encodings correctly.  (Such "Unicode-supporting" software cannot
handle some languages, including Japanese.)
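
To illustrate the kind of failure I mean, here is a minimal sketch in
Python (my choice of language for illustration only).  The helper names
truncate_bytes and truncate_chars are hypothetical.  It shows that
counting or cutting text by bytes breaks a multibyte character, while
doing the same on decoded characters does not.

```python
def truncate_bytes(data: bytes, limit: int) -> bytes:
    """Naive truncation that assumes one byte per character."""
    return data[:limit]


def truncate_chars(data: bytes, limit: int, encoding: str = "utf-8") -> bytes:
    """Decode first, cut on character boundaries, then encode again."""
    return data.decode(encoding)[:limit].encode(encoding)


word = "日本語"                  # three characters
encoded = word.encode("utf-8")   # each one takes three bytes in UTF-8

char_count = len(word)
byte_count = len(encoded)

# Cutting after 4 bytes leaves the second character incomplete,
# so the result is no longer valid UTF-8.
naive = truncate_bytes(encoded, 4)
try:
    naive.decode("utf-8")
    naive_ok = True
except UnicodeDecodeError:
    naive_ok = False

# Cutting after 2 characters keeps the text valid.
safe = truncate_chars(encoded, 2)
```

Any statistics on a word list (lengths, prefixes, and so on) would have
the same problem if they were computed on raw bytes.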

If you can add support for more encodings, that is better than
supporting Unicode only.  But text data can be run through a
conversion filter (such as iconv(1) or tcs(1)), so supporting only
Unicode (or UCS-4, which is better) may be a workable compromise.
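
The same idea can be sketched inside the program itself: decode at the
input boundary, work on Unicode text internally, and encode at the
output boundary.  The following is only a rough sketch in Python under
my own assumptions: the input is one word per line, and the function
names read_words and write_counts are made up for this example.

```python
from collections import Counter


def read_words(raw: bytes, encoding: str) -> list:
    """Decode the input once; assume one word per line."""
    return [w for w in raw.decode(encoding).splitlines() if w]


def write_counts(counts: Counter, encoding: str) -> bytes:
    """Format word counts as tab-separated lines in the output encoding."""
    lines = [f"{word}\t{n}" for word, n in counts.most_common()]
    return ("\n".join(lines) + "\n").encode(encoding)


# Sample input in EUC-JP, as it might arrive from a legacy source.
raw = "日本語\n言語\n日本語\n".encode("euc_jp")

counts = Counter(read_words(raw, "euc_jp"))   # analysis on Unicode text
out = write_counts(counts, "utf-8")           # output converted to UTF-8
```

With this layout the analysis code never needs to know which encoding
the data came in, which is much the same effect as putting iconv(1) in
front of and behind a Unicode-only program.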

You may also want to read the recent discussion about i18n of groff
on this list in the web archive.  I think it has something useful
for you.

Regards.
-- 
  Taketoshi Sano: <sano@debian.org>,<sano@debian.or.jp>,<kgh12351@nifty.ne.jp>
