Re: [Groff] Re: groff: radical re-implementation

To: debian-i18n@lists.debian.org, groff@ffii.org
Subject: Re: [Groff] Re: groff: radical re-implementation
From: Tomohiro KUBOTA <tkubota@riken.go.jp>
Date: Tue, 24 Oct 2000 14:06:18 +0900
Message-id: <87lmvekeut.wl@surfchem0.riken.go.jp>
In-reply-to: In your message of "Mon, 23 Oct 2000 09:42:18 +0200 (CEST)" <20001023.094218.59476979.wl@gnu.org>
References: <14832.31111.196218.91839F@surfchem0.riken.go.jp> <20001021.104651.85678878.wl@gnu.org> <87itqkwbet.wl@surfchem0.riken.go.jp> <20001023.094218.59476979.wl@gnu.org>

Hi,

At Mon, 23 Oct 2000 09:42:18 +0200 (CEST),
Werner LEMBERG <wl@gnu.org> wrote:

> >  - hard-coded converter from Latin1, EBCDIC, and UTF-8 to UTF-8
> >  - locale-sensible converter from any encodings supported by OS to UTF-8
> >    (note: UTF-8 has to be supported by iconv(3) )
> 
> May I suggest that you temporarily implement a hack so that you can
> use it with the Japanese patch of groff?  I don't know how long it
> will take until the necessary changes for gtroff have been
> implemented.

What do you think about the merit of a preprocessor for the current
Groff, which doesn't recognize UTF-8 input?

I think the preprocessor can help make Groff locale-sensitive.
However, the groff wrapper or troff will need some mechanism to
receive a report on the locale from the preprocessor.

The algorithm will be: if the groff wrapper is invoked with -Ttty,
check the locale and use
 - -Tlatin1 for Latin-1 languages
 - -Tnippon for Japanese
 - -Tascii8 for other languages
(IMO, we should not override the user's explicit specification of
-Tlatin1, -Tascii, -Tnippon, and so on.)
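
The selection rule above can be sketched roughly as follows.  This is
only an illustration in Python, not code from groff or from the
preprocessor: the function names are hypothetical, and it assumes that
the locale is read from LC_ALL, LC_CTYPE, and LANG (in that order),
that "Japanese" means a locale whose language code is "ja", and that
"Latin-1 languages" can be recognized by an ISO-8859-1 codeset in the
locale name.

```python
def locale_name(env):
    """Return the effective LC_CTYPE locale name from an environment mapping."""
    # Assumed precedence: LC_ALL, then LC_CTYPE, then LANG.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return value
    return "C"


def choose_device(requested, env):
    """Return the output device to use; only a 'tty' request is rewritten."""
    if requested != "tty":
        # Never override an explicit choice such as latin1, ascii, or nippon.
        return requested
    name = locale_name(env)
    # Split e.g. "ja_JP.eucJP@modifier" into language and codeset.
    base = name.split("@", 1)[0]
    lang_part, _, codeset = base.partition(".")
    language = lang_part.split("_", 1)[0].lower()
    codeset = codeset.replace("-", "").replace("_", "").lower()
    if language == "ja":
        return "nippon"
    if codeset in ("iso88591", "latin1"):  # assumed test for Latin-1 locales
        return "latin1"
    return "ascii8"
```

For example, choose_device("tty", {"LANG": "ja_JP.eucJP"}) gives
"nippon", while choose_device("latin1", {"LANG": "ja_JP.eucJP"}) keeps
"latin1" because the user asked for it explicitly.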

BTW, do you plan to release Groff with the Japanese patch and my
preprocessor as a makeshift until Groff with UTF-8 input becomes
available?  (I thought so since you seem to be interested in my
preprocessor working with Japanese-patched Groff. :-)

> > BTW, besides TTY output, HTML will need postprocess from glyph to 
> > character like 'grotty' in tty mode, since HTML is a text file.
> 
> Yes and no.  HTML output also supports entities with the &...;
> directive.

Either (UTF-8 or &...;) will be OK.
Each has its own merits and demerits.
With &...;, the HTML output will be plain ASCII text, and ASCII is the
most portable character set/encoding in the world.  However, reading
HTML source with &...; will be hard if most of the text consists of
non-ASCII characters, such as Japanese, Russian, and Greek.
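
To make the trade-off concrete, here is a small Python sketch that
produces both forms from the same text.  It is not how any groff HTML
driver works; the function names are hypothetical, and it uses numeric
character references (&#NNNN;) as one possible form of &...;.

```python
import html


def to_ascii_entities(text):
    """Return pure-ASCII HTML text, with non-ASCII characters as &#NNNN;."""
    # Escape &, < and > first so that the result is valid HTML text.
    escaped = html.escape(text, quote=False)
    # 'xmlcharrefreplace' turns each non-ASCII character into a decimal
    # numeric character reference.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


def to_utf8_html(text):
    """Return the same HTML text as UTF-8 bytes, keeping characters as-is."""
    return html.escape(text, quote=False).encode("utf-8")


sample = "日本語"
print(to_ascii_entities(sample))  # &#26085;&#26412;&#35486;
print(to_utf8_html(sample).decode("utf-8"))  # 日本語
```

The first form survives any ASCII-only channel but is unreadable as
source; the second is readable but requires the reader's tools to
handle UTF-8.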

---
Tomohiro KUBOTA <kubota@debian.org>
http://surfchem0.riken.go.jp/~kubota/
