
Re: [Groff] Re: groff: radical re-implementation


At Wed, 18 Oct 2000 16:54:53 +0200 (CEST),
Werner LEMBERG <wl@gnu.org> wrote:

> > However, thank you for explaining glyph.  I also understand you 
> > understand problems on Japanese character codes well. 
> Well, I'm the author of the CJK package for LaTeX, I've written a
> ttf2pk converter, and I'm a member of the FreeType core team :-)

Great!  Another great thing is that you explain basic things to me
so kindly even though you are so busy with so many heavy projects...

> I know these problems too well -- AFAIK, in JIS X 0208 these two
> variants are unified.  Do you know details about the new JIS X 0213
> standard?

I heard that JIS X 0208 unifies the two 'high or tall' variants because
of a policy that JIS X 0208 is a code for characters, not for glyphs.
However, whether a set of glyphs is considered variants of one character
or different characters can change over the years.

JIS X 0213 has many characters which are also included in JIS X 0212.
It is very confusing.  I guess the JIS people think JIS X 0212 is obsolete.
I think it will take time for JIS X 0213 to become popular.  I don't even
know whether JIS X 0213 will become popular or, like JIS X 0212, will not.
A few characters in JIS X 0213 are not included in the present Unicode.

> > Then the 'font definition file' will be irrationally large.  I think
> > at least CJK ideographics and Korean precompiled Hanguls have to be
> > treated in different way.  (Ukai has already pointed this problem.
> > jgroff uses 'wchar<EUCcode>' for glyph names of Japanese
> > characters.)
> Right.  I think I've answered this problem in my last mail (regarding
> a `glyphclass' directive in font description files).

Then all of these glyphs have to have the same width.  Fortunately,
CJK ideograms and Korean Hanguls have fixed-width glyphs.

> I don't think so.  For example, we could restrict to MIME character
> set tags which are standardized.

I think a table is needed anyway.  This is because Groff should accept
implementation-dependent locale names so that it works well together
with other software on the OS.  I think we can design Groff so that
it accepts both implementation-dependent locale names and MIME character
set tags, since it is unlikely that a name used as an
implementation-dependent locale name denotes a different encoding when
used as a MIME character set tag, or vice versa.
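
To illustrate what I mean by such a table, here is a minimal sketch.  It
is written in Python only for brevity, and nothing in it is existing Groff
code: the function name canonical_charset, the ALIASES table, and its few
entries are all hypothetical.  It assumes that the MIME tag is used as the
canonical name and that locale names look like
"language_TERRITORY.codeset@modifier".

```python
# Hypothetical alias table: lower-cased spellings -> canonical MIME tag.
# The entries are only examples, not a complete or authoritative list.
ALIASES = {
    "iso-8859-1": "ISO-8859-1",   # MIME tag
    "iso8859-1": "ISO-8859-1",    # codeset spelling seen in locale names
    "euc-jp": "EUC-JP",           # MIME tag
    "eucjp": "EUC-JP",            # codeset spelling seen in locale names
    "ujis": "EUC-JP",             # another implementation-dependent name
    "shift_jis": "Shift_JIS",     # MIME tag
    "sjis": "Shift_JIS",          # implementation-dependent name
}


def canonical_charset(name):
    """Map a MIME tag or a locale name to a canonical MIME tag.

    Returns None when the name is unknown or when a locale name
    carries no codeset part (for example "ja_JP").
    """
    # First try the whole name, so that plain MIME tags always work.
    found = ALIASES.get(name.lower())
    if found is not None:
        return found
    # Otherwise treat it as a locale name and look at the codeset part.
    if "." not in name:
        return None
    codeset = name.split(".", 1)[1].split("@", 1)[0]
    return ALIASES.get(codeset.lower())
```

With one table like this, "EUC-JP", "ja_JP.eucJP", and "ja_JP.ujis" all
resolve to the same encoding, so both kinds of names can be accepted.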

> Indeed, the default behaviour should be that the preprocessor adds a
>   .mso tmac.<charset>
> line or something similar to the document, but there must be a
> possibility to override it manually.

Good idea.  Thus the '#ifdef I18N' part can be restricted to pre/post-

Overriding?  Well, the current Groff has the '-a' option.  I think
this can be used for this purpose.  (Anyway, we can provide
substitutions only for non-letter symbols such as the soft hyphen, '(C)',
circles, squares, and so on.  I think this is sufficient.)

We have to think about how to unite my idea for the design of the
preprocessor with Ukai's idea of '.encoding "encoding-name"' in roff source.

 - It is the preprocessor that handles ".encoding".
 - The priority is:
   * --input-encodings wins.
   * .encoding is next.
   * Otherwise, fall back to the default (locale-dependent on an i18n OS
     and latin-1 on a non-i18n OS).
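
The priority order above could be sketched as follows.  This is only an
illustration in Python under my own assumptions, not a proposal for the
actual code: the names choose_encoding and find_encoding_request are
hypothetical, the exact syntax of the ".encoding" request is not decided
yet (here the quotes are optional), and a non-i18n OS is modelled simply
by passing None as the locale's encoding.

```python
import re

# Assumed syntax: .encoding "name" at the start of a line (quotes optional).
ENCODING_REQUEST = re.compile(r'^\.\s*encoding\s+"?([^"\s]+)"?\s*$')


def find_encoding_request(source_lines):
    """Return the name given by the first .encoding request, or None."""
    for line in source_lines:
        match = ENCODING_REQUEST.match(line)
        if match:
            return match.group(1)
    return None


def choose_encoding(option_encoding, source_lines, locale_encoding):
    """Pick the input encoding for the preprocessor.

    option_encoding: value given on the command line, or None.
    source_lines:    lines of the roff source.
    locale_encoding: encoding of the current locale, or None on an
                     OS without i18n support.
    """
    # 1. The command-line option wins.
    if option_encoding:
        return option_encoding
    # 2. Next comes an .encoding request in the roff source.
    requested = find_encoding_request(source_lines)
    if requested:
        return requested
    # 3. Otherwise the default: the locale's encoding, or latin-1.
    if locale_encoding:
        return locale_encoding
    return "latin-1"
```

For example, with no command-line option and a source file containing
'.encoding "EUC-JP"', the result is "EUC-JP" regardless of the locale.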

Tomohiro KUBOTA <kubota@debian.org>
