Re: [Groff] Re: groff: radical re-implementation
> JIS X 0213 has many characters which are also included in JIS X 0212.
> It is very confusing. I guess JIS people think JIS X 0212 is
Basically, only Emacs supports JIS X 0212...
> A few characters in JIS X 0213 are not included in the present
AFAIK, this will be fixed (or have already been fixed?) in the next
Unicode release where more than 10000 CJK characters are added (in the
> > > Then the 'font definition file' will be irrationally large.
> > Right. I think I've answered this problem in my last mail (regarding
> > a `glyphclass' directive in font description files).
> Then all of these glyphs have to have the same width.
Why? It is intended that `glyphclass' can occur multiple times. Say,
one glyphclass command for full-width glyphs, and another one for
> > Indeed, the default behaviour should be that the preprocessor adds
> > a
> > .mso tmac.<charset>
> > line or something similar to the document, but there must be a
> > possibility to override it manually.
> Good idea. Thus '#ifdef I18N' part can be restricted in pre/post-
> Overriding? Well, the current Groff has '-a' option. I think this
> can be used for this purpose. (Anyway, we can provide substitution
> only for non-letter symbols like soft-hyphen, '(C)', circles,
> squares, and so on. I think this is sufficient.)
The `-a' option is almost useless today IMHO. It will show a tty
approximation of the typeset output:
  groff -a -man -Tdvi troff.man | less
It is *not* the right way to quickly select an ASCII device. To
override the macros used for the output character set we need a new
Using `-a' is comparable to dvi2tty or similar converters.
> We have to think about uniting my idea on design of preprocessor
> and Ukai's idea of '.encoding "encoding-name"' in roff source.
> - it is the preprocessor that handles the ".encoding" .
> - priority is that
> * --input-encodings wins.
> * .encoding is next.
> * then falling into the default (locale-sensible for i18n OS
> and latin-1 for non-i18n OS).
Exactly. Compare this to the Emacs model of `local variables'.
Note that such an encoding request has to determine the encoding *and*
character set of a document (similar to Emacs).
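
To make the priority order quoted above concrete, here is a minimal
sketch in Python of how a preprocessor might pick the input encoding.
The function name and its parameters are hypothetical; nothing like
this exists in groff today. The "locale-sensible" default is
approximated with Python's locale module.

```python
import locale


def resolve_input_encoding(cli_encoding=None, doc_encoding=None, i18n_os=True):
    """Pick the input encoding using the priority order discussed above.

    cli_encoding: value of a hypothetical --input-encodings option, or None.
    doc_encoding: value declared inside the document, or None.
    i18n_os:      whether the system provides usable locale information.
    """
    # 1. An explicit command line option always wins.
    if cli_encoding:
        return cli_encoding
    # 2. Next comes the declaration found in the document itself.
    if doc_encoding:
        return doc_encoding
    # 3. Otherwise fall back to the locale, or to latin-1 without i18n support.
    if i18n_os:
        return locale.getpreferredencoding(False) or "latin-1"
    return "latin-1"
```

For example, resolve_input_encoding("EUC-JP", "ISO-2022-JP") yields
"EUC-JP", since the command line overrides the document.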
I suggest that we don't use `.encoding' but
  -*- charset-encoding: xxx -*-
in the first comment block (much as Emacs does). troff itself shouldn't
have to deal with encoding issues at all; it should just accept UTF-8.
If really necessary, we can add two additional commands to select
encoding and character set separately:
  -*- charset: ...; encoding: ... -*-
  .\" -*- charset: JIS-X-0208; encoding: EUC -*-
  .\" -*- charset: JIS-X-0208; encoding: ISO-2022 -*-
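
As a rough illustration of how a preprocessor could pick up such a
line, here is a small Python sketch. It assumes the variables sit in
the leading run of roff comment lines and are written as key: value
pairs separated by semicolons, as in the examples above. The function
name read_coding_line is made up for this sketch, and only the two
common comment prefixes are recognized.

```python
import re

# Matches the text between the two -*- markers.
_LOCAL_VARS = re.compile(r"-\*-(.*?)-\*-")

# roff comment prefixes considered here: .\" and '\"
_COMMENT_PREFIXES = ('.\\"', "'\\\"")


def read_coding_line(lines):
    """Return the variables from a -*- ... -*- line in the first comment block.

    Scanning stops at the first line that is not a comment, so a
    declaration further down in the document is ignored.
    """
    for line in lines:
        if not line.startswith(_COMMENT_PREFIXES):
            break
        match = _LOCAL_VARS.search(line)
        if match:
            variables = {}
            for item in match.group(1).split(";"):
                key, sep, value = item.partition(":")
                if sep:  # skip fragments without a colon
                    variables[key.strip().lower()] = value.strip()
            return variables
    return {}
```

The result is a plain dictionary, so the caller can look for either
the combined charset-encoding key or the separate charset and
encoding keys and then convert the input to UTF-8 before troff ever
sees it.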