Re: [Groff] Re: groff: radical re-implementation
> A small part of the source code of Groff related to I/O has to be
> encoding-sensible. This part can handle Latin-1, EBCDIC, and UTF-8.
> Additionally, if Groff is compiled within internationalized OS
> (i.e. setlocale(), iconv(), nl_langinfo(), and so on are available),
> the part also has locale-sensible file I/O.
As mentioned in another mail, I would rather have this moved to the
iconv preprocessor as well.  Latin-1, EBCDIC, and UTF-8 should be
hard-coded if glibc isn't available.
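
To illustrate why hard-coding Latin-1 is cheap, here is a minimal
sketch of a Latin-1 to UTF-8 converter that needs no iconv().  The
function name `latin1_to_utf8' and its interface are assumptions made
for this example; they are not part of groff.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of groff: convert `n' Latin-1 bytes
 * from `in' to UTF-8 in `out'.  The caller must provide room for
 * 2 * n bytes.  Returns the number of bytes written.
 */
size_t latin1_to_utf8(const unsigned char *in, size_t n,
                      unsigned char *out)
{
    size_t i, len = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        unsigned char c = in[i];

        if (c < 0x80) {
            /* ASCII range: one byte, unchanged. */
            out[len++] = c;
        } else {
            /* U+0080..U+00FF: two bytes, 110xxxxx 10xxxxxx. */
            out[len++] = (unsigned char)(0xC0 | (c >> 6));
            out[len++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return len;
}
```

This works because every Latin-1 byte has the same value as its
Unicode code point, so no mapping table is needed.  EBCDIC would need
a 256-entry table in front of the same loop.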
> Other almost part of the source code is written to handle UCS-4.
Maybe we are just arguing about terms, but UCS-4 only makes sense for
character codes, whereas groff handles glyphs...  So UCS-4 should
*not* be used internally in troff.
> For example: typedef long ucs4_t; and substitute char with ucs4_t.
Yes, a `long' type is the right one; this gives at least 31 bits for
glyphs and negative values for the `special' characters in troff.
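
A minimal sketch of that split, assuming non-negative values are glyph
numbers and negative values are special entities.  The names
`glyph_code', `make_special', and `special_index' are hypothetical and
do not come from the groff sources.

```c
#include <assert.h>

/* Hypothetical type name: the value is a glyph code, not a
   character code. */
typedef long glyph_code;

/* Largest glyph number that fits in 31 bits. */
#define GLYPH_MAX 0x7FFFFFFFL

/* Non-negative values are ordinary glyphs, negative ones are
   special entities. */
int is_special(glyph_code g)
{
    return g < 0;
}

/* Map the special index 0, 1, 2, ... to -1, -2, -3, ... */
glyph_code make_special(long n)
{
    assert(n >= 0 && n < GLYPH_MAX);
    return -(n + 1);
}

/* Inverse of make_special(). */
long special_index(glyph_code g)
{
    assert(is_special(g));
    return -(g + 1);
}
```

With this layout a single sign test tells the two classes apart, and
the full 31-bit range stays free for glyph numbers.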
> Thus, valid encoding names for --input-encoding and --output-encoding
> command options are:
> - latin1, ascii, utf8, (ascii8) [compiled without I18N]
> - latin1, ascii, utf8, (ascii8), and encodings supported by OS
> [compiled with I18N]
You are thinking only of the tty device, aren't you?  UTF-8 used as an
output encoding for a PS printer doesn't make any sense...  If
possible, please refer to the LaTeX model of input and output encodings
when you use these terms.