
Re: [Groff] Re: groff: radical re-implementation

> A small part of the source code of Groff related to I/O has to be
> encoding-sensitive.  This part can handle Latin-1, EBCDIC, and UTF-8.
> Additionally, if Groff is compiled on an internationalized OS
> (i.e. setlocale(), iconv(), nl_langinfo(), and so on are available),
> this part also has locale-sensitive file I/O.

As mentioned in another mail, I would prefer to have this moved to
the iconv preprocessor as well.  Latin-1, EBCDIC, and UTF-8 should be
hard-coded if glibc isn't available.
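
To illustrate what `hard-coded' could mean here, below is a minimal
sketch of Latin-1 and UTF-8 decoders that need neither iconv nor any
locale support.  The function names are hypothetical and not taken
from the groff sources, and the UTF-8 decoder assumes the modern
four-byte limit (code points up to U+10FFFF).  EBCDIC is left out
because it would need a code-page table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration; not part of groff. */

/* Latin-1 maps one-to-one onto the first 256 code points.
   Returns the code point, or -1 if no input is left. */
long decode_latin1(const unsigned char *s, size_t len, size_t *used)
{
    assert(s != NULL && used != NULL);
    if (len == 0) {
        *used = 0;
        return -1;
    }
    *used = 1;
    return (long)s[0];
}

/* Decode one UTF-8 sequence starting at s.
   Returns the code point and stores the byte count in *used,
   or returns -1 (with *used == 0) on malformed or truncated input. */
long decode_utf8(const unsigned char *s, size_t len, size_t *used)
{
    /* Smallest code point allowed for each sequence length. */
    static const long min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    long cp;
    size_t need, i;

    assert(s != NULL && used != NULL);
    *used = 0;
    if (len == 0)
        return -1;

    if (s[0] < 0x80)                { cp = s[0];        need = 0; }
    else if ((s[0] & 0xE0) == 0xC0) { cp = s[0] & 0x1F; need = 1; }
    else if ((s[0] & 0xF0) == 0xE0) { cp = s[0] & 0x0F; need = 2; }
    else if ((s[0] & 0xF8) == 0xF0) { cp = s[0] & 0x07; need = 3; }
    else
        return -1;                  /* stray continuation or invalid lead byte */

    if (len < need + 1)
        return -1;                  /* truncated sequence */

    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return -1;              /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    if (cp < min_cp[need])
        return -1;                  /* overlong encoding */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return -1;                  /* surrogate range */
    if (cp > 0x10FFFF)
        return -1;                  /* beyond the Unicode range */

    *used = need + 1;
    return cp;
}
```

A caller would loop over the input buffer, advancing by *used after
each successful call.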

> Almost all of the rest of the source code is written to handle UCS-4.

Maybe we are just arguing about terms, but UCS-4 makes sense only for
character codes, whereas groff handles glyphs...  So UCS-4 should *not*
be used internally in troff.

> For example: typedef long ucs4_t; and substitute char with ucs4_t.

Yes, a `long' type is the right one; this gives 31 bits for glyphs and
negative values for the `special' characters in troff.
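
A minimal sketch of that split, assuming only that `long' has at least
32 bits (which the C standard guarantees).  The type and function
names are hypothetical, chosen for illustration, and do not come from
the groff sources:

```c
#include <assert.h>

/* Hypothetical glyph index type: non-negative values are ordinary
   glyph numbers (31 bits), negative values mark special characters. */
typedef long glyph_t;

#define GLYPH_MAX 0x7FFFFFFFL   /* largest ordinary glyph number */

/* True if g denotes a special character rather than a glyph. */
int glyph_is_special(glyph_t g)
{
    return g < 0;
}

/* Build the value for the index-th special character (index >= 1). */
glyph_t make_special(long index)
{
    assert(index >= 1 && index <= GLYPH_MAX);
    return -index;
}

/* Recover the index of a special character. */
long special_index(glyph_t g)
{
    assert(glyph_is_special(g));
    return -g;
}
```

With this layout a single sign test separates the two ranges, and no
ordinary glyph number can collide with a special one.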

> Thus, valid encoding names for the --input-encoding and
> --output-encoding command-line options are:
>   - latin1, ascii, utf8, (ascii8)  [compiled without I18N]
>   - latin1, ascii, utf8, (ascii8), and encodings supported by OS
>     [compiled with I18N]

You are thinking only of the tty device, aren't you?  UTF-8 used as an
output encoding for a PS printer doesn't make any sense...  If
possible, please refer to the LaTeX model of input and output encodings
when you use these terms.
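
As a rough sketch of the two-stage model I mean: the input encoding
maps bytes in the source file to internal glyph names, and the output
encoding maps those names to slots in a particular device font.  All
names and slot numbers below are made up for illustration; they are
neither groff's nor LaTeX's actual tables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stage 1: input encoding.  Maps a byte of the input file to an
   internal glyph name (tiny Latin-1 excerpt, illustrative names). */
const char *input_latin1(unsigned char byte)
{
    switch (byte) {
    case 0x41: return "A";
    case 0xE9: return "eacute";
    default:   return NULL;     /* not covered by this excerpt */
    }
}

/* Stage 2: output encoding.  Maps a glyph name to a slot in one
   device font.  The slot numbers are hypothetical. */
struct slot {
    const char *glyph;
    int code;
};

static const struct slot demo_font[] = {
    { "A",      65  },
    { "eacute", 142 },
};

/* Returns the font slot for glyph, or -1 if the font lacks it. */
int output_slot(const char *glyph)
{
    size_t i;

    assert(glyph != NULL);
    for (i = 0; i < sizeof demo_font / sizeof demo_font[0]; i++)
        if (strcmp(demo_font[i].glyph, glyph) == 0)
            return demo_font[i].code;
    return -1;
}
```

The point is that the byte 0xE9 in the input and the slot number in
the font are unrelated; only the glyph name connects the two stages,
so an `output encoding' is a property of the device font, not of a
character set like UTF-8.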

