Re: [Groff] Re: groff: radical re-implementation
> As regards the line-breaking algorithm, I think we need some more cflags,
> at least for Japanese. That is,
>
> - lines must not be broken before the character
> - lines must not be broken after the character
>
> These seem to be implemented as PRE_KINSOKU and POST_KINSOKU in
> jgroff, but they are hard-coded there. I think this should be done in
> tmac.<lang>, so I think it's a good idea to have some mechanism to
> load language-specific tmac files.
Which mechanism do you suggest?
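
To make the two proposed flags concrete, here is a minimal sketch in Python (not groff code). The flag names, the tiny character table, and the greedy breaking strategy are all assumptions made for illustration; a real table would come from a language-specific file such as the tmac.<lang> suggested above.

```python
# Hypothetical flag names, modelled on the two rules quoted above.
NO_BREAK_BEFORE = 1  # a line must not be broken before this character
NO_BREAK_AFTER = 2   # a line must not be broken after this character

# Tiny illustrative table; a real one would be loaded per language.
CFLAGS = {
    "。": NO_BREAK_BEFORE,
    "、": NO_BREAK_BEFORE,
    "」": NO_BREAK_BEFORE,
    "「": NO_BREAK_AFTER,
}


def can_break(text, i):
    """Return True if a break is allowed between text[i-1] and text[i]."""
    if i <= 0 or i >= len(text):
        return False
    if CFLAGS.get(text[i], 0) & NO_BREAK_BEFORE:
        return False
    if CFLAGS.get(text[i - 1], 0) & NO_BREAK_AFTER:
        return False
    return True


def break_lines(text, width):
    """Greedily split unspaced text into lines of at most `width` characters."""
    lines = []
    start = 0
    while len(text) - start > width:
        cut = start + width
        # Back up until a permitted break point is found.
        while cut > start and not can_break(text, cut):
            cut -= 1
        if cut == start:
            cut = start + width  # no legal break point: force one
        lines.append(text[start:cut])
        start = cut
    lines.append(text[start:])
    return lines
```

With a width of 4, the opening bracket is pushed to the next line and the closing bracket and full stop are pulled back onto the preceding text, so no line starts or ends with a forbidden character.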
> BTW, what do you think about the code names for multibyte/wide
> characters, or for the glyph codes you mentioned? In jgroff, it seems
> wchar<EUCcode> is used.
I suggest that we follow the Adobe Glyph List (AGL):
http://partners.adobe.com/asn/developer/typeforum/unicodegn.html
This means that CJK glyph names would be uni<Unicode>, e.g. `uni635F'.
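
As a rough illustration of the naming scheme, the following Python sketch builds such a name from a character. The function name is hypothetical, and the sketch ignores the AGL's list of named glyphs (a glyph like `A' would keep its own name there); it only shows the `uni' form for BMP code points and the `u' form the AGL uses beyond the BMP.

```python
def agl_glyph_name(char):
    """Return an AGL-style algorithmic glyph name for a single character."""
    cp = ord(char)
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points have no glyph name")
    if cp <= 0xFFFF:
        # BMP: 'uni' followed by exactly four uppercase hex digits.
        return "uni%04X" % cp
    # Beyond the BMP: 'u' followed by five or six uppercase hex digits.
    return "u%X" % cp
```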
> jgroff provides a "fixedkanji" directive in the font description. But
> the code of the font description loader depends on the EUC<->KuTen
> mapping, which is not a good idea for i18n. I think it would be better
> to provide a "wcharset" directive which supports code ranges. However,
> code ranges couldn't be used with the EUC encoding or something like
> that, nor with Unicode, because we can't expect the character
> codes of a given language to be contiguous.
It doesn't matter. A range directive like `wcharset' (an ugly name,
BTW) just tells troff that glyphs in a given range have identical
metrics.
A better name is probably `glyphclass':

  glyphclass <sample character> <range begin> <range end>

Example:

  glyphclass uni4F34 uni4E00 uni9FFF

The sample glyph needs a real metrics entry.
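
A minimal Python sketch of how a font loader might treat such a line, assuming the syntax proposed above. The class, its method names, and the width value are hypothetical and only illustrate the lookup: a glyph without its own entry inherits the metrics of the sample glyph when its code falls inside the range.

```python
HEX_DIGITS = "0123456789ABCDEF"


def glyph_code(name):
    """Return the code point of a 'uniXXXX' glyph name, or None otherwise."""
    digits = name[3:]
    if name.startswith("uni") and len(digits) == 4 \
            and all(c in HEX_DIGITS for c in digits):
        return int(digits, 16)
    return None


class FontMetrics:
    def __init__(self):
        self.metrics = {}  # glyph name -> width (real entries)
        self.classes = []  # (first code, last code, sample glyph name)

    def add_metric(self, name, width):
        self.metrics[name] = width

    def add_glyphclass(self, line):
        """Parse 'glyphclass <sample> <range begin> <range end>'."""
        keyword, sample, first, last = line.split()
        if keyword != "glyphclass":
            raise ValueError("not a glyphclass line: %r" % line)
        lo, hi = glyph_code(first), glyph_code(last)
        if lo is None or hi is None or lo > hi:
            raise ValueError("bad range in: %r" % line)
        self.classes.append((lo, hi, sample))

    def width(self, name):
        """Look up a width, falling back to the sample glyph of a class."""
        if name in self.metrics:
            return self.metrics[name]
        code = glyph_code(name)
        if code is not None:
            for lo, hi, sample in self.classes:
                if lo <= code <= hi:
                    # Raises KeyError if the sample lacks a real entry.
                    return self.metrics[sample]
        raise KeyError(name)
```

Note that the range may well contain unassigned codes; the line only says that whatever glyphs do exist in it share one set of metrics.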
> Anyway, jgroff provides the new fonts "M" and "G", which are "Mincho"
> and "Gothic" respectively, for wide characters. What is the right way
> to add i18n support to groff's font descriptions?
Basically, there is nothing to do. The only addition needed is a way
to make the font description files smaller, and this is just the
proposed `wcharset' (or `glyphclass') command.
> > > . Command lines shall be able to override input encoding
> > > (--input-encoding).
> > Yes.
>
> How about creating a new request (.encoding "encoding-name") in roff
> source?
Good idea.
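
For illustration only, here is a Python sketch of how the two ways of selecting the input encoding could be combined. Neither the `.encoding' request nor the function below exists in groff; the request syntax follows the suggestion above, and the Latin-1 default is an assumption. The command-line value wins, as agreed earlier in the thread.

```python
import re

# Matches a hypothetical request such as:  .encoding "EUC-JP"
_ENCODING_REQUEST = re.compile(
    rb'^\.\s*encoding\s+"?([A-Za-z0-9._-]+)"?\s*$')


def resolve_encoding(source, cli_encoding=None, default="latin-1"):
    """Pick the input encoding for raw roff source given as bytes."""
    if cli_encoding:
        # The command line overrides anything found in the file.
        return cli_encoding
    for line in source.splitlines():
        match = _ENCODING_REQUEST.match(line)
        if match:
            return match.group(1).decode("ascii")
    return default
```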
> Could you explain (or point us to the source code in groff that shows)
> how to map glyph IDs to output codes, please?
As explained in a previous mail, a hard-coded mapping table from
glyph-names to Unicode output encoding is needed for tty devices.
For all other devices, nothing will change.
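
A small Python sketch of what such a table lookup might look like for a tty device. The table holds only three entries for illustration, the function names and the `?' fallback are assumptions, and none of this is taken from the groff sources; names of the form uniXXXX are decoded directly instead of being listed.

```python
HEX_DIGITS = "0123456789ABCDEF"

# Illustrative excerpt; a real table would cover every named glyph.
GLYPH_TO_UNICODE = {
    "A": 0x0041,
    "adieresis": 0x00E4,
    "Euro": 0x20AC,
}


def glyph_to_char(name):
    """Map a glyph name to a Unicode character, or None if unknown."""
    if name in GLYPH_TO_UNICODE:
        return chr(GLYPH_TO_UNICODE[name])
    digits = name[3:]
    if name.startswith("uni") and len(digits) == 4 \
            and all(c in HEX_DIGITS for c in digits):
        return chr(int(digits, 16))
    return None


def render_tty(glyph_names, fallback="?"):
    """Turn a sequence of glyph names into output text for a tty."""
    return "".join(glyph_to_char(name) or fallback for name in glyph_names)
```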
Werner