Re: [Groff] Re: groff: radical re-implementation

> As regards the line breaking algorithm, I think we need some more
> cflags, at least for Japanese.  That is,
>    - lines must not be broken before the character
>    - lines must not be broken after the character
> These seem to be implemented as PRE_KINSOKU and POST_KINSOKU in
> jgroff, but they are hardcoded.  I think this should be done in
> tmac.<lang>, so I think it's a good idea to have some mechanism to
> load language-specific tmac files.

Which mechanism do you suggest?
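
To make the two proposed flags concrete, here is a minimal sketch in
Python (not groff code).  The flag names, the CFLAGS table, and the
rule that a break is otherwise allowed between any two characters are
all assumptions made for illustration; in the scheme discussed above
the table would be filled from a tmac.<lang> file, not written in
source.

```python
# Hypothetical per-character flags; names are illustrative only.
NO_BREAK_BEFORE = 1  # a line must not be broken before this character
NO_BREAK_AFTER = 2   # a line must not be broken after this character

# Tiny sample table (assumption).  A real one would be loaded from a
# language-specific macro file.
CFLAGS = {
    "。": NO_BREAK_BEFORE,
    "、": NO_BREAK_BEFORE,
    "」": NO_BREAK_BEFORE,
    "）": NO_BREAK_BEFORE,
    "「": NO_BREAK_AFTER,
    "（": NO_BREAK_AFTER,
}


def break_allowed(text: str, i: int) -> bool:
    """Return True if a break may go between text[i-1] and text[i]."""
    if not 0 < i < len(text):
        return False
    if CFLAGS.get(text[i], 0) & NO_BREAK_BEFORE:
        return False
    if CFLAGS.get(text[i - 1], 0) & NO_BREAK_AFTER:
        return False
    return True
```

With this table, a break is refused right after an opening bracket
and right before a closing bracket or a full stop, and allowed
between two ordinary kanji.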

> BTW, what do you think about the code names for multibyte/wide
> characters, or the glyph codes you mentioned?  In jgroff, it seems
> to use wchar<EUCcode>.

I suggest that we follow the Adobe Glyph List (AGL):


This means that CJK glyph names would be uni<Unicode>, e.g. `uni635F'.
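
A small sketch in Python of this naming rule, covering only the
single-code-point `uni' form with four uppercase hex digits.  The
function names are made up for the example and are not part of groff.

```python
import re

# 'uni' followed by exactly four uppercase hexadecimal digits.
_UNI_NAME = re.compile(r"uni([0-9A-F]{4})")


def agl_uni_name(code_point: int) -> str:
    """Build a uniXXXX glyph name for one BMP code point."""
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("uniXXXX covers only U+0000..U+FFFF")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not characters")
    return f"uni{code_point:04X}"


def parse_uni_name(name: str) -> int:
    """Return the code point encoded in a uniXXXX glyph name."""
    match = _UNI_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a uniXXXX glyph name: {name!r}")
    return int(match.group(1), 16)
```

So U+635F gives the name `uni635F', and parsing that name gives the
code point back.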

> jgroff provides a "fixedkanji" directive in the font description.
> But the code of the font description loader depends on the
> EUC<->KuTen mapping, and that's not a good idea for i18n.  I think it
> would be better to provide a "wcharset" directive which supports
> code ranges.  However, code ranges couldn't be used with EUC
> encoding or something like that, nor for Unicode, because we can't
> expect the character codes for a given language to be contiguous.

It doesn't matter.  A range directive like `wcharset' (an ugly name,
BTW) just tells troff that glyphs in a given range have identical
metrics.

A better name is probably `glyphclass':

  glyphclass <sample character> <range begin> <range end>


  glyphclass uni4F34 uni4E00 uni9FFF

The sample glyph needs a real metrics entry.
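
A rough sketch in Python of how a font loader might use such a line.
Everything here is an assumption for illustration: the directive is
only a proposal, the metrics are reduced to a single width per glyph,
and the helper names are invented.

```python
import re
from typing import Dict, List, NamedTuple, Optional

_UNI_NAME = re.compile(r"uni([0-9A-F]{4})")


class GlyphClass(NamedTuple):
    sample: str  # glyph whose real metrics entry is shared
    first: int   # first code point of the range (inclusive)
    last: int    # last code point of the range (inclusive)


def _code(name: str) -> int:
    match = _UNI_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a uniXXXX glyph name: {name!r}")
    return int(match.group(1), 16)


def parse_glyphclass(line: str) -> GlyphClass:
    """Parse 'glyphclass <sample> <range begin> <range end>'."""
    fields = line.split()
    if len(fields) != 4 or fields[0] != "glyphclass":
        raise ValueError(f"malformed glyphclass line: {line!r}")
    first, last = _code(fields[2]), _code(fields[3])
    if first > last:
        raise ValueError("range begin is greater than range end")
    return GlyphClass(fields[1], first, last)


def lookup_width(name: str, metrics: Dict[str, int],
                 classes: List[GlyphClass]) -> Optional[int]:
    """Return a glyph's width: its own entry, else its class sample's."""
    if name in metrics:
        return metrics[name]
    match = _UNI_NAME.fullmatch(name)
    if match is None:
        return None
    code = int(match.group(1), 16)
    for cls in classes:
        if cls.first <= code <= cls.last:
            return metrics.get(cls.sample)
    return None
```

The font file then needs one real entry (for uni4F34 here) instead of
one entry per glyph in the range, which is the whole point of the
directive.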

> Anyway, jgroff provides new font "M" and "G", which are "Mincho" and
> "Gothic" respectively, for wide characters.  What is the right way
> to add i18n support in groff about font description?

Basically, there is nothing to do.  The only addition needed is a way
to make the font description files smaller, and this is just the
proposed `wcharset' (or `glyphclass') command.

> > >   . Command lines shall be able to override input encoding
> > >     (--input-encoding).
> > Yes.
> How about creating a new request (.encoding "encoding-name") in the
> roff source?

Good idea.

> Could you explain (or point us to the source code in groff) how
> glyph IDs are mapped to output codes, please?

As explained in a previous mail, a hard-coded mapping table from
glyph-names to Unicode output encoding is needed for tty devices.
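
As a rough illustration only (Python, not groff source), such a table
could be combined with the uniXXXX rule like this.  The three named
entries are a tiny excerpt of groff's special character names; the
function names and the `?' fallback are assumptions.

```python
import re
from typing import Iterable, Optional

# Small illustrative excerpt of a hard-coded name -> Unicode table.
NAMED_GLYPHS = {
    "em": 0x2014,  # em dash
    "bu": 0x2022,  # bullet
    "hy": 0x2010,  # hyphen
}

_UNI_NAME = re.compile(r"uni([0-9A-F]{4})")


def glyph_to_unicode(name: str) -> Optional[str]:
    """Map one glyph name to a Unicode character, or None if unknown."""
    if name in NAMED_GLYPHS:
        return chr(NAMED_GLYPHS[name])
    match = _UNI_NAME.fullmatch(name)
    if match is not None:
        return chr(int(match.group(1), 16))
    return None


def render_tty(names: Iterable[str], fallback: str = "?") -> str:
    """Turn a sequence of glyph names into text for a tty device."""
    return "".join(glyph_to_unicode(n) or fallback for n in names)
```

The result is an ordinary Unicode string, which the device would then
write out in the terminal's encoding (UTF-8, say).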

For all other devices, nothing will change.

