
Re: [Groff] Re: groff: radical re-implementation

> 2. Perhaps it is a good point of view to see troff (gtroff) as an
> engine which handles _glyphs_, not characters, in a given context of
> typographic style and layout. The current glyph is defined by the
> current point size, the current font, and the name of the
> "character" which is to be rendered, and troff necessarily takes
> account of the metric information associated with this glyph.

Exactly.  But the current terminology in gtroff is thoroughly
ambiguous, and I believe that we need a clear separation between
characters and glyphs.
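
To make the distinction concrete, here is a minimal sketch in Python.
It is not gtroff code: the `Glyph` class, the `WIDTHS` table, and its
numbers are hypothetical and only illustrate that a glyph is identified
by name, font, and point size, and that metrics belong to the glyph
rather than to the input character.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    """A glyph as described above: a name rendered in a font at a size."""
    name: str   # name of the "character" to be rendered, e.g. "a"
    font: str   # current font (hypothetical names "R" and "B" below)
    size: int   # current point size


# Hypothetical metric table: width in thousandths of the point size,
# keyed by (font, glyph name).  Real font description files differ.
WIDTHS = {
    ("R", "a"): 444,
    ("B", "a"): 500,
}


def glyph_width(glyph: Glyph) -> float:
    """Return the width of the glyph in points, scaled by its size."""
    return WIDTHS[(glyph.font, glyph.name)] * glyph.size / 1000


# The same input character yields different glyphs in different contexts.
roman_a = Glyph(name="a", font="R", size=10)
bold_a = Glyph(name="a", font="B", size=10)
```

Both objects come from the same input character "a", yet they compare
unequal and have different widths, which is exactly why the two notions
need separate names.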

> Logically, therefore, troff could be "neutral" about what the byte
> "a" stands for. From that point of view, a troff which makes no
> assumptions of this kind, and which consults external tables about
> the meaning of its input and about the characteristics of what
> output that input implies, purely for the purpose of correct
> formatting, is perhaps the pure ideal. And from that point of view,
> therefore, unifying the input conventions on the basis of a
> comprehensive encoding (such as UTF-8 or Unicode is intended to
> become) would be a great step towards attaining this neutrality.

I fully agree.  A single input character set (as universal as
possible) is the right thing, and everything else should be handled by
preprocessors (and a postprocessor for tty).
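
As a rough illustration of what such a preprocessor would do, here is
a sketch in Python.  The function name `to_utf8` and the choice of
UTF-8 as the single internal encoding are assumptions made for the
example; this is not an existing groff tool.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Convert input in a legacy encoding to UTF-8.

    Decoding maps the bytes to abstract characters; encoding writes
    those characters back out in the one encoding the formatter reads.
    """
    return raw.decode(source_encoding).encode("utf-8")


# The Latin-1 byte 0xE9 (e with acute accent) becomes two UTF-8 bytes.
latin1_input = b"caf\xe9"
utf8_output = to_utf8(latin1_input, "latin-1")
```

With this split, the formatter itself never needs to know which
encoding the user's file was written in.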

> Meanwhile, interested parties who have not yet studied it may find
> the "UTF-8 and Unicode FAQ for Unix/Linux" by Markus Kuhn well worth
> reading:
>   http://www.cl.cam.ac.uk/~mgk25/unicode.html

Yes, Markus is doing an excellent job.

> By the way, your comment that hyphenation, for instance, is not a
> "glyph question" is, I think, not wholly correct. Certainly,
> hyphenation _rules_ are not a glyph question: as well as being
> language-dependent, there may also be "house rules" about it; these
> come under "typographic style" as above. But the size of a hyphen
> and associated spacing are glyph issues, and these may interact with
> where a hyphenation occurs or whether it occurs at all, according to
> the rules.

I meant the algorithm for finding possible breakpoints, which must be
based on input characters.  The final decision about where a word is
broken is, of course, a glyph issue.
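
A small sketch of this two-stage view, in Python.  Everything in it is
hypothetical: the `BREAKPOINTS` table stands in for a real hyphenation
algorithm, and the widths in `GLYPH_WIDTH` are invented units.

```python
# Stage 1 (characters): possible breakpoints depend only on the input
# characters.  Hypothetical lookup table: hy-phen-a-tion.
BREAKPOINTS = {"hyphenation": [2, 6, 7]}

# Stage 2 (glyphs): invented widths in arbitrary units; any glyph not
# listed here is assumed to be 5 units wide.
GLYPH_WIDTH = {"i": 3, "t": 3, "-": 3}


def glyph_width(name: str) -> int:
    """Return the (made-up) width of a glyph."""
    return GLYPH_WIDTH.get(name, 5)


def choose_break(word: str, available: int):
    """Pick the last candidate breakpoint whose first part, plus the
    hyphen glyph, still fits into the available width.  Returns None
    if no candidate fits, i.e. the word is not hyphenated here."""
    best = None
    for pos in BREAKPOINTS.get(word, []):
        used = sum(glyph_width(c) for c in word[:pos]) + glyph_width("-")
        if used <= available:
            best = pos
    return best
```

The candidate list never changes with font or size; only the second
stage does, since it consults the width of the hyphen and of the glyphs
before it.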

