
Re: [Groff] Re: groff: radical re-implementation


At Wed, 25 Oct 2000 10:09:34 +0200 (CEST),
Werner LEMBERG <wl@gnu.org> wrote:

> Hmm.  What about the following temporary solution:
>   Run the preprocessor twice; the first time it is called directly by
>   the groff program, and it returns an error code which groff can use
>   to set the -T switch.  The preprocessor shouldn't do any processing
>   except checking the locale setting and testing the first line(s) of
>   the input file for an `encoding' directive.
>   The second time just do the normal pipeline.

I think such locale sensitivity can be achieved with a small amount
of code, even without any of the OS's locale-related functions such
as setlocale(3).  Something like this:

char *locale;
locale = getenv("LC_ALL");
if (locale == NULL) locale=getenv("LC_CTYPE");
if (locale == NULL) locale=getenv("LANG");

struct lang_and_device {
  char *language;
  char *device;
} lang_table[] = {
  {"da", "latin1"},  /* Danish */
  {"de", "latin1"},  /* German */
  {"en", "latin1"},  /* English */
  {"es", "latin1"},  /* Spanish */
  {"fi", "latin1"},  /* Finnish */
  {"fr", "latin1"},  /* French */
  {"ga", "latin1"},  /* Irish */
  {"is", "latin1"},  /* Icelandic */
  {"it", "latin1"},  /* Italian */
  {"ja", "nippon"},  /* Japanese */
  {"nl", "latin1"},  /* Dutch */
  {"no", "latin1"},  /* Norwegian */
  {"pt", "latin1"},  /* Portuguese */
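  {"sv", "latin1"},  /* Swedish */
  {NULL, NULL}       /* end-of-table marker for the loop below */
};

struct lang_and_device *lang;
char *device = "ascii8";  /* fallback when no language matches */
/* Skip the lookup if no locale variable is set, so that
   strncmp() never sees a NULL pointer. */
for (lang=lang_table; locale != NULL && lang->language != NULL; lang++) {
  if (!strncmp(locale, lang->language, 2)) device = lang->device;
}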

This code is meant for the groff wrapper.  The algorithm should be
used when the wrapper is invoked with the '-Ttty' command-line option.
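
To make the idea concrete, here is a minimal, self-contained sketch of
how the wrapper might apply the lookup only when '-Ttty' is requested.
The function names (device_for_locale, locale_from_env, resolve_device)
are hypothetical and are not part of groff.  The sketch also assumes
that an empty environment variable should be treated as unset, as POSIX
does for the locale variables; the fragment above does not do this.

```c
#include <stdlib.h>  /* getenv */
#include <string.h>  /* strcmp, strncmp */

/* Same table as in the message, with an explicit end marker. */
struct lang_and_device {
  const char *language;
  const char *device;
};

static const struct lang_and_device lang_table[] = {
  {"da", "latin1"}, {"de", "latin1"}, {"en", "latin1"}, {"es", "latin1"},
  {"fi", "latin1"}, {"fr", "latin1"}, {"ga", "latin1"}, {"is", "latin1"},
  {"it", "latin1"}, {"ja", "nippon"}, {"nl", "latin1"}, {"no", "latin1"},
  {"pt", "latin1"}, {"sv", "latin1"},
  {NULL, NULL}  /* end of table */
};

/* Hypothetical helper: map a locale name such as "ja_JP.eucJP" to a
   device name by comparing its first two characters with the table.
   Falls back to "ascii8" for NULL or unknown languages. */
const char *device_for_locale(const char *locale)
{
  const struct lang_and_device *lang;

  if (locale == NULL) return "ascii8";
  for (lang = lang_table; lang->language != NULL; lang++) {
    if (strncmp(locale, lang->language, 2) == 0) return lang->device;
  }
  return "ascii8";
}

/* Hypothetical helper: return the first non-empty value among LC_ALL,
   LC_CTYPE and LANG, or NULL if none is set.  Skipping empty values
   is an assumption of this sketch. */
const char *locale_from_env(void)
{
  static const char *const vars[] = {"LC_ALL", "LC_CTYPE", "LANG"};
  size_t i;

  for (i = 0; i < sizeof vars / sizeof vars[0]; i++) {
    const char *value = getenv(vars[i]);
    if (value != NULL && *value != '\0') return value;
  }
  return NULL;
}

/* Hypothetical helper: only the pseudo-device "tty" is rewritten;
   any other requested device is passed through unchanged. */
const char *resolve_device(const char *requested)
{
  if (strcmp(requested, "tty") == 0)
    return device_for_locale(locale_from_env());
  return requested;
}
```

With this arrangement the wrapper would call resolve_device() on the
argument of -T before building the pipeline, so an explicit choice
such as -Tps or -Tlatin1 is never overridden by the locale.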

 - The 'utf8' device is not covered by this algorithm, since the
   current troff doesn't support UTF-8 input.  Because the current
   groff cannot handle input and output encodings separately,
   it is appropriate to assume that both are the same.  However,
   the current implementation of the 'utf8' device takes ISO-8859-1
   input and produces UTF-8 output.
 - 'tty' is mapped to the 'latin1', 'ascii8', and 'nippon' devices.
   Obviously this algorithm is useless for the official groff,
   which supports neither 'ascii8' nor 'nippon'.  Thus, if the
   official groff is to be locale-sensitive, it should at least
   support the 'ascii8' device.
 - The 'ascii' device cannot be used for this purpose, because even
   English-speaking people sometimes use the ISO-8859-1 charset.
   (Yes, people all over the world often read English manpages,
   so I think it is evil to use ISO-8859-1 characters in
   English manpages.  However, we are talking about the user's
   locale environment, not about the language of the roff source.)

What do you think about this code?

Tomohiro KUBOTA <kubota@debian.org>
