
Translation of init scripts



This weekend I sat over gettext thinking of translated
initscript messages. This is a proposal for some practical
solution, so I'm mailing -devel-announce, but please followup on
-devel [and send me CC, since I don't plan to subscribe to
-devel again]. Also, please leave -user-portuguese out of
followups; I'm doing this CC to there because I promised to :-)

The gettext package comes with a standalone "gettext" util, which
can act as an "echo" replacement. With this, and a
"debian-init-scripts.mo" file in the correct locale dir, we can
get translated messages. I wonder why we didn't start this
before; with debian-jp etc, and the number of -de|-es|-fr
developers, I think it's an important feature.


  HOW
*******

There are four layers to the problem.

1: the selection of the language

2: the display of the messages, retaining compatibility with
existing versions

3: the packages and dependencies involved

4: the translations themselves


Layer 1: selecting the language
===============================

The language is selected by the LANG, LC_ALL or LC_MESSAGES
environment variable. Most sites set one of these globally in
/etc/environment or /etc/profile, with the option of individual
users setting something else in their shell load scripts (for
example on sites that offer telnet/ssh accounts).

However, I'm not very sure about sourcing /etc/environment in
the init scripts. It may set some other potentially nasty
things. Also, you may want to keep the system's default language
as C (English) but have the boot in your language, since you're
the only one who's going to see it. So I think perhaps the best
solution would be either of:

a: /etc/default/language, to be sourced by the "parent" init
   scripts (/etc/init.d/rc, for example - all scripts started by
   init, not by some other scripts)
b: hack init itself to set one of the variables, reading the
   value from inittab or even a boot parameter (/proc/cmdline)

These have the added value that the "init" language isn't
necessarily the same as the "system default" language; so if
your site is hosted in Hungary but has telnet/ssh accounts
available, you may set the "init" language to hu and put
"LANG=C; export LANG" in /etc/environment.

Then if you do `/etc/init.d/something reload' the messages will
come out in the "system default" language. Magic :-)
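
To make options a and b concrete, here is a rough sketch of how a
"parent" script could pick the init language. Everything in it is
an assumption for illustration: the helper name pick_init_lang and
the boot parameter "initlang=" are made up, and the defaults file
is assumed to contain nothing but a LANG=xx line.

```shell
# Sketch only: pick_init_lang and the "initlang=" boot parameter are
# hypothetical names used for illustration.
# Usage: pick_init_lang "<kernel command line>" <defaults file>
pick_init_lang() {
    cmdline=$1
    defaults=$2
    lang=

    # Option b: look for initlang=XX among the boot parameters.
    for word in $cmdline; do
        case $word in
            initlang=*) lang=${word#initlang=} ;;
        esac
    done

    # Option a: otherwise read LANG from the defaults file, if present.
    # The file is sourced in a subshell so nothing else it sets leaks out.
    if [ -z "$lang" ] && [ -f "$defaults" ]; then
        lang=$(LANG=; . "$defaults" >/dev/null 2>&1; printf '%s' "$LANG")
    fi

    # With neither source available, stay with untranslated messages.
    printf '%s\n' "${lang:-C}"
}
```

A parent script such as /etc/init.d/rc could then do something like

  LANG=$(pick_init_lang "$(cat /proc/cmdline)" /etc/default/language)
  export LANG

before starting the other scripts.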


Layer 2: displaying the messages
================================

All "echo" commands in the scripts must somehow call something
internationalized - "gettext -s" or something debian-specific.
See layers 3 and 4 for the pros and cons of each.

The way I did it for my weekend test was editing all scripts
which produced output in /etc/init.d and /etc/rc.boot; before
they output anything, add:

SETLANG=/etc/init.d/setlang.sh
if [ -f "$SETLANG" ]; then
  . "$SETLANG"
else
  echo='echo -e'
fi

This isn't the best way IMHO, as I said in layer 1, but it's
good enough for now. Of course I later realized I could source
$SETLANG only in the "parent" scripts.

The setlang.sh script currently just does:

. /etc/default/language
export TEXTDOMAIN=debian-init-scripts
echo="/usr/bin/gettext -s -e"

And then I replaced all "echo" with "$echo". Wow, it works.

The rationale for /etc/init.d/setlang.sh is that it could come
with the "gettext" package; so, if "gettext" isn't installed,
things proceed as usual.
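
The same fallback could also live inside setlang.sh, so that a
setlang.sh left behind without the gettext binary does no harm.
A minimal sketch, assuming a made-up helper name (choose_echo); the
gettext path and the text domain are the ones used above:

```shell
# Hypothetical helper: print the command an init script should use
# as $echo. $1 is the path to the standalone gettext program.
choose_echo() {
    if [ -x "$1" ]; then
        # gettext is installed: -s makes it behave like echo,
        # -e enables backslash escapes.
        printf '%s\n' "$1 -s -e"
    else
        # No gettext: plain echo, messages stay in English.
        printf '%s\n' 'echo -e'
    fi
}

TEXTDOMAIN=debian-init-scripts
export TEXTDOMAIN
echo=$(choose_echo /usr/bin/gettext)
```

An init script then keeps writing
  $echo "Starting foo bar server: foo"
whether or not gettext is present.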

A second approach is to split the standalone "gettext" program
into its own package and promote it to base/required. Then
change all "echo" to "gettext -s", which isn't much more trouble
than changing them to "$echo".

A third approach is to come up with a "printf"-like
internationalized program and package it in base/required. The
"whys" are in layer 4.


Layer 3: packages and dependencies
==================================

This is what I was just talking about; there should be some
minimal backward compatibility. Each of the three approaches
above has its own problems and solutions.

First approach ($echo) already solves its own problem. If
gettext is not installed, it is not used. This is somewhat
similar to the way Debian is used to doing things, IMHO.

Second and third approaches put the necessary packages in "base"
section and flag them "required" priority. But the priority only
works if dpkg knows about the package! So perhaps, for a while,
updated packages should "pre-depend" on the new package.

And just because I'm sure someone will ask: as long as the
program exists, it will work just fine if it doesn't find
translated messages. It just displays them in English.


Layer 4: translating the messages
=================================

This poses a problem. Of course our language teams and even
users would quickly volunteer, but we should do our best to make
the job easier - after all, there must be a LOT of scripts to
translate. If people must translate all of

"Starting foo bar server: foo"
"Stopping foo bar server: foo"
"Starting time waster: twt"
"Stopping time waster: twt"

the work is a lot bigger than if we just ask them to translate

"Starting"
"Stopping"
"foo bar server"
"time waster"

right?

Most packages (according to current policy) have something like:
"Starting my_description: my_daemon" (later) "."
"Stopping my_description: my_daemon" (later) "."
"Restarting my_description: my_daemon" (later) "."
"Reloading my_description configuration files" (later) "."

For gettext to have more hints about it, we would have to spell
the first part of these (the period stays as is) like:
"Starting" "my_description:" "my_daemon"
"Stopping" "my_description:" "my_daemon"
"Restarting" "my_description:" "my_daemon"
"Reloading" "my_description" "configuration files"

Old packages have instead:

"Starting my_description... " (later) "done."

It would have to become:

"Starting" "my_description... " (later) "done."

but of course if someone's going to edit the script, it might
just as well be updated to the new policy.

But this imposes major hacks on us, because of two situations:

Situation 1: the colon, ellipsis etc
------------------------------------

As you may see in the example above, the translator would need
to translate both "my_description" and "my_description:". This
sucks. Not because it's more work - actually it will be a matter
of copy-paste most times - but because it's a waste of space,
and hackers don't like things like this.

The kludge I made looks like:

"Starting" "my_description" "\b: my_daemon"

but then I either have to change "$echo" for "$echo -e" or set
it to `echo="echo -e"' without gettext and
`echo="gettext -s -e"' with it. Also, of course, it's ugly.

Situation 2: the "Reloading configuration files" messages
---------------------------------------------------------

When you do "/etc/init.d/something reload", the new policy says it
should print something like

"Reloading my_description configuration files" (later) "."

which would become

"Reloading" "my_description" "configuration files"

for gettext. The problem is that this fixes a sentence order; in
many languages, there's no way to make this sentence make sense.
Portuguese (my language) is one of them.

In C we would do something like

printf("Reloading %s configuration files", description);

and then the translation could do anything to that, like

msgid "Reloading %s configuration files"
msgstr "Reloading configuration files for %s"

or in Portuguese

msgstr "Recarregando arquivos de configuração de %s"

and this works.

Solutions
---------

Situation 1 could be solved by an "echo" replacement that didn't
insert spaces between its parameters; so the string would become

"Starting" " " "my_description" ": my_daemon"

but this does nothing for situation 2.
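
Such a replacement is small enough to write as a shell function. A
sketch under assumptions: the name gtcat is made up, the text
domain is taken from $TEXTDOMAIN, and the pieces are printed
untranslated when gettext is not installed.

```shell
# Hypothetical helper, not part of gettext: translate each argument
# on its own and print the results with nothing in between.
gtcat() {
    for piece in "$@"; do
        if command -v gettext >/dev/null 2>&1; then
            # Command substitution strips trailing newlines only,
            # so leading and trailing spaces in a piece survive.
            piece=$(gettext -- "$piece")
        fi
        printf '%s' "$piece"
    done
    printf '\n'
}
```

With this,
  gtcat "Starting" " " "my_description" ": my_daemon"
only needs "Starting" and "my_description" in the catalog; pieces
with no translation pass through unchanged.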

A printf-like echo replacement would work for both, but would be
a pain to implement and yet another thing for maintainers to
learn/worry about.
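
For what it's worth, a crude version can be sketched in shell by
looking up the format string with gettext and handing it to printf.
This assumes a printf (builtin or utility) is available when the
init scripts run; the name gtprintf is made up.

```shell
# Hypothetical helper: translate the format string, then let printf
# put the arguments wherever the translated format has its %s.
gtprintf() {
    fmt=$1
    shift
    if command -v gettext >/dev/null 2>&1; then
        fmt=$(gettext -- "$fmt")
    fi
    # The format comes from our own catalog, so it is trusted here.
    printf "$fmt\n" "$@"
}
```

So
  gtprintf "Reloading %s configuration files" "$description"
lets a translation move the %s to the end of the sentence. The
limit of this sketch is messages with more than one argument:
many shell printf implementations do not accept positional
"%1$s"-style conversions, so the order of the arguments stays
fixed.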

So I don't have an answer :-)

The potfiles
------------

Either way, I suggest that all the messages be kept in a single
Debian-wide potfile, "debian-init-scripts.pot", whose translations
compile to "debian-init-scripts.mo".

The compiled, translated files ("*.mo") should, IMVHO, go in the
"locales-XX" packages ("locales-de", "locales-es" etc), since
they already exist. This saves space for users - they only get
the language(s) they need.
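
As an illustration of what a locales-XX package would have to do at
build time, here is a sketch that compiles one translated .po file
with msgfmt. The helper name install_catalog is made up, and
/usr/share/locale is only an assumed default for the locale root.

```shell
# Hypothetical helper: compile one language's .po file into the place
# where gettext looks up the debian-init-scripts domain.
# $1 = .po file, $2 = language code, $3 = locale root (optional)
install_catalog() {
    po=$1
    lang=$2
    root=${3:-/usr/share/locale}   # assumed default location
    dir=$root/$lang/LC_MESSAGES
    mkdir -p "$dir" &&
    msgfmt -o "$dir/debian-init-scripts.mo" "$po"
}
```

For Portuguese this would be called as
  install_catalog pt.po pt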

I can send you the .pot file with all the few messages I already
have (collected from my home, 150M-installation Debian scripts).
But I'd rather leave it for later when we come up with a
decision.

Oh... and finally, I think each language-team and
locales-XX-maintainer should come to a decision regarding the
maintenance of the translations. Some will want to send diffs to
the maintainer, others will prefer CVS, perhaps even use the
lists. Whatever, I don't care.

[]s,
                                               |alo
                                               +----
--
      I am Lalo of deB-org. You will be freed.
                 Resistance is futile.

http://www.webcom.com/lalo      mailto:lalo@webcom.com
                 pgp key in the web page

Debian GNU/Linux       --        http://www.debian.org

