
RFC: shorten translatable messages, eliminate fuzzy, and more



Salam,

Self-introduction: I seem to be currently supervising Russian-related
i18n of dpkg, dispatching translations in doc/ru/ and updating ru.po
from time to time.


I would like to discuss a proposal to split the gettextable messages
in dpkg into smaller chunks, probably a line or a couple of lines in
length.

My primary example of why this should be done is in main/main.c, line
58 (there is a translatable string that is more than two screenfuls
long).

Strings of such a "length" are extremely hard to update in .po-files.
There is practically no way to see what exactly changed: you would have
to keep an exact copy of the dpkg sources that were used for the last
update of the .po-file, and look at the diffs between the two source
trees.  That is not very convenient, especially given that
.po-translators tend to change.  During the last update I simply had to
compare the string with its translation line by line, and update
accordingly.  When the "string" occupies three screens, and its
translation occupies another three, the whole arrangement sucks big
time.
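
To make the proposal a bit more concrete, here is a minimal sketch of
what a split-up usage message could look like.  The option texts are
abbreviated stand-ins, not the real contents of main/main.c, and
usage_lines and format_usage are names I made up for illustration.
N_() is the usual no-op marker (xgettext can be told to extract it with
--keyword=N_), and _() falls back to the identity when ENABLE_NLS is
not defined, so the sketch builds without libintl.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef ENABLE_NLS
# include <libintl.h>
# define _(s) gettext(s)
#else
# define _(s) (s)   /* stand-in so the sketch builds without libintl */
#endif
#define N_(s) (s)   /* marks a string for extraction, translates nothing */

/* Hypothetical, abbreviated usage text: one msgid per line, so a
 * changed line invalidates only its own translation. */
static const char *const usage_lines[] = {
  N_("Usage:\n"),
  N_("  dpkg -i|--install <.deb file name> ...\n"),
  N_("  dpkg -r|--remove <package name> ...\n"),
  N_("  dpkg --help\n"),
};

/* Translate each line separately and append it to buf.
 * Returns the number of lines written, or -1 if buf is too small. */
int format_usage(char *buf, size_t size)
{
  size_t used = 0;
  size_t n = sizeof(usage_lines) / sizeof(usage_lines[0]);
  size_t i;

  assert(buf != NULL);
  if (size == 0)
    return -1;
  buf[0] = '\0';
  for (i = 0; i < n; i++) {
    const char *line = _(usage_lines[i]);
    size_t len = strlen(line);

    if (len >= size - used)   /* no room for the line plus the NUL */
      return -1;
    memcpy(buf + used, line, len + 1);
    used += len;
  }
  return (int)n;
}
```

With something like this, changing one option line would leave the
translator with a single short untranslated msgid instead of three
screens to re-read.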


I understand that this change is quite intrusive and extensive,
and will require cooperation on the part of all translators.  But the
current situation is slightly wrong (I say that facing 370
untranslated lines in a just-cvs-updated ru.po).


Another closely related issue is the fuzzy translations that appear
right after .po-files are updated.  In short, fuzzy translations are
evil.  I really did fix a fuzzy translation where the string
"Verifying package" had been given the translation for "Removing
package".  I can imagine what an admin who saw such a message felt. :)

I'll send a patch removing eight fuzzy entries from ru.po shortly, and
I encourage every other translator to at least fix those.  Fuzzy
entries can be really devastating both in terms of UI and wrt issues
like the string "foo%d" being substituted with "foo%s".  I've fixed
_lots_ of those in ru.po, and you can see for yourself in any other
.po-file.  E.g., from pt_BR.po:

   #: lib/mlib.c:199
   #, fuzzy
   msgid "failed in buffer_write(fd) (%i, ret=%zi %s)"
   msgstr "falha ao copiar na leitura (%s)"
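
In that entry the msgid consumes two integers and a string, while the
msgstr consumes a single string, so the first integer argument would be
read as a pointer.  Here is a rough sketch of a check that compares the
conversion sequences of a msgid and its msgstr.  next_conversion and
same_conversions are hypothetical helpers, not part of gettext or dpkg,
and the parsing is deliberately simplified: flags, width, precision and
length modifiers are skipped rather than compared, and positional
arguments like "%1$s" are not understood.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Advance *p to just past the next printf conversion and return its
 * conversion character ('i' is reported as 'd'); return 0 when the
 * string has no further conversions.  "%%" is skipped. */
static char next_conversion(const char **p)
{
  const char *s = *p;

  while ((s = strchr(s, '%')) != NULL) {
    s++;
    if (*s == '%') {          /* literal percent sign */
      s++;
      continue;
    }
    /* skip flags, width, precision and length modifiers */
    s += strspn(s, "-+ #0123456789.*hlLqjzt");
    if (*s == '\0')           /* dangling '%' at end of string */
      break;
    *p = s + 1;
    return *s == 'i' ? 'd' : *s;
  }
  *p = "";
  return 0;
}

/* Return 1 if msgid and msgstr consume the same sequence of
 * conversions, 0 otherwise. */
int same_conversions(const char *msgid, const char *msgstr)
{
  char a, b;

  assert(msgid != NULL && msgstr != NULL);
  do {
    a = next_conversion(&msgid);
    b = next_conversion(&msgstr);
    if (a != b)
      return 0;
  } while (a != 0);
  return 1;
}
```

This is only meant to illustrate the failure mode; msgfmt's --check
option performs a proper version of this test on entries flagged as
c-format.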


I really think that this issue should be raised to the level of
gettext developers.

Until it is resolved, I would recommend that Wichert _not_ update a
.po-file unless the update is sent by the respective translator.


As for the downsizing of .po-files, I think that we could actually
_not_ mark _every_ string as translatable.  There are a lot of messages
that are really of the "should never happen" kind.  E.g., I really
think that "unable to ignore signal %d before running %.250s" should
always be in English.  That way, the innocent user can simply
cut-and-paste the error message and file a bug report, without
re-running the failed operation with LC_MESSAGES=C (and you would have
to explain that operation to him first!  And the failure might not be
reproducible!).

I do not propose scanning all the sources and un-gettextizing the
debugging-only messages.  I'm just proposing that care be taken
when making a message translatable, and that existing messages be
slowly migrated back to hard-coded.


Another related issue is that, as far as I know, GNU-style string
continuations

	"foo \
	 bar"

are disallowed by C99 and are going to be deprecated (and then
unsupported) in some upcoming gcc release.  Splitting the gettextable
strings would help with that too.
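
For what it's worth, here is a small sketch of the alternative
spelling.  Adjacent string literals are concatenated by the compiler in
standard C, so a message can be spread over several source lines
without any continuation character.  The names continued, adjacent and
same_string are made up for the example.

```c
#include <string.h>

/* Backslash-newline continuation: the continued line has to start in
 * column 0, because any indentation would become part of the string. */
static const char continued[] = "foo \
bar";

/* Adjacent string literals are concatenated by the compiler, so each
 * piece can be indented freely and the result is still one string. */
static const char adjacent[] = "foo "
                               "bar";

/* Return 1 if both spellings produce the same string. */
int same_string(void)
{
  return strcmp(continued, adjacent) == 0;
}
```

Note that this only changes how a string is written in the source; the
msgid that ends up in the .po-file stays the same unless the message is
actually split into several _() calls.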



Comments?


I could probably help make the necessary modifications to bring my
proposals to life.  However, I'd like to restrict myself to updating
ru.po only, and not touching anything else.

--alexm


