Re: Bug#53981: language-chooser

To: mss@transas.com
Cc: debian-boot@lists.debian.org, 53981@bugs.debian.org
Subject: Re: Bug#53981: language-chooser
From: Taketoshi Sano <kgh12351@nifty.ne.jp>
Date: Sat, 15 Jan 2000 12:08:28 +0900
Message-id: <20000115120828X.xlj06203@xlj06203.nifty.ne.jp>
In-reply-to: <20000113182607.A32574@transas.com>
References: <20000113144301.A26270@transas.com> <y5a7lhejd7j.fsf@kgh12351.nifty.ne.jp> <20000113182607.A32574@transas.com>

Hi. Thank you for your comment.

# btw, when I did cvs update last night, utilities/language-chooser/
# disappeared. Why? Is this intentional? I hope we can use it;
# it would be a useful introduction for users.

In <20000113182607.A32574@transas.com>,
 at Date: Thu, 13 Jan 2000 18:26:07 +0300,
  on Subject: Re: Bug#53981: language-chooser,
   Michael Sobolev <mss@transas.com> writes:

> Maybe.  There is a small problem though.  Or, to be more correct, a small issue
> to consider.  If we have an incomplete .trm file, it's better to give
> translations where possible.  While such a construction says: if we do not have
> a complete .trm file, just show everything in English.  I do not think users
> will find it good to have English messages after they choose the language they
> would like the system to show the messages in. :)

I had thought that once all the translation is done, we would have complete
.trm files for every language we support, so users would (hopefully) never see
the fallback English messages. And if the messages were shown in English,
that could serve as an indicator of .po files that have not been updated.

Also, if most messages are shown in the selected language and only a few
(minor) messages are shown in English, it is difficult to notice on the
floppies made for testing that the translation is incomplete.

But if all messages are shown in English, testers using the released (beta)
floppies cannot point out exactly which messages are untranslated, so you are
right (i.e. I was wrong).

  # Using the source .po file, it is easy to find the "fuzzy" and
  # "untranslated" entries, so this "indicator" is not worth much anyway.

We should implement a scheme that mixes translated messages with the
original (English) ones where no translation exists. This is the best way.
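
As a rough sketch of that per-message fallback (this is not the actual
dbootstrap code; the function name pick_message is made up for illustration),
the rule could look like this:

```c
#include <stddef.h>

/* Hypothetical helper, not part of dbootstrap: return the translated
   string when one exists and is non-empty, otherwise fall back to the
   original (English) string.  */
const char *
pick_message (const char *translated, const char *original)
{
  if (translated != NULL && translated[0] != '\0')
    return translated;
  return original;
}
```

With this rule a partially translated language shows every message it has,
and English appears only where the translation is missing.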

> We do not expect to have different number of strings.  This is by design.
> Look, we have a set of .c files in utilities/dbootstrap directory.  We build
> utilities/dbootstrap/po/dbootstrap.pot file (which is the source for C.po)
> file.  Now it's easy to update all translations.  Yes, the resulted translation
> may have certain items marked as "fuzzy", and certain items just are not
> translated.  BUT the number of messages is exactly the same.  And all these
> .trm files are generated at once.  I believe we are not going to separately
> supply trose .trm files, which simplifies the situation a lot.

In <20000104135223.A19516@artica.id-agora.net>,
 at Date: Tue, 4 Jan 2000 13:52:24 +0000,
  on Subject: Re: Bug#53981: language-chooser,
   Enrique Zanardi <ezanardi@id-agora.com> writes:

 | > Now for the second fold:
 | > 
 | > Sometimes a trm file is generated from an up-to-date ("msgmerged")
 | > but not fully translated LANG.po file. "msgfmt" (another gettext tool)
 | > doesn't write on the LANG.mo file messages that haven't been translated
 | > (or that are marked as "fuzzy") and so the LANG.trm file generated from
 | > that LANG.mo file will be treated by dbootstrap as if it were obsolete.

I checked the source of gettext-0.10.35 and found this in src/msgfmt.c:

(snip)
      case 'f':
        include_all = 1;
        break;
(snip)
  if (msgstr_string[0] == '\0' || (!include_all && this->is_fuzzy))
    {
      if (verbose_level > 1)
        /* We don't change the exit status here because this is really
           only an information.  */
        error_at_line (0, 0, msgstr_pos->file_name, msgstr_pos->line_number,
                       (msgstr_string[0] == '\0'
                        ? _("empty `msgstr' entry ignored")
                        : _("fuzzy `msgstr' entry ignored")));

      /* Free strings allocated in po-gram.y.  */
      free (msgstr_string);

According to Enrique's mail and this code, I think the untranslated
strings in an "updated" LANG.po file are not written into the LANG.mo file
by msgfmt. With the "-f" option, msgfmt will include "fuzzy" strings,
but it still drops untranslated ones. Empty messages would not work anyway,
so this is an intentionally "designed" feature.

> > I think that loading both files may not work in the current .trm structure.
> > We have to change the procedure to generate .trm files to include English 
> > messages if the translated messages are not available.
> Exactly.
> 
> > It may be possible to use C.po and LANG.po to generate .trm files.
> It does not matter (if I understand correctly) what to use: .mo file or .po
> file as the former is just a compiled (read: not that wordy) of the latter.

Sure. Using the .po file is just my preference. I would prefer that pointerize
not require the intermediate .mo file at all (it is unnecessary in the final
stage, at run time). But this would cost extra time and code, and right now we
urgently need a reasonable solution, so I changed my mind and will use the
.mo files produced by msgfmt.

Now I plan to let trim-mo accept an optional second argument
and use it as a "reference".

If trim-mo is given only one argument, it works in compatibility
mode (the same as the current behavior). If trim-mo is given two arguments,
it opens both the first (the specified LANG.mo) and the second
(the reference, C.mo in the case of dbootstrap).

Using the reference, it extracts the correct number of messages
even when LANG.mo contains the wrong number (too many or too few).

Then it extracts the original strings from the reference
and searches for their counterparts in the specified LANG.mo.
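
The two steps above can be sketched as follows. This is only an illustration
under assumptions: struct msg_pair, find_translation and merge_with_reference
are hypothetical names, the messages are assumed to be already loaded into
memory, and the lookup is a plain linear scan standing in for whatever search
trim-mo ends up using.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one LANG.mo entry.  */
struct msg_pair
{
  const char *msgid;   /* original (English) string */
  const char *msgstr;  /* translated string */
};

/* Placeholder lookup: linear scan over the LANG.mo entries.  */
static const char *
find_translation (const struct msg_pair *lang, size_t nlang,
                  const char *msgid)
{
  size_t i;

  for (i = 0; i < nlang; i++)
    if (strcmp (lang[i].msgid, msgid) == 0)
      return lang[i].msgstr;
  return NULL;
}

/* Produce exactly NREF output strings, one per reference message.
   Entries missing from LANG (or empty there) fall back to the original
   string; extra entries in LANG are simply never looked up.
   Returns the number of fallbacks used.  */
size_t
merge_with_reference (const char *const *ref, size_t nref,
                      const struct msg_pair *lang, size_t nlang,
                      const char **out)
{
  size_t i;
  size_t fallbacks = 0;

  for (i = 0; i < nref; i++)
    {
      const char *t = find_translation (lang, nlang, ref[i]);

      if (t != NULL && t[0] != '\0')
        out[i] = t;
      else
        {
          out[i] = ref[i];
          fallbacks++;
        }
    }
  return fallbacks;
}
```

Because the loop runs over the reference, the output always has the
reference's message count, whatever LANG.mo contains.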

I think we can use the binary search method from dcgettext.c
of gettext (intl/dcgettext.c in the gettext source tree).
This method takes longer than the hash-table method,
but it is simpler and shorter (it does not require much code).
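
A minimal sketch of such a binary search is below. It is not copied from
dcgettext.c; struct msg_pair and lookup_sorted are hypothetical names, and the
table is assumed to be sorted by msgid in strcmp order, which is how the
original strings are expected to be ordered in a .mo file.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one LANG.mo entry.  */
struct msg_pair
{
  const char *msgid;   /* original (English) string */
  const char *msgstr;  /* translated string */
};

/* Binary search over TABLE, which must be sorted by msgid in strcmp
   order.  Returns the translation, or NULL when MSGID is absent.  */
const char *
lookup_sorted (const struct msg_pair *table, size_t n, const char *msgid)
{
  size_t bottom = 0;
  size_t top = n;

  while (bottom < top)
    {
      size_t act = bottom + (top - bottom) / 2;
      int cmp = strcmp (msgid, table[act].msgid);

      if (cmp < 0)
        top = act;          /* continue in the lower half */
      else if (cmp > 0)
        bottom = act + 1;   /* continue in the upper half */
      else
        return table[act].msgstr;
    }
  return NULL;
}
```

Each lookup costs about log2(n) string comparisons, so it is slower than a
hash lookup but needs no extra table and very little code.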

I will work on this approach.

-- 
  Taketoshi Sano: <kgh12351@nifty.ne.jp>
