
Re: Translating manual pages [was Re: Migrate website translations to PO]

[Moving this thread to debian-l10n-german as suggested; I just
subscribed so that you do not have to CC me, but I do not speak German
(I used to speak German, but I have forgotten all of it)]

On 2010/8/16 Helge Kreutzmann <debian@helgefjell.de>:
> Hello Denis,
> I'm not doing this myself, but I CC'ed our coordinator for the German
> man page project.
> On Mon, Aug 16, 2010 at 02:16:16AM +0200, Denis Barbier wrote:
>> On 2010-08-11, Helge Kreutzmann wrote:
>> [...]
>> > And a final nitpick: Which translation teams do have the man power to
>> > do the conversion? The German team currently works on moving the
>> > text based translations of man pages to po based ones, and this turns
>> > out to be a huge effort. For the website, we at least know if a file
>> > is up to date, but I guess still quite some effort is required (or we
>> > hope that paragraph n in the original corresponds to paragraph n in
>> > the translation and mass convert without review).
>> According to http://lists.debian.org/debian-l10n-french/2006/05/msg00409.html
>> Thomas Huriaux and myself (IIRC Thomas did it almost alone) converted
>> 1590 French manual pages into PO files during a single weekend, so
>> this operation does not require a large team.  In fact, it depends
>> mostly on 2 factors:
>>   a. Whether translations are correctly identified (i.e. you know
>> which version each page is based on; ideally all pages are based on
>> the same version of manpages)
> Well, in the past man pages varied from partial translations, over
> outdated translations to several rewrites. In early Linux time it
> seemed (at least for German) like a good idea to take part of standard
> linux volumes and distribute it as man pages, ignoring maintenance and
> the original versions (I did not watch it at that time, this is my
> impression, still). So each man page has to be considered if some
> parts can be reused or not.

I ran some tests tonight with manpages-de (more below), and it seems
that the situation is not as bad as I thought: a significant share of
the strings can be extracted automatically.

>>   b. And your ability to use po4a and gettext tools to speed up
>> gettextization; we played a lot with other manual pages (coreutils,
>> findutils, etc) before converting the manpages package.
> From my experience, Toddy is able to do this. For coreutils, man pages
> are generated (and hence translated) with help2man, so I'm not sure
> what you mean by them.

Right, that one was not a good example.  Here are better ones: grep,
cron, diffutils, sysvinit, tar, util-linux, etc ;)

>> If translations are not correctly identified, well you are out of
>> luck, but this is due to your current manual pages (maintenance is
>> almost impossible anyway) and not po4a.  If you have trouble, you may
>> ask for advice on our development list about po4a
>>   http://lists.alioth.debian.org/pipermail/po4a-devel/
>> maybe some people there could help you with this process.
> I never intended to blame po4a, I explicitly stated that maintenance is
> clear for the website. Until recently, man page maintenance was a big
> unsolved puzzle for the German team. There is a reason I submitted
> #493007.

Wow, great idea, thanks for pushing this.

> And thank you very much for the offer!
>> We imported a complete French translation of manpages 1.69, but this
>> version was 18 months old (and upstream made lots of changes at that
>> time), and we took about 4 months with a team of 6-7 translators to
>> synchronize with the Debian manpages package (2.39-1).  After that,
>> maintenance became much easier, we have been able to keep up with
>> upstream manpages with a team of 2-3 people.
> This is the reason for our switch as well. Once the basic ground is
> covered, we can maintain them much more easily. Just that we cannot
> start with a completed, clearly defined set.

Maybe you can.  As said above, I played tonight with manpages-de.  I
could not find old upstream man-pages tarballs, so I went to
http://snapshot.debian.org/package/manpages/ to download 1.11, 1.15,
1.19, 1.22, 1.29, 1.70 and 2.40 and unpacked these tarballs.
I also downloaded the latest manpages-de, then ran from within manpages-de:

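   for f in man5/*.5; do
      # skip pages that do not exist upstream
      test -f ../man-pages-2.40/$f || continue
      # try upstream versions from newest to oldest, stop at first success
      for d in $(ls -d ../man-pages* | sort -r); do
          po4a-gettextize -f man -m $d/$f -M iso-8859-1 -l $f  -L iso-8859-1 \
             -p $f.de.po && break
      done
   done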

When po4a-gettextize succeeds, it is likely that msgid and msgstr are
synchronized.  Of course there is no guarantee, so the files have to be
carefully reviewed.
Then I had a closer look at the failing pages, and had to modify them
before po4a-gettextize would run (this is the really tricky part).
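
To see which pages still need this manual attention, a minimal sketch
along the following lines may help.  The function name list_unconverted
is made up for illustration, and it assumes that a failed
po4a-gettextize run leaves no non-empty <page>.de.po behind, which
should be checked before relying on it.  Pages that were skipped because
they do not exist upstream are listed as well.

```shell
# Hypothetical helper: list pages that have no (non-empty) PO file yet.
# Assumption: the conversion loop writes its result to <page>.de.po and
# a failed run leaves no usable file under that name.
list_unconverted() {
    dir=${1:-man5}
    for f in "$dir"/*.5; do
        [ -f "$f" ] || continue               # glob matched nothing
        [ -s "$f.de.po" ] || printf '%s\n' "$f"
    done
}
```

Running "list_unconverted man5" from within manpages-de prints one
page per line.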

Once all pages were converted to PO files, I concatenated all strings
and removed the fuzzy flags:
  $ msgcat --use-first man5/*.5.de.po | msgattrib --clear-fuzzy >
  $ LC_ALL=C msgfmt -c -v -o /dev/null man5/man5-de.po
  man5/man5-de.po:7: some header fields still have the initial default value
  748 translated messages.
Finally I updated the man5 pages to man-pages 3.25:
  $ msgmerge --previous -U man5/man5-de.po
  $ LC_ALL=C msgfmt -c -v -o /dev/null man5/man5-de.po
  man5/man5-de.po:7: some header fields still have the initial default value
  258 translated messages, 404 fuzzy translations, 1448 untranslated messages.

I uploaded man5-de.po.bz2 into

The reason for removing fuzzy flags with msgattrib is that you can
then know which strings have been modified upstream.  As already said,
all strings must be carefully reviewed though.
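
As a rough sketch of how that list of modified strings could be pulled
out after msgmerge: the function name show_changed_upstream is invented
for this example, and it relies only on the --only-fuzzy and
--no-obsolete options of msgattrib.  With msgmerge --previous, each
entry also carries the old English text in "#|" comments, which makes
the upstream change easier to spot.

```shell
# Hypothetical helper: print only the entries that are flagged fuzzy.
# If the fuzzy flags were cleared before msgmerge, these are the strings
# whose English text changed upstream.  Obsolete entries are dropped.
show_changed_upstream() {
    msgattrib --only-fuzzy --no-obsolete "$1"
}
```

For example, "show_changed_upstream man5/man5-de.po | less" lets you
page through them.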

When working on French translations, we found it more convenient to use
one PO file per section (sections 2, 3 and 7 have to be split into 2
or 3 files because they are too large); you can decide to use one PO
file per manual page if you prefer.
I strongly suggest converting all pages first, before fixing PO files.
Upstream has moved some pages from one section into another, and you
will notice this only after updating to 3.25 (in man5-de.po, one can
see that complex.5, filesystems.5 and ipc.5 have been moved).  You will
then have to repeat the steps above after moving the files, so any
fixes already made in man*-de.po files might get lost.

If you want, I can help with converting other sections to PO files.
You are not forced to use perkamon if you do not want to, it is quite
easy to dispatch or gather PO files with standard gettext tools.  I
still believe that this will help you to focus on translating and not
on administering files.  Do not hesitate to ask if you have any
questions.
You should also decide whether you want to translate upstream manual
pages, or Debian ones.  IMO it is better to first focus on upstream
manual pages to have a broader audience.
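
To illustrate the point about gathering and dispatching PO files with
standard gettext tools, here is a minimal sketch.  The function names
gather_po and dispatch_po are invented for the example.  It assumes the
"#:" references in the PO files name the page each string came from, and
that the PAGE argument matches that reference exactly as it is written
in the file.

```shell
# Hypothetical helper: gather several PO files into one.  When a string
# occurs in more than one input, the first translation wins.
gather_po() {          # usage: gather_po OUTPUT INPUT...
    out=$1; shift
    msgcat --use-first -o "$out" "$@"
}

# Hypothetical helper: dispatch, i.e. extract from a gathered PO file
# the entries whose "#:" reference points at one given page.
dispatch_po() {        # usage: dispatch_po GATHERED_PO PAGE OUTPUT
    msggrep -N "$2" -o "$3" "$1"
}
```

A string shared by several pages is stored once in the gathered file,
with one reference per page, so it shows up in each dispatched file.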

