On Sunday 13 January 2008, Christian Perrier wrote:
> Quoting Frans Pop (elendil@planet.nl):
> > For now I'm mostly interested in discussion of the issues I mention in
> > the comments in the big new section (which basically replaces the old
> > section that follows it).
>
> > + # We need the date of the last update of a sublevel PO file
>
> Yes.
>
> > + # Preferably we should also determine the name of the person who
> > + # did the last update to a sublevel (for changelogs)
>
> That would certainly be better. I fear it could complicated the code
> quite a lot and, indeed, the translation is mostly a team work. I
> personnally don't give much importance to Last-Translator.
>
> So, well, if we find a *not too complicated* way to allow for
> different last-translator, why not. But I don't think it's worth a
> great effort.
With some generalized functions it wasn't "too complicated" :-)
And I do think it's important to at least /try/ and get the correct
translator into our changelogs.
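
As a rough illustration of the idea (not the patch's actual code), the sketch below picks the most recently updated PO file by its PO-Revision-Date and reads the Last-Translator from it. The function names header_value and newest_po and the two sample files are hypothetical. It assumes GNU date (for -d) and that each header fits on a single line; the patch itself uses msgattrib --no-wrap to cope with wrapped headers.

```shell
#!/bin/sh
# Hypothetical, simplified stand-ins for the helpers in the patch.

# Print the value of header KEY in FILE (assumes the header is on one line).
header_value() {
    grep "^\"$1:" "$2" | sed 's/^.*: \(.*\)\\n.*$/\1/'
}

# Print whichever of the given files has the latest date in header KEY.
newest_po() {
    key=$1; shift
    lastdate=0; lastfile=""
    for file in "$@"; do
        # Convert the header date to seconds since the epoch (GNU date).
        tdate=$(date -d "$(header_value "$key" "$file")" "+%s")
        if [ "$tdate" -gt "$lastdate" ]; then
            lastdate=$tdate; lastfile=$file
        fi
    done
    echo "$lastfile"
}

# Two made-up sublevel PO headers with different revision dates.
tmp=$(mktemp -d)
cat >"$tmp/level1.po" <<'EOF'
msgid ""
msgstr ""
"PO-Revision-Date: 2008-01-10 12:00+0100\n"
"Last-Translator: Alice Example <alice@example.org>\n"
EOF
cat >"$tmp/level2.po" <<'EOF'
msgid ""
msgstr ""
"PO-Revision-Date: 2008-01-12 09:30+0100\n"
"Last-Translator: Bob Example <bob@example.org>\n"
EOF

last=$(newest_po "PO-Revision-Date" "$tmp/level1.po" "$tmp/level2.po")
header_value "Last-Translator" "$last"
```

The translator of the file with the newest revision date is the one that would end up in the merged master file, and from there in the changelogs.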
Are there gettext alternatives to the po_print_header() and po_print_body()
functions? If there are, I think that would be preferred, but note that my
functions select/remove both leading comments and all headers.
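
To make the behaviour of these two functions concrete, here is a small sketch that runs the same awk programs as in the patch below on a made-up PO file. Only the sample file and the demo variable are invented for the illustration.

```shell
#!/bin/sh
# Print everything up to the end of the first (header) entry,
# including any comments that precede it.
po_print_header() {
    awk 'BEGIN {found = 0}
         /^msgid ""/ {found = 1}
         /^$/ {if (found == 1) exit}
         {print $0}' "$1"
}

# Print everything after the header entry, starting with the blank
# line that separates the header from the first real message.
po_print_body() {
    awk 'BEGIN {found = 0}
         /^msgid ""/ {if (found == 0) found = 1}
         /^$/ {if (found == 1) found = 2}
         {if (found == 2) print $0}' "$1"
}

# Made-up PO file: leading comment, header entry, one message.
demo=$(mktemp)
cat >"$demo" <<'EOF'
# Translator comment that belongs to the header
msgid ""
msgstr ""
"Project-Id-Version: demo\n"
"Last-Translator: Jane Example <jane@example.org>\n"

msgid "Hello"
msgstr "Hallo"
EOF

echo "--- header ---"; po_print_header "$demo"
echo "--- body ---";   po_print_body "$demo"
```

The header part keeps the leading comment together with all header lines, which is the property that makes it possible to glue an old header onto new content.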
> > + # When updating a sublevel PO file, we should really retain
> > + # all the old headers and only update the POT-Creation-Date...
>
> Yes, definitely. There may be specific comments, or whatever
Done.
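
A minimal sketch of what "keep all headers, update only one" amounts to is shown here. The replace_header function is a hypothetical single-line variant: unlike po_replace_header in the patch, it does not handle a header that spans two lines, and it assumes the new value contains no "/" or "&" characters. The sample file is made up.

```shell
#!/bin/sh
# Hypothetical single-line variant of a header replacement helper.
# Rewrites only the line for header KEY; every other line is untouched.
replace_header() {
    key=$1; value=$2; file=$3
    sed "s/^\"$key: .*\\\\n\"\$/\"$key: $value\\\\n\"/" "$file" >"$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Made-up sublevel PO header with a team comment and a custom header.
po=$(mktemp)
cat >"$po" <<'EOF'
# Comments kept by the translation team
msgid ""
msgstr ""
"POT-Creation-Date: 2007-12-01 10:00+0100\n"
"PO-Revision-Date: 2007-12-05 18:20+0100\n"
"X-Team-Note: custom header\n"
EOF

replace_header "POT-Creation-Date" "2008-01-12 00:53+0100" "$po"
cat "$po"
```

After the call only the POT-Creation-Date line differs; the comment, the PO-Revision-Date and the custom X- header are left as the translator wrote them.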
> > + # Do we really want to loose obsolete strings?
> > + # Shouldn't that be up to the translator?
> > + msgattrib --width=79 --no-obsolete sublevel${i}/${lang}.po.new
> > >sublevel${i}/${lang}.po
>
> I should have put a comment when I added this. I know there was a
> reason..:-|
Some fancy footwork was needed, but it looks like I've got a working
implementation for this. All obsolete strings are now gathered in the
sublevel1 PO file.
Attached is a new version of the patch. I think this solves all issues I
spotted with the original implementation.
The current patch still supports the current system as well. For the final
version I would suggest removing that (in practice: remove the "old" Phase
III code). I'd also suggest removing the --split option and instead just
hardcoding the number of levels in a variable in the script.
I've done a fair amount of testing and the results look good to me.
I would suggest delaying implementation of the patch until after the Beta1
release, but it would be great if you could test this a bit too.
The way I have tested this is:
$ cd <d-i dir>
# Make sure there are no pending changes!
$ for i in 1 2 3 4 5; do mkdir packages/po/sublevel$i; done
# Prepare for conversion (repeat the following 3 commands to revert
# to the initial state):
$ svn revert -R packages/
$ cp packages/po/*.po packages/po/sublevel1/
$ rm -f packages/po/sublevel[2345]/*
# Initial conversion run:
$ <path>/l10n-sync --noupdatepo --force --split=5 --convert `pwd`
# Do translation updates etc, then do a "normal" run:
$ <path>/l10n-sync --noupdatepo --force --split=5 `pwd`
# Clean up after testing:
$ svn revert -R packages/
$ rm -f packages/po/sublevel*/*
Cheers,
FJP
commit 8b85f434878442c8307b8c5bb14baf7105b5393e
Author: Frans Pop <fjp@debian.org>
Date: Sat Jan 12 00:53:59 2008 +0100
Improve multi-level handling
Main change is an improved method for updating from sublevel PO files.
Characteristics of the new method:
- preserve the headers in sublevel PO files (only POT-Creation-Date
is updated)
- the PO-Revision-Date and Last-Translator for the most recently updated
sublevel PO file are used to update PO files for individual packages
- if the level of a string changes, the existing translation from the
old level is preserved
- obsolete strings are preserved in the sublevel1 PO file in case strings
are reintroduced later (translators should remove them occasionally)
A temporary '--convert' option has been added to facilitate conversion
from the current master PO files to multi-level PO files.
Other changes
- Introduce new functions to determine header values for the most
recently updated PO or POT file.
- Remove custom PO file headers (X-*) from merged PO files
before updating translations in packages directories to avoid
cluttering them.
diff --git a/scripts/l10n/l10n-sync b/scripts/l10n/l10n-sync
index 9601588..0e573f0 100755
--- a/scripts/l10n/l10n-sync
+++ b/scripts/l10n/l10n-sync
@@ -18,6 +18,7 @@ NUMLEVELS=1
UPDATEPO=Y
SYNCPKGS=Y
QUIET=N
+CONVERT=N
svn=svn
debconfupdatepo=debconf-updatepo
@@ -145,6 +146,73 @@ criticalerr() {
exit 3
}
+
+po_last_updated() {
+ local key files file lastfile lastdate tdate
+ key=$1
+ shift
+ files="$*"
+
+ lastdate=0
+ for file in $files ; do
+ tdate=$(date -d "$(grep "^\"$key:" $file | \
+ sed 's/^.*: \(.*\)\\n.*$/\1/')" "+%s")
+ if [ $tdate -gt $lastdate ] ; then
+ lastdate=$tdate
+ lastfile=$file
+ fi
+ done
+ echo "$lastfile"
+}
+
+# Get the whole line
+# The --no-wrap is needed because translator can span more than one line
+# The last sed statement is needed to preserve the \n at the end
+po_get_header() {
+ local key=$1
+ local file=$2
+ msgattrib --no-wrap $file | grep "^\"$key:" | sed 's/^.*: \(.*\)\\n.*$/\1/'
+}
+
+# Replace a header with a new value
+# The complex sed expression is to allow for the fact that a header may span
+# two lines; spanning three lines is not supported
+po_replace_header() {
+ local key=$1
+ local value=$2
+ local file=$3
+ sed -i "/^\"$key:/ N; s/^\"$key.*\\\\n\"\(\n.*\|$\)/\"$key: $value\\\\n\"\1/" \
+ $file
+}
+
+# Print anything up to the first msgid (the header)
+po_print_header() {
+ awk 'BEGIN {found = 0}
+ /^msgid ""/ {found = 1}
+ /^$/ {if (found == 1) exit}
+ {print $0}' $1
+}
+
+# Print anything after the first msgid (the header)
+po_print_body() {
+ awk 'BEGIN {found = 0}
+ /^msgid ""/ {if (found == 0) found = 1}
+ /^$/ {if (found == 1) found = 2}
+ {if (found == 2) print $0}' $1
+}
+
+# Print obsolete strings
+po_print_obsolete() {
+# # Old "manual" version
+# awk 'BEGIN {found = 0; lead=""}
+# /^#~ msgid/ {if (found == 0) {found = 1; print lead}}
+# {if (found == 0) lead=lead"\n"$0}
+# /^$/ {if (found == 0) lead=""}
+# {if (found == 1) print $0}' $1
+
+ msgattrib --only-obsolete --width=79 $1 | po_print_body
+}
+
## Command line parsing
MORETODO=true
while $MORETODO ; do
@@ -189,6 +257,9 @@ while $MORETODO ; do
"--nolog")
LOG=""
;;
+ "--convert")
+ CONVERT=Y
+ ;;
"--"*)
echo "Illegal option: $1" >&2
usage
@@ -398,21 +469,19 @@ log "- Merge all package templates.pot files..."
if ! msgcat ${pots} >/dev/null 2>&1 ; then
svnerr
fi
-log_cmd --pass msgcat ${pots} | \
+log_cmd --pass msgcat $pots | \
sed 's/charset=CHARSET/charset=UTF-8/g' >$DI_COPY/packages/po/template.pot.new
# Determine the most recent POT-Creation-Date for individual components
# Include master templates.pot too so the timestamp will never be set back
-LASTDATE="$(
- for j in ${pots} po/template.pot; do
- date -ud "$(grep "POT-Creation-Date:" $j | sed 's/^.*: \(.*\)\\n.*$/\1/')" "+%F %R%z"
- done | sort | tail -n 1)"
+LASTDATE="$(po_get_header "POT-Creation-Date" \
+ $(po_last_updated "POT-Creation-Date" $pots po/template.pot))"
+
# We don't want all templates.pot files headers as we don't care about them
# So we merge the generated file with a simple header.pot file
if [ -f po/header.pot -a -s po/template.pot.new ] ; then
msgcat --use-first po/header.pot po/template.pot.new | \
- sed 's/charset=UTF-8/charset=CHARSET/g' | \
- sed "s/^.*POT-Creation-Date:.*$/\"POT-Creation-Date: $LASTDATE\\\n\"/" \
- > po/template.pot
+ sed 's/charset=UTF-8/charset=CHARSET/g' > po/template.pot
+ po_replace_header "POT-Creation-Date" "$LASTDATE" po/template.pot
rm po/template.pot.new
else
error "ERROR: no $DI_COPY/packages/po/header.pot file. Cannot continue."
@@ -465,14 +534,119 @@ if [ "$WITHLEVELS" = "Y" ] ; then
fi
log ""
+# Update PO files for sublevels:
+# 3a) Synchronize with D-I SVN
+# 3b) Merge the sublevel PO files into a master PO file
+# 3c) Update the master PO file from the master POT file as it will be used
+# to update package PO files
+# 3d) Update the sublevel PO files from this master PO file and the sublevel POT file
+# 3e) commit back the changed file
+log "Phase III: update master translation files"
+if [ "$WITHLEVELS" = "Y" ] ; then
+ cd $DI_COPY/packages/po
+ languages=""
+ for po in sublevel1/*.po ; do
+ lang=$(basename $po .po)
+ # Next line is just for quicker testing
+ #[ $lang = nl ] || continue
+ log "- $lang"
+ if [ ! -r PROSPECTIVE ] || \
+ ([ -r PROSPECTIVE ] && \
+ ! grep -q "^$lang[[:space:]]*$" PROSPECTIVE); then
+ languages="${languages:+$languages }$lang"
+ fi
+
+ log " - Merge sublevel PO files into master PO file and update..."
+ list=""
+ for i in $LEVELS; do
+ if [ -f sublevel$i/$lang.po ]; then
+ list="${list:+$list }sublevel$i/$lang.po"
+ fi
+ done
+ # Retain the date and translator of the last updated sublevel PO file
+ LASTFILE="$(po_last_updated "PO-Revision-Date" $list)"
+ LASTDATE="$(po_get_header "PO-Revision-Date" $LASTFILE)"
+ LASTTRANS="$(po_get_header "Last-Translator" $LASTFILE)"
+ msgcat --use-first $list >${lang}.po
+ po_replace_header "PO-Revision-Date" "$LASTDATE" $lang.po
+ po_replace_header "Last-Translator" "$LASTTRANS" $lang.po
+
+ # Update the master PO file (as it's used to update package PO files)
+ log_cmd --pass msgmerge --previous $lang.po template.pot >$lang.po.new || \
+ gettexterr
+
+ # Remember obsolete strings
+ OBSOLETE="$(po_print_obsolete $lang.po.new)"
+
+ # Optionally merge with PO files from a different source
+ # Strings from the other source are preferred!
+ # Should we disallow automatic commits for this?
+ # WARNING: NOT TESTED!!!
+ if [ -n "$MERGEDIR" ] && [ -r $MERGEDIR/$lang.po ]; then
+ log " - Merge with $MERGEDIR/$lang.po !!"
+ msgcat --use-first "$MERGEDIR/$lang.po" $lang.po.new \
+ >$lang.po.merge || gettexterr
+ log_cmd --pass msgmerge --previous $lang.po.merge template.pot | \
+ msgattrib --no-obsolete >$lang.po.new || gettexterr
+ rm $lang.po.merge
+ fi
+
+ # Clean up new master PO file
+ msgattrib --width=79 --no-obsolete $lang.po.new >$lang.po
+ rm $lang.po.new
+
+ # Update the sublevel PO files
+ # We keep its old header and only update the POT-Creation-Date
+ for i in $LEVELS; do
+ if [ -f sublevel$i/$lang.po ]; then
+ OLDHEADER="$(po_print_header sublevel$i/$lang.po)"
+ elif [ "$CONVERT" = Y ]; then
+ OLDHEADER="$(po_print_header $lang.po)"
+ fi
+ if [ -f sublevel$i/$lang.po ] || [ "$CONVERT" = Y ]; then
+ log_cmd --pass -m " - Merge with template.pot for sublevel $i..." \
+ msgmerge --previous $lang.po \
+ sublevel$i/template.pot \
+ >sublevel$i/$lang.po.new || gettexterr
+ POTDATE="$(po_get_header "POT-Creation-Date" sublevel$i/$lang.po.new)"
+
+ # Combine old header and new content
+ ( echo "$OLDHEADER"
+ po_print_body sublevel$i/$lang.po.new ) | \
+ msgattrib --width=79 --no-obsolete \
+ >sublevel$i/$lang.po
+ po_replace_header "POT-Creation-Date" "$POTDATE" sublevel$i/$lang.po
+ # Append any obsolete strings to sublevel1 PO file
+ if [ $i -eq 1 ] && [ "$OBSOLETE" ]; then
+ echo "$OBSOLETE" >>sublevel$i/$lang.po
+ fi
+ rm sublevel$i/$lang.po.new
+ fi
+ done
+
+ # Remove all custom headers so they don't clutter the PO files in
+ # the packages directories
+ msgattrib --no-wrap $lang.po | \
+ grep -v "^\"X-.*: .*\\n\"$" | \
+ msgattrib --width=79 >$lang.po.new
+ mv $lang.po.new $lang.po
+ done
+
+ if [ "$COMMIT" = "Y" ] ; then
+ log_cmd -p "Commit all general PO/POT files to SVN..." \
+ $svn commit -m "$COMMIT_MARKER Updated packages/po/* against package templates" || svnerr
+ fi
+fi
+
# For each PO file in packages/po/sublevel* or packages/po:
# 3a) Synchronize with D-I SVN
# 3b) Update with template.pot
# 3c) Grab translations from the lower levels file(s)
# 3d) commit back the changed file
-log "Phase III: update master translation files"
for i in $LEVELS; do
if [ "$WITHLEVELS" = "Y" ] ; then
+ # Bail out; work has already been done in previous section
+ break
dir=po/sublevel$i
level="level $i "
else
@@ -540,26 +714,9 @@ for i in $LEVELS; do
$svn commit -m"${COMMIT_MARKER} Updated packages/$dir/* with general template.pot" *.po template.pot || svnerr
fi
done
-
-# If we use levels, create a temporary general file
-# (which we won't commit) to make merging in individual packages
-# much faster
-if [ "$WITHLEVELS" = "Y" ] ; then
- cd $DI_COPY/packages/po
- for po in sublevel1/*.po ; do
- lang=$(basename $po .po)
- list=""
- for i in `seq $NUMLEVELS -1 1`; do
- if [ -f sublevel${i}/${lang}.po ]; then
- list="$list sublevel${i}/${lang}.po"
- fi
- done
- msgcat --use-first $list >${lang}.po
- done
-fi
log ""
# Loop over D-I packages:
@@ -578,10 +735,10 @@ if [ "$SYNCPKGS" = "Y" ]; then
for lang in $languages ; do
logn "$lang "
cat >$lang.po.new <<EOF
-# THIS FILE IS AUTOMATICALLY GENERATED FROM THE MASTER FILE:
-# packages/po/$lang.po
+# THIS FILE IS GENERATED AUTOMATICALLY FROM THE D-I PO MASTER FILES
+# The master files can be found under packages/po/
#
-# DO NOT MODIFY IT DIRECTLY: SUCH CHANGES WILL BE LOST
+# DO NOT MODIFY THIS FILE DIRECTLY: SUCH CHANGES WILL BE LOST
#
EOF
log_cmd --pass msgmerge $DI_COPY/packages/po/$lang.po templates.pot | \
@@ -599,22 +756,21 @@ EOF
egrep -v "$filter" $lang.po >$oldfiltered
egrep -v "$filter" $lang.po.new >$newfiltered
if [ -z "$(diff $oldfiltered $newfiltered)" ] ; then
- # Don't commit if the only chages are in filtered lines
+ # Don't commit if the only changes are in filtered lines
rm $lang.po.new
else
+ # Remember original PO-Revision-Date
+ LASTDATE="$(po_get_header "PO-Revision-Date" $lang.po)"
+ mv $lang.po.new $lang.po
# At least one unfiltered line changed
# Put the old Revision-Date back if asked for
- if [ "$KEEP_REVISION" != "N" ] && [ "$KEEP_REVISION" = "$lang" ] ; then
- # Grab back the PO-Revision-Date from the old file
- old_revision=`grep -e "^\"PO-Revision-Date:" $lang.po | sed 's/\\\\n\"//g'`
- # And replace the one from the new file with it
- # then put all this as a result
- sed "s/\"PO-Revision-Date:.*/$old_revision\\\\n\"/g" $lang.po.new >$lang.po
- rm $lang.po.new
- log_s1 "${package}/debian/po/${lang}.po" "CHANGED, revision kept"
+ if [ "$KEEP_REVISION" != "N" ] && \
+ [ "$KEEP_REVISION" = "$lang" ] ; then
+ # Restore original PO-Revision-Date
+ po_replace_header "PO-Revision-Date" "$LASTDATE" $lang.po
+ log_s1 "$package/debian/po/$lang.po" "CHANGED, revision kept"
else
- mv $lang.po.new $lang.po
- log_s1 "${package}/debian/po/${lang}.po" "CHANGED"
+ log_s1 "$package/debian/po/$lang.po" "CHANGED"
fi
fi
# Remove temporary files