
mass changing script for Debian package description translations (DDTP)



Hi,

some time ago I was asked for my solution for fixing errors in DDTP
descriptions. Because my hard disk was locked I could not respond until now
(the problem is fixed). If you don't know the DDTP (ddtp.debian.net) you may
skip this mail.

This mail explains all necessary steps and includes the script/program I used.
I will no longer hide it but assume that all potential users are careful enough
not to destroy DDTP data ...

Procedure:

Download a Translation-<lang> file from
http://ddtp.debian.net/debian/dists/sid/main/i18n/
and create a copy of it (suffix .orig). (Use "sid" only!) After fixing errors
in the file (don't insert or remove descriptions, keep them in sync with
*.orig!) send the changes back to the DDTP via its email interface:

$ make
$ rm -ri .tmp/
$ ./UpdateTranslations ../Translation-de.2007.09.30.orig ../Translation-de.2007.09.30
Detected language: de
............
The package gkrellm-ibam has multiple active IDs!

gkrellm-ibam not found, skipping ...
.....
The package ibam has multiple active IDs!

ibam not found, skipping ...

Please note that calling UpdateTranslations is safe. It just writes email
attachments and log files to .tmp/. It doesn't send any file!
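
The "keep them in sync with *.orig" rule can be checked mechanically before
UpdateTranslations is run. The following is only a sketch: the function name
check_sync is made up for illustration, and it assumes that every description
stanza in a Translation file starts with a "Package: " line.

```shell
#!/bin/sh
# Sketch: check that an edited Translation file still lists the same
# packages, in the same order, as its untouched .orig copy.
# Assumption: every description stanza starts with a "Package: " line.
# check_sync is a hypothetical helper, not part of the original scripts.

check_sync() {
  orig=$1
  edited=$2
  # Collect the package names of both files in file order.
  list_orig=$(grep '^Package: ' "$orig" || true)
  list_edited=$(grep '^Package: ' "$edited" || true)
  if [ "$list_orig" = "$list_edited" ]; then
    echo "in sync"
  else
    echo "out of sync"
    return 1
  fi
}
```

It would be called as
check_sync ../Translation-de.2007.09.30.orig ../Translation-de.2007.09.30
and only compares package names, so edited description text does not count as
a difference.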

$ ls .tmp/
attachment1 ... attachment47
$ ./createMail 1 47 > sent/mail.2007.09.30_1..47

This creates an email template which you can send to DDTP. As mutt has some
bugs (#434235 and #434236) I send sent/mail.2007.09.30_1..47 using the
following workaround:

$ mutt -H sent/mail.2007.09.30_1..47
(Change "To:" to your local account (e.g. jens@localhost) and send the mail.)
Reopen mutt to edit the received mail:
$ mutt
Press "e" and change the header of the new mail from
 Content-Type: text/plain; charset=iso-8859-1
into
 Content-Type: multipart/mixed; boundary="1yeeQ81UyVL57Vl7"
NOW RECODE THE FILE TO UTF-8 (via :set fileencoding=utf-8|:wq if vim editor is
used) and you have a proper mail with many attachments. Check these
(encoding==base64?, proper "Description-<lang>.UTF-8: " line?, matching
number of paragraphs in translation compared to English description?)
and resend (bounce: "b") this mail to pdesc@ddtp.debian.net.  That's
all!
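
The paragraph check from the list above can be partly automated. This is a
rough sketch under assumptions: the English and the translated description
have each been saved to a file of their own by hand, paragraphs are separated
by a line consisting of " ." as is usual in Debian descriptions, and both
function names are invented for this example.

```shell
#!/bin/sh
# Sketch: compare the number of paragraphs of two descriptions.
# Assumption: each file holds exactly one description, and paragraphs are
# separated by a line that contains only a space and a dot (" .").
# count_paragraphs and same_paragraphs are hypothetical helper names.

# Print the paragraph count of the description read from stdin.
count_paragraphs() {
  awk '/^ \.$/ { n++ } END { print n + 1 }'
}

# Succeed if both description files have the same paragraph count.
same_paragraphs() {
  [ "$(count_paragraphs < "$1")" = "$(count_paragraphs < "$2")" ]
}
```

A mismatch only shows that something needs a closer look. It does not replace
reading the attachment.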

Comments:
* Do not send corrections if you don't have permission to do so. Ask
  your language team before doing so.
* If you know an easy way to attach many files to a mail (properly base64
  encoded!) you do not need to run ./createMail but can directly attach
  .tmp/attachment*. mutt should allow it but doesn't!
* Try this procedure the first time on two or three descriptions only and
  compare the Translation files the next day. Is everything OK? Do not use it
  if you have doubts ...
* The manual checks of the mail attachments are not mandatory but I strongly
  suggest them, especially the first time! Should the DDTP change some parts
  of the website, the many internal tests in UpdateTranslations should ensure
  that no broken data is created. But please note that I cannot guarantee it!
* It worked for me very well and I found some errors in DDTP this way (e.g.
  multiple active descriptions).
* If the fixed translation was already altered in DDTP the program detects it
  and ignores your correction. You may want to try again the next day
  (using an updated Translation file).
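
Regarding the comment about attaching many files: one possible way to build
such a mail body without mutt is to let the base64 tool from coreutils encode
each file. This is a sketch, not the method used by createMail. It assumes
that the files in .tmp/ are unencoded text and that the receiving side accepts
standard MIME parts with "Content-Transfer-Encoding: base64", neither of which
has been verified against the DDTP. The function name mime_attach and the
charset are assumptions too.

```shell
#!/bin/sh
# Sketch: write a multipart/mixed message body to stdout in which every
# given file becomes one base64-encoded attachment.
# Assumptions: the files are raw (not yet encoded) text, and utf-8 is the
# right charset. mime_attach is a hypothetical helper name.
# Usage: mime_attach BOUNDARY FILE...

mime_attach() {
  boundary=$1
  shift
  echo "MIME-Version: 1.0"
  echo "Content-Type: multipart/mixed; boundary=\"$boundary\""
  echo
  for f in "$@"; do
    echo "--$boundary"
    echo "Content-Type: text/plain; charset=utf-8"
    echo "Content-Disposition: attachment; filename=\"$(basename "$f")\""
    echo "Content-Transfer-Encoding: base64"
    echo
    # Encode the file content; reading from stdin keeps this portable.
    base64 < "$f"
    echo
  done
  # The closing boundary carries two extra dashes at the end.
  echo "--$boundary--"
}
```

For example, mime_attach 1yeeQ81UyVL57Vl7 .tmp/attachment1 .tmp/attachment2
prints the MIME headers and body. The To: and Subject: headers still have to
be added in front of it, and the result should be tested on a local account
first.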

Possible errors suitable for mass changing:
* line breaks in the middle of "very-
  long words" (should be "very-long words")
* Common typos

I used it successfully to fix thousands or maybe even tens of thousands of
errors in the German translation.

Have fun,
Jens

Attachment: UpdateTranslations.cpp.gz
Description: Binary data

all: UpdateTranslations

UpdateTranslations: UpdateTranslations.cpp
	g++ -Wall -W UpdateTranslations.cpp -o $@

clean:
	rm -f UpdateTranslations

#!/bin/sh

# Creates a mail template with multiple attachments to be sent to the DDTP.

if ! test "$#" -eq 2; then
  echo "Please specify which DDTP corrections should be sent to the server."
  echo "Example: $0 1 4   Sends .tmp/attachment1 up to .tmp/attachment4"
  echo
  exit 1
fi

from=$1
to=$2

cat <<EOF
To: <pdesc@ddtp.debian.net>
Subject: mass changing script, attachments $from .. $to
Content-Type: multipart/mixed; boundary="1yeeQ81UyVL57Vl7"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit


--1yeeQ81UyVL57Vl7
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline



EOF
for i in $(seq $from $to); do
cat <<EOF
--1yeeQ81UyVL57Vl7
Content-Type: text/plain; charset=base64
Content-Disposition: attachment; filename=attachment$i
Content-Transfer-Encoding: 8bit

EOF
cat .tmp/attachment$i
echo
done

echo "--1yeeQ81UyVL57Vl7--"
echo
