Bug#564915: debian-edu-doc: improve xml generation from wiki source
Package: debian-edu-doc
Severity: wishlist
Tags: patch
Hi,
while translating the debian-edu manual *.po files I noticed
some extra whitespace inside tagged text, for example:
<emphasis>still incomplete </emphasis> instead of
<emphasis>still incomplete</emphasis>
or
<computeroutput>2010-01-12 </computeroutput> instead of
<computeroutput>2010-01-12</computeroutput>.
If this is not corrected by hand in the translation, it can produce stray
spaces, or a full stop alone on a new line, in the translated text.
The space is introduced during the conversion by the sed "s%<\/%\n<\/%g"
command, which breaks the line in the XML file before every closing tag. A
solution might be to break lines there only for specific tags (see patch).
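To illustrate the difference, here is a minimal sketch that runs both variants
on a single made-up line. The sample string and the function names break_all
and break_selected are only for illustration and are not part of get_manual.
It assumes GNU sed, which turns \n in the replacement into a newline.

```shell
#!/bin/sh
# Hypothetical one-line sample of the generated XML (not taken from the manual).
sample='<para>This is <emphasis>still incomplete</emphasis>.</para>'

# Current behaviour: break the line before every closing tag.
# Assumes GNU sed, where \n in the replacement is a newline.
break_all() {
    sed 's%</%\n</%g'
}

# Behaviour with the patch: break only before selected closing tags,
# so inline tags such as </emphasis> stay on the same line as their text.
break_selected() {
    sed 's%</title>%\n</title>%g' |
    sed 's%</section>%\n\n</section>%g' |
    sed 's%</para>%\n</para>%g'
}

echo "--- break before every closing tag ---"
printf '%s\n' "$sample" | break_all

echo "--- break only before title/section/para ---"
printf '%s\n' "$sample" | break_selected
```

In the first variant </emphasis> should end up on a line of its own, and that
line break is presumably what later shows up as the trailing space inside the
tag. In the second variant the emphasis element stays intact on one line.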
The downside is that, once this is corrected, the number of fuzzy
translations in the *.po files increases, and these have to be checked.
But if we want to do it, the earlier the better.
Cheers,
Andi
-- System Information:
Debian Release: squeeze/sid
APT prefers testing
APT policy: (500, 'testing')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.32-nouveau.git (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Index: ../scripts/get_manual
===================================================================
--- ../scripts/get_manual (revision 61272)
+++ ../scripts/get_manual (working copy)
@@ -61,10 +61,12 @@
sed "s#</articleinfo>##g" |
sed "s#</revision>##g" |
sed "s%<para><ulink url='http://wiki.debian.org/CategoryPermalink#'>CategoryPermalink</ulink> </para>%%" |
- sed "s%<\/%\n<\/%g" |
sed "s%<title>%\n<title>%g" |
+ sed "s%<\/title>%\n<\/title>%g" |
sed "s%<section>%\n\n<section>%g" |
+ sed "s%<\/section>%\n\n<\/section>%g" |
sed "s%<para>%\n<para>%g" |
+ sed "s%<\/para>%\n<\/para>%g" |
sed "s%</date>\(.*\)\$%%g" |
sed "s%FIXME%\nFIXME%g" |
sed '1,4d' > $TARGET