
Bug#564915: debian-edu-doc: improve xml generation from wiki source



Package: debian-edu-doc
Severity: wishlist
Tags: patch

Hi,

While translating the debian-edu manual *.po files, I noticed extra
whitespace at the end of tagged text, for example:
<emphasis>still incomplete </emphasis> instead of
<emphasis>still incomplete</emphasis>
or
<computeroutput>2010-01-12 </computeroutput> instead of
<computeroutput>2010-01-12</computeroutput>.
If this is not corrected in the translation, it can produce stray spaces, or a
full stop left alone on a new line, in the translated output.

The space is introduced during the conversion by the command
sed "s%<\/%\n<\/%g", which inserts a line break before every closing tag in
the XML file. A solution might be to break lines only at specific tags (see
patch).
The downside is that, after the correction, the number of fuzzy translations
in the *.po files increases, and these have to be checked.
But if we want to do this, the earlier the better.
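
To illustrate the difference, here is a small sketch that runs both rules
over a single line. SAMPLE is a made-up fragment rather than real manual
content, break_all and break_selected are names chosen only for this
example, and GNU sed is assumed (it treats \n in the replacement as a
newline).

```shell
#!/bin/sh
# Hypothetical sample line; not taken from the actual manual source.
SAMPLE='<para>This is <emphasis>still incomplete</emphasis>.</para>'

# Current rule: insert a newline before every closing tag.
break_all() {
    sed 's%</%\n</%g'
}

# Proposed rule, shown for </para> only; </title> and </section> are
# handled the same way in the patch.
break_selected() {
    sed 's%</para>%\n</para>%g'
}

echo "--- break before every closing tag:"
printf '%s\n' "$SAMPLE" | break_all
echo "--- break before </para> only:"
printf '%s\n' "$SAMPLE" | break_selected
```

With the first rule, </emphasis> ends up at the start of its own line, so
the inline text is split from its closing tag. With the second rule the
inline element stays on one line and only </para> is moved.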

Cheers,

	Andi

-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-nouveau.git (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Index: ../scripts/get_manual
===================================================================
--- ../scripts/get_manual	(revision 61272)
+++ ../scripts/get_manual	(working copy)
@@ -61,10 +61,12 @@
 	sed "s#</articleinfo>##g" |
 	sed "s#</revision>##g" |
 	sed "s%<para><ulink url='http://wiki.debian.org/CategoryPermalink#'>CategoryPermalink</ulink> </para>%%" |
-	sed "s%<\/%\n<\/%g" |
 	sed "s%<title>%\n<title>%g" |
+	sed "s%<\/title>%\n<\/title>%g" |
 	sed "s%<section>%\n\n<section>%g" |
+	sed "s%<\/section>%\n\n<\/section>%g" |
 	sed "s%<para>%\n<para>%g" |
+	sed "s%<\/para>%\n<\/para>%g" |
 	sed "s%</date>\(.*\)\$%%g" |
 	sed "s%FIXME%\nFIXME%g" |
 	sed '1,4d' > $TARGET
