
Re: Filtering out some untranslatable strings



[ Petter Reinholdtsen, 2020-08-09 ]
> [Frans Spiesschaert]
> > It only removes the image name part, not the alt name part for the image.
> >
> > Example:
> >
> > This is an image paragraph in debian-edu-bullseye-manual.xml:
> > <para><inlinemediaobject><imageobject><imagedata
> > fileref='./images/i.png'/></imageobject><textobject><phrase>Debian Edu
> > Installer Logo</phrase></textobject></inlinemediaobject>
> 
> 
> I got the impression that the English images would be named i.png, while
> for example the German edition would use i-de.png or something like
> that.  At least I believe that is the reason these image path strings
> are passed on to the translators.
 
Replacing the original images (as shown on the wiki pages) with
language-specific ones works differently. If an image with exactly the
same name as the original is available in <manual_dir>/images/<lang_id>/,
it is used to replace the original one in <manual_dir>/images/.
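
As a rough illustration of that rule, here is a minimal sketch. The
function name resolve_image is made up for this example, and the real
build may well copy files over the originals rather than look them up
like this; only the directory layout is taken from the description above.

```shell
# Hypothetical helper, not part of the Debian Edu scripts: print the image
# path that would be used for a given language.
# Assumed layout: <manual_dir>/images/ and <manual_dir>/images/<lang_id>/
resolve_image() {
    manual_dir=$1
    lang_id=$2
    image=$3
    if [ -f "$manual_dir/images/$lang_id/$image" ]; then
        # A language-specific image with exactly the same name exists.
        printf '%s\n' "$manual_dir/images/$lang_id/$image"
    else
        # Otherwise the original (English) image is used.
        printf '%s\n' "$manual_dir/images/$image"
    fi
}
```

For example, 'resolve_image . de i.png' would print ./images/de/i.png
only if that file exists, and ./images/i.png otherwise. The file name
itself never changes, so translators have nothing to adapt in the path.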

So excluding image paths and names from translation (i.e. sparing
translators the need to copy them) is worth thinking about. The problem
is that the DocBook export contains the image path/filename plus an alt
tag (<phrase> in XML). Even if no alt tag is provided on the wiki, the
filename is put in as <phrase> content.
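
To make the problem concrete, here is a small sketch that applies a
substitution of the kind used in the snippet at the end of this mail to
one sample line. The sample is modelled on the paragraph Frans quoted
(joined onto a single line); the function name strip_imagedata is made
up for this example.

```shell
# Sample line modelled on the exported DocBook, joined onto one line.
sample="<para><inlinemediaobject><imageobject><imagedata fileref='./images/i.png'/></imageobject><textobject><phrase>Debian Edu Installer Logo</phrase></textobject></inlinemediaobject>"

# Hypothetical wrapper around the image-name substitution.
strip_imagedata() {
    sed -e 's#<imagedata.*</imageobject>#</imageobject>#g'
}

# Print the sample line with the <imagedata> element removed.
printf '%s\n' "$sample" | strip_imagedata
```

The <imagedata> element is gone from the output, but the <phrase>
element with the alt text is still there, so that string would still
end up in the POT file.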

I've just tested the snippet that Frans provided (a slightly modified
version, appended to scripts/get_manual).

The script produces the stripped XML file as expected. It also generates
the POT and all PO files when 'make update' is run in a manual directory
whose po4a.cfg contains a manual-specific pot_in line.

So far, so good.

But then, nothing seems to be gained: running 'make status' shows many
fuzzy and even a few untranslated strings in PO files that were fully
translated before. It seems that the special wiki icons like 'alert'
and 'sad' are mostly responsible. More work is needed, I guess.
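
For a quick per-file check that does not depend on the Makefile, a
sketch like the following counts fuzzy entries directly. It assumes the
usual PO convention of marking fuzzy entries with a '#,' flag line; the
function name count_fuzzy is made up here, and the count may include a
fuzzy header entry. 'msgfmt --statistics' gives fuller figures.

```shell
# Hypothetical helper: count entries flagged as fuzzy in one PO file.
count_fuzzy() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so the exit status is deliberately ignored.
    grep -c '^#,.*fuzzy' "$1" || true
}

# Report the count for every PO file in the current directory.
for po in *.po; do
    [ -e "$po" ] || continue    # no PO files here: nothing to report
    printf '%s: %s fuzzy\n' "$po" "$(count_fuzzy "$po")"
done
```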

To reproduce for e.g. the bullseye manual:

(1) copy the attached snippet to the end of 'scripts/get_manual'
(2) cd into the bullseye manual directory
(3) run 'make status' to get the translation status
(4) add a pot_in line to po4a.cfg (file attached)
(5) run 'make update' to get the source from the wiki and generate files.
(6) run 'make status' again and compare the output to the previous one.
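
Steps (3), (5) and (6) can be wrapped in a small helper so that the
comparison is done by diff instead of by eye. This is only a sketch:
the function name compare_status is made up, and it assumes that steps
(1) and (4) have already been done by hand and that it is called from
the manual directory.

```shell
# Hypothetical helper: show how 'make update' changes the translation status.
compare_status() {
    before=$(mktemp)
    after=$(mktemp)
    make status > "$before" 2>&1    # step (3): status before the update
    make update                     # step (5): fetch source, regenerate files
    make status > "$after" 2>&1     # step (6): status after the update
    diff -u "$before" "$after"      # non-zero exit status if the output differs
    rc=$?
    rm -f "$before" "$after"
    return $rc
}
```

Calling 'compare_status' then prints a unified diff of the two status
reports, or nothing if the update left the status unchanged.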

To revert the changes, run 'git checkout *.po && git checkout *.pot' 

Wolfgang
# Create $stripped_xmlfile, a copy of $xmlfile with some non-translatable
# strings removed; Po4a uses it to create the POT and PO files.
# ---remove untranslatable image names--- #
stripped_xmlfile="$name-stripped.xml"
echo "removing image names"
# [^>]* keeps the match inside one <imagedata> tag, so a line holding
# several images does not lose the text between them (a greedy .* would).
sed -e 's#<imagedata[^>]*>[^<]*</imageobject>#</imageobject>#g' "$xmlfile" > "$stripped_xmlfile"
# ---remove paragraphs that just have a <ulink> and no other text--- #
echo "removing link paragraphs"
# Replace every line that starts with <para><ulink and ends with '>' by a
# bare <para> tag, to prevent the XML from being broken. A single sed call
# avoids passing URLs (which may contain '#' or regex characters) to sed
# as a pattern, and needs no temporary file.
sed -i -e '/^<para><ulink.*> *$/s#.*#<para>#' "$stripped_xmlfile"
# ---remove FIXME: paragraphs--- #
# ---(currently that colon is missing in some FIXME paragraphs)--- #
echo "removing FIXME: paragraphs"
sed -i -e '/^FIXME:/d' "$stripped_xmlfile"

[po_directory] .

[type: docbook] debian-edu-bullseye-manual.xml \
	pot_in:debian-edu-bullseye-manual-stripped.xml \
	$lang:$lang.xml \
	add_$lang:?./$lang.add \
	opt:"-M UTF-8 -k 15"
