
XLIFF tools (was: Work on a centralized infrastructure for i18n/l10n)



On Fri, Dec 23, 2005 at 01:46:04PM +0900, JC Helary wrote:

> Sounds like a lecture for freshmen CAT tool users but go ahead.

The point is not to give a lecture.
I'm developing a string extraction tool. I would like to know whether the
advantages of the tools you use lie in the string extraction step.
If the advantages are in the storage step or in the translator GUI, I
can't do anything about them.
(If you want an XLIFF module in po4a, just file a wishlist bug; I will
probably implement it.)

> >Thus IMO, advantages of the XLIFF format should be demonstrated by
> >considering XLIFF as the database.
> >I won't consider having good translation tools or string extraction
> >tools that deal with XLIFF files as an advantage of the XLIFF format.
> 
> Well, then, if you won't why bother going that far trying to  
> demonstrate that anything is just as good as anything else ?
> 
> >My preferred translation tool is vi. Thus, as a translator, I prefer
> >PO to XLIFF or a mysql database.
> 
> And here lies the problem: you consider vi as a translation tool. Now
> tell me, how many people who do translations on a daily basis would
> consider vi as a translation tool ?

I've seen a lot of people editing PO files with only text editors (vi,
Notepad, MS Word). Of course they are missing a lot of features (like
translation memories), but they exist and are doing a good job.
(And if there are translators who use XLIFF files, it could be interesting
to generate XLIFF files as well.)

> What I am saying (maybe on my 3rd or 4th mail) is that opening the  
> translation process to people outside the GNU/Linux world could be  
> the result of adopting a translation industry standard that benefits  
> the end translator because more tools exist with more options on more  
> platforms with less nerdness to overcome before actually starting to  
> translate. And since that end user format _happens_ to be handled as  
> a TM management format as well (and can easily be transformed to the  
> TM "exchange" format that is TMX also an industry standard) why not  
> use that format as the storage format. That would save transformations.

So we may need good PO-to-XLIFF tools.
Does anybody know of such tools?
None seems to be packaged in Debian. The only XLIFF-related tool is
transolution, which is still an RFP (Request For Package).
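
To make concrete what such a PO-to-XLIFF conversion involves, here is a
minimal sketch in Python. It is not how po2xlf or any other existing tool
works. It assumes an XLIFF 1.2-style namespace and the "po" datatype, the
function names (unquote, parse_po, po_to_xliff) are made up for this
example, and the PO reader ignores plural forms, msgctxt, comments and
fuzzy flags.

```python
import re
import xml.etree.ElementTree as ET

def unquote(quoted):
    """Strip the surrounding double quotes and decode simple backslash escapes."""
    escapes = {"n": "\n", "t": "\t"}
    return re.sub(r"\\(.)", lambda m: escapes.get(m.group(1), m.group(1)),
                  quoted.strip()[1:-1])

def parse_po(text):
    """Tiny PO reader: msgid/msgstr pairs with continuation lines only."""
    entries, current, field = [], {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("msgid "):
            if "msgid" in current:          # previous entry is complete
                entries.append(current)
            current = {"msgid": unquote(line[6:]), "msgstr": ""}
            field = "msgid"
        elif line.startswith("msgstr "):
            current["msgstr"] = unquote(line[7:])
            field = "msgstr"
        elif line.startswith('"') and field:
            current[field] += unquote(line)  # continuation line
    if "msgid" in current:
        entries.append(current)
    # The entry with an empty msgid is the PO header; skip it.
    return [(e["msgid"], e["msgstr"]) for e in entries if e["msgid"]]

def po_to_xliff(po_text, source_lang="en", target_lang="fr",
                original="messages.po"):
    """Build a bilingual XLIFF document with one trans-unit per PO entry."""
    root = ET.Element("xliff", {
        "version": "1.2",  # assumed version, see the note above
        "xmlns": "urn:oasis:names:tc:xliff:document:1.2",
    })
    file_el = ET.SubElement(root, "file", {
        "original": original,
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "po",
    })
    body = ET.SubElement(file_el, "body")
    for number, (source, target) in enumerate(parse_po(po_text), 1):
        unit = ET.SubElement(body, "trans-unit", {"id": str(number)})
        ET.SubElement(unit, "source").text = source
        if target:                           # untranslated entries get no <target>
            ET.SubElement(unit, "target").text = target
    return ET.tostring(root, encoding="unicode")
```

Calling po_to_xliff() on the text of a PO file returns the XLIFF document
as a string. A real converter would also have to carry over the comments,
source references and fuzzy state, which is probably where most of the
work lies.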

I've seen on http://xliff-tools.freedesktop.org/wiki/Download that there
are two sets of tools (written in C and Java). Both can be installed on
Debian (the Java-based tools use two JAR files; I have not checked their
licence, and I don't know whether their sources are available).

The C-based tool suite contains: po2xlf, xlf2po, xlfpoinit, xlfpomerge.

Does anybody use them?
Are the Java tools better? (The C tools have not been released; they are
only available from CVS. They seem to work, but they have not been changed
in 6 months, since the initial import.)

Are there any XLIFF command-line tools like the gettext msg* tools?
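
To illustrate the kind of command-line helper I have in mind, here is a
rough Python sketch of a statistics function similar in spirit to
"msgfmt --statistics". The function name xliff_stats is hypothetical, and
the rule that a trans-unit counts as translated when its <target> has
non-empty text is my own simplification (a target made only of inline
elements would be counted as untranslated).

```python
import xml.etree.ElementTree as ET

def local_name(element):
    """Return the tag name without any '{namespace}' prefix."""
    return element.tag.rsplit("}", 1)[-1]

def xliff_stats(xliff_text):
    """Return (translated, untranslated) counts of trans-unit elements."""
    translated = untranslated = 0
    for element in ET.fromstring(xliff_text).iter():
        if local_name(element) != "trans-unit":
            continue
        # Look for a direct <target> child of this trans-unit.
        target = next((child for child in element
                       if local_name(child) == "target"), None)
        if target is not None and (target.text or "").strip():
            translated += 1
        else:
            untranslated += 1
    return translated, untranslated
```

Matching on the local tag name keeps the sketch independent of the XLIFF
version, since the namespace URI differs between versions.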


> >It seems you like to have sentences separated. This is not related to
> >using XLIFF. This is a string extraction issue.
> >Note however that most of the complaints I receive for po4a (about its
> >string extraction features) are that there is too little context in the
> >strings proposed to the translators, not that paragraphs should be
> >split into sentences.
> 
> Now you are talking about another problem: suitability of the format
> to the task. And I think your translators are very much aware of
> that: sentence translation is not appropriately handled by po based
> tools. Just like you mention in your other mail: multilingual bodies
> are not properly handled by po based tools either.

This is not a format issue.
If a sentence segmenter were requested for PO files, it would be developed.

(If anybody wants a sentence segmenter using SRX definitions, just file a
wishlist bug against po4a. It can probably be done easily with its Po
module. This could be useful for speeding up translation with translation
memories, but I don't think it would improve the translations.)
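
To show that rule-based segmentation does not depend on the file format,
here is a minimal Python sketch of SRX-style segmentation. It is not po4a
code and it does not read real SRX files. The RULES list and the segment
function are invented for the example. Each rule is a (break, before,
after) triple, and at every position the first rule whose two patterns
match decides whether to split there, which is how I understand SRX rules
to be applied.

```python
import re

# (is_break, pattern before the position, pattern after the position).
# Exception (no-break) rules come first so that they win over break rules.
RULES = [
    (False, r"\b(?:Mr|Mrs|Dr|e\.g|i\.e)\.", r"\s"),
    (True, r"[.?!]+", r"\s+"),
]

def segment(text, rules=RULES):
    """Split text into sentences using ordered break / no-break rules."""
    compiled = [
        # \Z anchors the 'before' pattern at the candidate position.
        (is_break, re.compile("(?:" + before + r")\Z"), re.compile(after))
        for is_break, before, after in rules
    ]
    segments, start = [], 0
    for position in range(1, len(text)):
        for is_break, before, after in compiled:
            # 'before' must end exactly at position, 'after' must start there.
            if before.search(text, 0, position) and after.match(text, position):
                if is_break:
                    segments.append(text[start:position])
                    start = position
                break  # the first matching rule decides
    segments.append(text[start:])
    # Whitespace around segments is dropped here for readability.
    return [s.strip() for s in segments if s.strip()]
```

Applied to each msgid of a PO file, this would give sentence-sized units
for translation memory lookups. The scan is quadratic in the length of the
string, which is acceptable for paragraphs but not for whole documents.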

> Context was an issue in the translation world before po ever existed
> and before computer guys realized there was a need for localizing
> their strings. And the proof that they did not quite get it the right
> way is the character set issues that eventually are starting to get
> solved with Unicode being pretty much generally accepted.

I'm not sure we are talking about the same thing.

How do you know how to translate "She is moving!" into French without
knowing whether "She" is a baby (whose sex is not known) or a woman?
You need context, and thus some translators prefer translating paragraphs
rather than sentences.
I don't deny that others may prefer smaller strings, to improve string
reuse and speed up translation.

> >It seems you prefer XLIFF for HTML translations. I don't know why,
> >but it is probably not related to the XLIFF format. Maybe it is just
> >that the tool you use with your XLIFF files is better than what a PO
> >tool would have done (at what step: string extraction, translation?).
> 
> No I don't prefer xliff for html transformations, I don't transform,  
> I translate.

Then I don't understand how the translation process of a project should
work. Will you just send the translated file? Do you also send a TM file?


> >One feature I don't like in XLIFF is having multiple languages in
> >the same XLIFF file. This was proved wrong with the debconf
> >translations (multiple translation updates can't be committed).
> 
> 1) an XLIFF file is not required to have more than 2 translation unit
> variants.
> 2) if you use formats that are not designed to support multiple tuv,
> chances are that the result will not be satisfying.

I don't understand 2). Or maybe I have not understood what the variants
are (German and Chinese, or French from France and French from Canada?).

My point is that if the format exchanged between developers and
translators contains multiple languages, there is a good chance that the
developers won't accept patches (or will ask all the translators to
centralize their translations before submitting patches).

> Joyeux Noël anyway :)

Merry Christmas to you too, and to all the list.

-- 
Nekral


