
Re: XLIFF tools

On 27/12/2005, at 12:55 PM, JC Helary wrote:

 * Transolution
   an XLIFF editor with filters for SGML based formats (HTML, XML,
   Docbook, OpenOffice) and PO. It also provides a Translation Memory
   server (which can use TMX files), and a tool to convert XLIFF to TMX.

My understanding is that their support for XLIFF files is still not complete. But the project is promising (it was previously known as evil-trans; the name was changed to make sure the project gets mainstream acceptance... :)

Gee, I wonder why... :D

There are dependencies that don't work very well with OSX right now, but that is not relevant for Debian.

Relevant to me, though. :(

Never assume everyone here (only/) uses Debian, fellow OSX user. ;)

Also, Debian runs on PowerPC, so there are common hardware and operation issues there.

 * OmegaT
   A translator GUI. It supports tag-based formats (HTML, OpenOffice) and
   plain text. It can use and generate a TMX. It generates the translated
   documents and the XLIFF.
   I'm not sure it can be packaged for Debian (requires Swing, requires an
   Apple .jar).
   Also, the dependency chain is quite long.

OmegaT is not an XLIFF editor. The next version (RC5 was released on the 24th :) supports any level of TMX and passed the LISA compliance test. LISA maintains the TMX standard, OASIS the XLIFF standard.

Exported files are _not_ XLIFF.

OmegaT also supports Java properties bundles so that Java apps (including OmegaT itself) can be localised in OmegaT.

We (I am a member of the dev team as tester/localiser/documentation maintainer/user support/OSX bundle maintainer :) are working on support for bilingual source files (po/xliff/etc.) as well as DocBook and generic XML source files (any help is welcome, btw).

That would be terrific. I'm currently looking for something that will easily convert docbook etc. xml files to PO so I can translate them. I had trouble understanding the po4a docs. I really need something I can install and use very simply.
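
To make the DocBook-to-PO idea concrete, here is a very rough Python sketch. It is not po4a and not anything OmegaT provides. It assumes un-namespaced (DocBook 4 style) XML, only looks at <title> and <para> elements, flattens inline markup such as <emphasis>, and will repeat text if paragraphs are nested. The function names docbook_to_pot and po_escape are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def po_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def docbook_to_pot(xml_text, tags=("title", "para")):
    """Return PO template text with one empty entry per extracted block.

    Assumption: un-namespaced DocBook; inline markup is dropped.
    """
    root = ET.fromstring(xml_text)
    seen = set()
    lines = []
    for element in root.iter():
        if element.tag not in tags:
            continue
        # Join all nested text and collapse runs of whitespace.
        text = " ".join("".join(element.itertext()).split())
        if text and text not in seen:
            seen.add(text)
            lines += ['msgid "' + po_escape(text) + '"', 'msgstr ""', ""]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "<article><title>Guide</title><para>Click <emphasis>Save</emphasis> now.</para></article>"
    print(docbook_to_pot(sample))
```

A real converter also has to put the translated strings back into the XML structure, which is the harder half and is not attempted here.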

As for the Apple jar thing, I am not sure it is _required_. If it is, it is only for use on OSX and thus irrelevant to Debian.


As for the Swing thing, you are correct.

 * omegat+
   It seems this one doesn't have the same dependencies as OmegaT. It
   should be easier to package on Debian. I don't know if it is as
   complete as OmegaT (or maybe more complete).

Just for your info: the developer used to work on OmegaT as a documentation maintainer. He broke all our previous translation memories and eventually left the project to create his fork, since he could not stand having translation-related things explained to him by people who were not able to produce code (never mind the fact that the OmegaT lead developer is a translator and localised the Eclipse interface into Russian using OmegaT...). It may be that this fork will produce good code in the future, but its user/tester base is nonexistent. Right now it has yet to produce any code.

Thanks for pointing that out. The main OmegaT project has always produced well, and reacted positively and creatively to input or questions. I speak from experience, as I was on the mailing list for a while (until I had to reduce activity), and I reported the bug about processing PHP files, which was addressed promptly and thoroughly.

 * SUN's open-language-tools
   It is licensed under CDDL. It can't enter main currently. I don't know
   if it could be shipped in non-free in Debian.
   I've not tried it.

SUN had an in-house Java tool used for their own translation process. It is a pure XLIFF editor using their in-house generated XLIFF files (OOo has been localised with STE). The package has been released under the name OpenLanguageTools. It works well; I don't know the dependencies, except that it is a Java app. A number of OOo Native Language groups are now working with OmegaT on documentation translation for ease of use (and quality of user support :), although GUI files are still handled as XLIFF files.

What doc format are the OOo groups using?

With xlifftool, we can probably convert any PO to create an XLIFF file.
However, I'm not sure we can do the reverse conversion with any XLIFF

That's something to test.

Reverse conversion is a sticky point for many "conversion" tools...

Also, did you encounter any issues due to providing POs to
LocFactoryEditor? Would the translations be easier if we could provide you
XLIFF files instead of POs?

I use LocFactoryEditor (I am working on the French localisation right now). LFE's po/xliff support is equivalent.

Entirely. :)

The advantage of xliff files is that, as you noticed yourself, there are plenty of solid options outside the Linux world while .po files are restricted to Fink's kbabel/gtranslator or LFE.

Plus POEdit (still not ported to OSX), and emacs PO mode for you emacs-devotees out there :)

Given that kbabel/gtranslator are X11 apps (even on OSX), there are character input system issues:

That's a very polite way to express it. <fume>

everything needs to be set up specially to get either of those 2 apps to work, and it is a pain in the butt.

Most definitely. :(

I never succeeded in getting gtranslator installed, and although I finally got kbabel up and open, I couldn't input Vietnamese at all. This, I realized, was somewhat of a barrier to translation...

The only valid option on OSX is LFE but it is not a free (either way) application even if the demo mode is working without any annoying limitation (a bunch of function limitations though).

The PO-only editor is free. It was created especially for open-source translators.

The paid editor was originally created for professional translators, who use the XLIFF tools, so if you want those as well as the PO functions, you buy the full version. It's certainly affordable, especially in comparison with things like Heartsome and Wordfast, heavily-advertized professional translation editors.

For contributions to Debian from outside the Debian world, xliff files would be appreciated.

It would certainly encourage more professional translators to contribute to open-source i18n.

If you want to test this software, I've made Debian packages for
Transolution, xlifftool and XML-TMX: https://nekral.homelinux.net/pootle/
(online when I'm not sleeping)
OmegaT doesn't need to be installed (java -jar omegat.jar should be
sufficient to test it).

You are right. OmegaT was designed first of all as an app that works cross-platform, because of the lack of translators' tools on Linux. It is supposed to work perfectly well on Linux. The recent RC stores preference files where each platform expects them: in ~/.omegat on Linux.

We still need to work a lot for full Linux "acceptance" but there is a developer who ported OmegaT 1.4.5 to Gentoo if you are interested:

1.4.5 will be obsoleted as soon as 1.6 is released though (sometime in January).

I think more conversion features will introduce OmegaT to a larger user base.

Generally speaking, my earlier comments on the .po format are limited by my lack of familiarity with the format. I apologise for any misunderstanding born from my mails.

Your experience is very welcome and useful. :)

I think though that it is clear that .po as well as .xliff are both _localisation_ oriented formats while .xliff also provides native support for documentation translation. In short, it is _conceptually_ easier to fully localise an application (GUI/Docs in various formats) by using an exclusively .xliff based process than by using an exclusively .po based one.

Efficiency is not the only value involved, however. If people are used to the PO format, and it forms the major part of their workflow, then it has mindshare: adopting another format, however useful, is extra work and thus less attractive, especially when people are trying to fit in time for translation. That's why conversion tools are likely to attract more translators: come one, come all, pick your favourite format and translate!

Since tweaks are slowly becoming available to provide .po<->.xliff conversions, both formats could be considered equivalent for a subset of the functions provided by .xliff, though. If that subset is enough for Debian, it should be a satisfying option.

It will be a big advantage to have conversion tools (like LFE for Mac) which can convert between translation compendia and TMX translation memory. Linking in to the current standards for the industry can only benefit open source i18n.

As far as translation management is concerned though, it seems to me that translation variants are better handled by xliff and this specific item should greatly enhance the translation process in Debian if properly implemented:

In a <tu> it could be possible to have previous-version <tuv>s that do not differ in meaning but only in form (spelling, syntax or punctuation corrections), and those could be set as equivalent to the target-language <tuv> without any trouble, which seems (?) to be difficult right now in a .po setting.
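
As a rough illustration of what "equivalent variants" could mean in practice, here is a Python sketch that treats two source strings as the same segment when they differ only in case, punctuation or spacing, so the existing translation can be reused. This is an assumption about one possible matching rule, not something defined by XLIFF or TMX; it does not cover spelling corrections, and the names normalise and reuse_translation are invented for the example.

```python
import unicodedata


def normalise(segment):
    """Reduce a segment to a key that ignores case, punctuation and spacing."""
    text = unicodedata.normalize("NFC", segment).casefold()
    # Unicode general categories starting with "P" are punctuation.
    text = "".join(ch for ch in text if not unicodedata.category(ch).startswith("P"))
    return " ".join(text.split())


def reuse_translation(new_source, memory):
    """Return the stored target whose source differs only cosmetically, or None.

    `memory` maps previous source strings to their translations.
    """
    key = normalise(new_source)
    for old_source, target in memory.items():
        if normalise(old_source) == key:
            return target
    return None


if __name__ == "__main__":
    memory = {"Save file ?": "Enregistrer le fichier ?"}
    print(reuse_translation("Save file?", memory))
```

In a .po workflow the closest equivalent is a fuzzy match that a translator has to confirm by hand; the point of the sketch is only that purely cosmetic changes can be detected mechanically.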

The fact that xliff interacts fully with other translation standards (tbx for glossaries, srx for segmentation, tmx for tm exchange) greatly enhances the translator's experience and allows a Debian localiser to easily leverage her/his work with external sources that would require quite a lot of fiddling right now, while keeping the output result consistent with the Debian po based localisation framework.

There's certainly a lot of extra function there, but it won't happen unless people with those skills start contributing, or current contributors acquire those skills. We do the best we can at the time with what we have most conveniently available.

If gettext were not involved at all, a fully xliff-based process would be valid. Since gettext is at the core of Debian, .po inevitably comes in, and the localisation functions of .xliff are thus redundant.

We need to bear in mind the essential rôle gettext plays in extracting translatable strings from applications written in different languages. This aspect of internationalization is fully as complex as translation. Converting that process to xliff or any other standard would be a huge task.

Is it valid to use .xliff for documentation only and .po for GUI work only ? Is it possible to create TMs from the GUI work that can be used with the Documentation work ?

Since LFE can convert PO compendia to TMX, certainly. Or do we mean "probable"? ;)
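
For anyone without LFE, the PO-compendium-to-TMX step can be sketched in a few lines of Python. This is not LFE's method, just an illustration under stated assumptions: the tiny PO reader below only understands plain msgid/msgstr entries (no plurals, no msgctxt, fuzzy flags ignored, common escapes only), and the tool name "po2tmx-sketch" in the TMX header is made up.

```python
import ast
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def parse_po(text):
    """Very small PO reader: returns translated (msgid, msgstr) pairs only."""
    entries, current, field = [], {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            if "msgstr" in current:
                entries.append((current["msgid"], current["msgstr"]))
            current, field = {"msgid": ast.literal_eval(line[6:])}, "msgid"
        elif line.startswith("msgstr "):
            current["msgstr"], field = ast.literal_eval(line[7:]), "msgstr"
        elif line.startswith('"') and field:
            # Continuation line of a multi-line string.
            current[field] += ast.literal_eval(line)
    if "msgstr" in current:
        entries.append((current["msgid"], current["msgstr"]))
    # Drop the header entry (empty msgid) and untranslated strings.
    return [(s, t) for s, t in entries if s and t]


def pairs_to_tmx(pairs, source_lang="en", target_lang="vi"):
    """Build a TMX 1.4 document with one <tu> per translated pair."""
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "creationtool": "po2tmx-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "po",
        "adminlang": "en",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((source_lang, source), (target_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")
```

A TMX produced this way from GUI .po files could then, in principle, be loaded as a translation memory when working on the documentation, which is the reuse asked about above.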

Is it possible to work with multiple files and file formats in one localisation project and have transparent access to the whole data set (like OmegaT does) ?

This is certainly my dream. I dream of logging into something like Pootle or the Translation Project and having _all_ my files, from all my projects, available, current and accessible to translate/update and commit/submit. All from one interface. :)

Dashing from one project to another, learning all the different procedures, checking all the different currency displays every day, keeping up with all the different mailing lists, is extremely time-consuming (worse, for me, confusing), and I would much rather spend that time actually __translating__.

At last count I was actively translating for 5 multi-application projects and 9 individual projects. That's probably very little compared to some people, but co-ordinating it all takes up way too much of my available translating time. :(

I am not sure all the above questions are relevant, since I don't know the current process, but it seems to me they are questions that occur in any multi-file-format translation process, in the free and non-free worlds equally...

Your questions are perfectly viable. It's only by questioning previous procedures that we've decided to develop the ones we have today. :)

from Clytie (vi-VN, Vietnamese free-software translation team / nhóm Việt hóa phần mềm tự do)
