Re: XLIFF tools
On 27/12/2005, at 12:55 PM, JC Helary wrote:
an XLIFF editor with filters for SGML based formats (HTML, XML,
Docbook, OpenOffice) and PO. It also provides a Translation Memory
server (which can use TMX files), and a tool to convert XLIFF
My understanding is that their support for XLIFF files is still not
complete. But the project is promising (previously known as
evil-trans, name changed to make sure the project gets mainstream
Gee, I wonder why... :D
There are dependencies that don't work very well with OSX right now
but that is not relevant for Debian.
Relevant to me, though. :(
Never assume everyone here (only/) uses Debian, fellow OSX user. ;)
Also, Debian runs on PowerPC, so there are common hardware and
operation issues there.
A translator GUI. It supports tag based formats (HTML,
plain text. It can use and generate a TMX. It generates the
documents and the XLIFF.
I'm not sure it can be packaged for Debian (requires Swing,
Also the dependency chain is quite important.
OmegaT is not an XLIFF editor. The next version (RC5 was released on
the 24th :) supports any level of TMX and passed the LISA
compliance test. LISA maintains the TMX standard, OASIS the XLIFF
Exported files are _not_ XLIFF.
OmegaT also supports Java properties bundles so that Java apps
(including OmegaT itself) can be localised in OmegaT.
We (I am a member of the dev team as tester/localiser/documentation
maintainer/user support/OSX bundle maintainer :) are working on
bilingual source files support: po/xliff/etc as well as DocBook and
generic xml source files (any helper is welcome btw.)
That would be terrific. I'm currently looking for something that will
easily convert docbook etc. xml files to PO so I can translate them.
I had trouble understanding the po4a docs. I really need something I
can install and use very simply.
As for the Apple's jar thing I am not sure it is _required_. If it
is, it is only for use on OSX and thus irrelevant to Debian.
As for the SWING thing, you are correct.
It seems this one doesn't have the same dependencies as OmegaT. It
should be easier to package on Debian. I don't know if it is as
complete as OmegaT (or maybe more complete).
Just for your info: the developer used to work on OmegaT as a
documentation maintainer. He broke all our previous translation
memories and eventually left the project to create his fork, since
he could not stand having translation-related things explained to
him by people who were not able to produce code (besides the fact
that the OmegaT lead developer is a translator and has localised the
Eclipse interface into Russian using OmegaT...) It may be that in the
future this fork will create good code, but the user/tester base is
nonexistent. Right now it has yet to produce any code.
Thanks for pointing that out. The main Omega-T project has always
produced well, and reacted positively and creatively to input or
questions. I speak from experience, as I was on the mailing list for
a while (until I had to reduce activity), and I reported the bug
about processing PHP files, which was addressed promptly and thoroughly.
* SUN's open-language-tools
It is licensed under CDDL. It can't enter main currently. I
if it could be shipped in non-free in Debian.
I've not tried it.
SUN had an in-house Java tool used for their own translation
process. It is a pure XLIFF editor using their in-house generated
XLIFF files (OOo has been localised with STE).
The package has been released under the name OpenLanguageTools. It
works well. I don't know the dependencies except for it being a
Now a number of OOo Language Native groups are working with OmegaT
on documentation translation for ease of use (and quality of user
support :) Although GUI files are still worked on with XLIFF files.
What doc format are the OOo groups using?
With xlifftool, we can probably convert any PO to create an XLIFF
However, I'm not sure we can do the reverse conversion with any XLIFF
That's something to test.
Reverse conversion is a sticky point for many "conversion" tools...
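
To make the round-trip question concrete, here is a minimal sketch in Python (standard library only) of a PO -> XLIFF -> PO conversion. It is not how xlifftool or LFE work internally. The function names are hypothetical, the XLIFF is a simplified 1.x-style skeleton without the namespace declaration, and the PO parser only handles single-line msgid/msgstr pairs (no plurals, contexts, comments or escaped quotes), which is exactly the kind of information a reverse conversion tends to lose.

```python
import re
import xml.etree.ElementTree as ET


def parse_po(text):
    """Return (msgid, msgstr) pairs from simple single-line PO entries."""
    pairs = re.findall(r'^msgid "(.*)"\nmsgstr "(.*)"$', text, flags=re.M)
    # The PO header is the entry with an empty msgid; skip it.
    return [(src, tgt) for src, tgt in pairs if src]


def po_to_xliff(pairs, src_lang, tgt_lang):
    """Build a simplified XLIFF document (assumed layout, not schema-validated)."""
    xliff = ET.Element("xliff", version="1.1")
    file_el = ET.SubElement(xliff, "file", {
        "original": "messages.po",  # hypothetical file name
        "source-language": src_lang,
        "target-language": tgt_lang,
        "datatype": "po",
    })
    body = ET.SubElement(file_el, "body")
    for number, (src, tgt) in enumerate(pairs, 1):
        unit = ET.SubElement(body, "trans-unit", id=str(number))
        ET.SubElement(unit, "source").text = src
        ET.SubElement(unit, "target").text = tgt
    return ET.tostring(xliff, encoding="unicode")


def xliff_to_po(xml_text):
    """Reverse step: recover (source, target) pairs from the sketch's XLIFF."""
    root = ET.fromstring(xml_text)
    return [(unit.findtext("source") or "", unit.findtext("target") or "")
            for unit in root.iter("trans-unit")]
```

The round trip is only lossless here because the sketch ignores everything PO and XLIFF do not share. A real test would feed in files with fuzzy flags, plural forms and translator comments and compare what comes back.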
Also, did you encounter any issues due to providing POs to
LocFactoryEditor? Would the translations be easier if we could
XLIFF files instead of POs?
I use LocFactoryEditor (I am working on the French localisation
right now). LFE's po/xliff support is equivalent.
The advantage of xliff files is that, as you noticed yourself,
there are plenty of solid options outside the Linux world while .po
files are restricted to Fink's kbabel/gtranslator or LFE.
Plus POEdit (still not ported to OSX), and emacs PO mode for you
emacs-devotees out there :)
Knowing that kbabel/gtranslator are X11 apps (even on OSX) there
are character input systems issues:
That's a very polite way to express it. <fume>
everything needs to be set up specially to get either of those 2
apps to work and it is a pain in the butt.
Most definitely. :(
I never succeeded in getting gtranslator installed, and although I
finally got kbabel up and open, I couldn't input Vietnamese at all.
This, I realized, was somewhat of a barrier to translation...
The only valid option on OSX is LFE, but it is not a free (in either
sense) application, even though the demo mode works without any
annoying limitation (a bunch of function limitations though).
The PO-only editor is free. It was created especially for open-source
The paid editor was originally created for professional translators,
who use the XLIFF tools, so if you want those as well as the PO
functions, you buy the full version. It's certainly affordable,
especially in comparison with things like Heartsome and Wordfast,
heavily-advertized professional translation editors.
For contributions to Debian from outside the Debian world, xliff
files would be appreciated.
It would certainly encourage more professional translators to
contribute to open-source i18n.
If you want to test this software, I've made Debian packages for
Transolution, xlifftool and XML-TMX: https://nekral.homelinux.net/
(online when I'm not sleeping)
OmegaT doesn't need to be installed (java -jar omegat.jar should be
sufficient to test it).
You are right. OmegaT was first of all designed as an app that
works crossplatform because of the lack of translator's tools on
Linux. It is supposed to work perfectly well on Linux. The recent
RC includes preference files set as each platform demands: in
~/.omegat on Linux.
We still need to work a lot for full Linux "acceptance" but there
is a developer who ported OmegaT 1.4.5 to Gentoo if you are
1.4.5 will be obsoleted as soon as 1.6 is released though (sometime
I think more conversion features will introduce Omega-T to a larger
Generally speaking, my earlier comments on the .po format are
limited by my lack of familiarity with the format. I apologise for
any misunderstanding born from my mails.
Your experience is very welcome and useful. :)
I think though that it is clear that .po as well as .xliff are both
_localisation_ oriented formats while .xliff also provides native
support for documentation translation. In short, it is
_conceptually_ easier to fully localise an application (GUI/Docs in
various formats) by using an exclusively .xliff based process than
by using an exclusively .po based one.
Efficiency is not the only value involved, however. If people are
used to the PO format, and it forms the major part of their workflow,
then it has mindshare: adopting another format, however useful, is
extra work and thus less attractive, especially when people are
trying to fit in time for translation. That's why conversion tools
are likely to attract more translators: come one, come all, pick your
favourite format and translate!
With tweaks slowly becoming available to provide .po<->.xliff
conversions, both formats could be considered equivalent for a subset
of the functions provided by .xliff, though. If that subset is enough
for Debian it should be a satisfactory option.
It will be a big advantage to have conversion tools (like LFE for
Mac) which can convert between translation compendia and TMX
translation memory. Linking in to the current standards for the
industry can only benefit open source i18n.
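
As an illustration of the compendium-to-TMX direction, here is a minimal Python sketch (standard library only). It is not LFE's implementation. The function name and the creationtool value are made up for the example, and it assumes the compendium has already been read into (source, target) string pairs.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this qualified name out as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def pairs_to_tmx(pairs, src_lang, tgt_lang):
    """Turn (source, target) pairs into a small TMX 1.4-style document."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "po2tmx-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "po",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        if not target:
            continue  # untranslated entries add nothing to a memory
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")
```

Each translated pair becomes one <tu> with one <tuv> per language. Untranslated entries are dropped, which is a design choice of this sketch rather than anything the TMX format demands.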
As far as translation management is concerned though, it seems to
me that translation variants are better handled by xliff and this
specific item should greatly enhance the translation process in
Debian if properly implemented:
In a <tu> it could be possible to have previous versions <tuv> that
do not differ in meaning but only in structure (either spellcheck
or syntax, punctuation corrections) and those could be set to be
equivalent to the target language <tuv> without any trouble, which
seems (?) to be difficult right now in a .po setting.
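
One way to picture the "equivalent variants" idea is the following Python sketch. It groups segments that differ only in punctuation or spacing, so that a tool could treat them as a single entry. The rule used here (strip Unicode punctuation, collapse whitespace) and both function names are assumptions made for illustration, not something defined by TMX or XLIFF, and spelling corrections would need something smarter.

```python
import unicodedata


def normalise(segment):
    """Reduce a segment to a comparison key: no punctuation, single spaces."""
    text = unicodedata.normalize("NFC", segment)
    # Unicode general categories starting with "P" are punctuation.
    text = "".join(ch for ch in text
                   if not unicodedata.category(ch).startswith("P"))
    return " ".join(text.split())


def group_variants(segments):
    """Group segments whose normalised keys match, keeping input order."""
    groups = {}
    for segment in segments:
        groups.setdefault(normalise(segment), []).append(segment)
    return list(groups.values())
```

Under this rule, "Open the file." and "Open  the file" fall into the same group, while "Close the file." stays on its own.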
The fact that xliff interacts fully with other translation
standards (tbx for glossaries, srx for segmentation, tmx for tm
exchange) greatly enhances the translator's experience and allows a
Debian localiser to easily leverage her/his work with external
sources that would require quite a lot of fiddling right now, while
keeping the output result consistent with the Debian po based
There's certainly a lot of extra function there, but it won't happen
unless people with those skills start contributing, or current
contributors acquire those skills. We do the best we can at the time
with what we have most conveniently available.
If gettext were not involved at all, a fully xliff based process
would be valid. Since gettext is at the core of Debian,
inevitably .po comes in and the localisation functions of .xliff
are thus redundant.
We need to bear in mind the essential rôle gettext plays in
extracting translatable strings from applications written in
different languages. This aspect of internationalization is fully as
complex as translation. Converting that process to xliff or any other
standard would be a huge task.
Is it valid to use .xliff for documentation only and .po for GUI
work only ? Is it possible to create TMs from the GUI work that can
be used with the Documentation work ?
Since LFE can convert PO compendia to TMX, certainly. Or do we mean
Is it possible to work with multiple files and file formats on one
localisation project to have transparent access to the whole data
set (like OmegaT does) ?
This is certainly my dream. I dream of logging into something like
Pootle or the Translation Project and having _all_ my files, from all
my projects, available, current and accessible to translate/update
and commit/submit. All from one interface. :)
Dashing from one project to another, learning all the different
procedures, checking all the different currency displays every day,
keeping up with all the different mailing lists, is extremely time-
consuming (worse, for me, confusing), and I would much rather spend
that time actually __translating__.
At last count I was actively translating for 5 multi-application
projects and 9 individual projects. That's probably very little
compared to some people, but co-ordinating it all takes up way too
much of my available translating time. :(
I am not sure all the above questions are relevant since I don't
know the current process but it seems to me they are questions that
occur in any multi-file format translation process within the free
and non free world equally...
Your questions are perfectly viable. It's only by questioning
previous procedures that we've decided to develop the ones we have
from Clytie (vi-VN, Vietnamese free-software translation team / nhóm
Việt hóa phần mềm tự do)