Re: Google summer of code: i18n infrastructure
> Can you make your complete proposal public too? Maybe there are some
> important ideas that I have not considered. As I mentioned, I see it
> as my goal to collect all the different ideas floating around ;)
> Thanks.
I'm putting diagrams online:
http://www.kiberpipa.org/~hruske/stuff/debian-i18n/
Comment on bigpicture.png:
Does the message gateway hold the actual messages, with programs pulling
translations from it, or is it just an API proxy, with the actual program
repositories holding the messages? With storage, the question is how to keep
messages synchronized at all times; without storage, it puts some additional
load on the subversion server.
Statistics can be generated at this point since all messages pass through this
gateway.
Translation servers are not hosted by Debian, but rather by individual
translation teams or groups of teams. This also shifts the burden of hardware
demands away from Debian. All we need to do is provide really good web
translation software that is capable of translating more than just Debian.
Debian servers, however, will only provide translation for Debian-generated
strings, and only if there's no translation server registered for the specific
language.
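A minimal sketch of that routing rule, as I read it. The Gateway class, its
method names and the fallback URL are all hypothetical and only illustrate the
idea; nothing here is taken from the diagrams.

```python
from typing import Dict, Optional

# Hypothetical address of the Debian-hosted fallback server (an assumption).
DEBIAN_FALLBACK = "https://i18n.debian.example/translate"


class Gateway:
    """Routes a language to the translation server responsible for it."""

    def __init__(self) -> None:
        # Maps a language code to the URL of the team-hosted server.
        self._servers: Dict[str, str] = {}

    def register(self, lang: str, url: str) -> None:
        """A translation team registers its own server for a language."""
        self._servers[lang] = url

    def route(self, lang: str, debian_generated: bool) -> Optional[str]:
        """Return the server URL for a string, or None if nobody handles it."""
        if lang in self._servers:
            # A team-hosted server always takes precedence.
            return self._servers[lang]
        if debian_generated:
            # Debian only steps in for its own strings.
            return DEBIAN_FALLBACK
        return None
```

With this, a language with a registered team server never touches Debian
hardware, and non-Debian strings for an unregistered language are simply not
handled.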
By providing a good web translation portal Debian not only helps itself, but
also projects like GNOME, KDE and others, because translators for a specific
language would only need one interface for all the open source projects.
Message handling overhead is also greatly diminished: a translator can easily
fix a few fuzzy strings with the web interface, whereas if he had to use the
old-fashioned process (checkout, translate, send to reviewer, reviewer sends
to project i18n admin, admin commits), he would probably not do it.
Comment on transserver.png:
Untranslated messages come to storage, where they wait for a translator. When
a translator starts translating a message, it is passed through translation
memory, then through the coordination robot to let it know that this message
is being translated, and is then fed to the web interface, packed into a mail
message or downloaded with subversion. When messages are translated, they get
submitted back for review. If the review of a message is good, the message
gets stored and the translation is added to translation memory for further
use. Reviewed messages get handed to the coordination robot to free them for
further work, and are then submitted back to storage. When enough messages
are translated or the upstream project's freeze is close, the project admin
tells the server to submit the messages upstream.
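The message life cycle described above can be sketched as a small state
machine. This is only an illustration under my own assumptions: the state
names, the TranslationServer class and the handling of a rejected review (the
message goes back to its translator) are not in the diagram.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class State(Enum):
    UNTRANSLATED = "untranslated"  # waiting in storage
    CLAIMED = "claimed"            # coordination robot knows someone works on it
    IN_REVIEW = "in_review"        # submitted back, waiting for a reviewer
    REVIEWED = "reviewed"          # stored, ready to go upstream


@dataclass
class Message:
    msgid: str
    msgstr: str = ""
    state: State = State.UNTRANSLATED
    translator: Optional[str] = None


@dataclass
class TranslationServer:
    # Translation memory: original string -> last approved translation.
    memory: Dict[str, str] = field(default_factory=dict)

    def claim(self, msg: Message, translator: str) -> Optional[str]:
        """Mark a message as being translated; return a memory suggestion."""
        if msg.state is not State.UNTRANSLATED:
            raise ValueError("message is already being handled")
        msg.state = State.CLAIMED
        msg.translator = translator
        return self.memory.get(msg.msgid)

    def submit(self, msg: Message, msgstr: str) -> None:
        """The translator hands the translation in for review."""
        if msg.state is not State.CLAIMED:
            raise ValueError("message was not claimed")
        msg.msgstr = msgstr
        msg.state = State.IN_REVIEW

    def review(self, msg: Message, approved: bool) -> None:
        """A good review stores the message and feeds translation memory."""
        if msg.state is not State.IN_REVIEW:
            raise ValueError("message is not waiting for review")
        if approved:
            self.memory[msg.msgid] = msg.msgstr
            msg.state = State.REVIEWED
            msg.translator = None  # the coordination robot frees the message
        else:
            # Assumption: a rejected message returns to its translator.
            msg.state = State.CLAIMED
```

Submitting upstream would then just be a matter of collecting all messages in
the REVIEWED state when the project admin asks for it.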
The web interface has spellcheck, dictionary, format check, download and
upload integrated. If this software is to be 'the killer translation server
app', it would need support for policies (e.g. what is not allowed in
translations, how some words are always translated, a filter for common
mistakes ...) and support for translation memory tied to policy (e.g. because
KDE translation guidelines differ from GNOME's). It also needs support for
searching, since a translator can help himself by searching for a similar
string.
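Searching for a similar string could look roughly like this. It is a sketch
that uses Python's standard difflib module as a stand-in for a real
translation memory search; the search_memory function and the sample strings
are made up for illustration.

```python
import difflib
from typing import Dict, List, Tuple


def search_memory(
    query: str,
    memory: Dict[str, str],
    limit: int = 3,
    cutoff: float = 0.6,
) -> List[Tuple[str, str]]:
    """Return up to `limit` (original, translation) pairs similar to `query`.

    `memory` maps original strings to their stored translations. Results are
    ordered from most to least similar; anything scoring below `cutoff` is
    dropped.
    """
    matches = difflib.get_close_matches(query, list(memory), n=limit, cutoff=cutoff)
    return [(original, memory[original]) for original in matches]


# Made-up sample memory (English -> Slovenian).
sample_memory = {
    "Open file": "Odpri datoteko",
    "Save file": "Shrani datoteko",
    "Quit": "Izhod",
}
suggestions = search_memory("Open files", sample_memory)
```

A per-policy memory (one for KDE, one for GNOME) would then simply be a
separate dictionary for each set of guidelines.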
This is my view of what Debian's i18n community will build in the next 3
years or so.
A comment on existing projects:
Rewriting Pootle functionality is not NIH or reinventing the wheel. It's just
that Pootle is stuck with jToolkit, which has scarce documentation, so it's no
wonder it has only a few developers. Pootle developers also need to develop
jToolkit, so they are actually maintaining Pootle from the ground up. With a
more popular toolkit with more developers, e.g. Django or TurboGears (or Zope
for that matter), the software has more potential developers floating around;
it only needs to gain enough popularity. Pootle has been around for a long
time, but hasn't really succeeded in this area, because it is said to be
resource hungry and frankly it's just not functional enough. I haven't had an
actual chance to test its resource usage.
To make it a killer app, it needs to have enough features, have a functional
and intuitive interface and be fast. Fast software is always a joy to use.
And, quite importantly, it needs a catchy name that quickly gets to the top
of search engines.
TransDict (the Croatian software whose name Christian always forgets) could be
a better candidate, but unfortunately it's written in Perl, which is Greek to
me, so I can't judge.
This could be my proposal.
Kind regards,
Gasper Zejn