
Re: DDTP: Please remove pl_PL and merge it with pl

On Sat, Jul 14, 2007 at 01:54:20PM +0200, Jens Seidel wrote:
> On Sat, Jul 14, 2007 at 03:01:13AM -0500, Ming Hua wrote:
> > Chinese must be separated as (at least) zh_CN and zh_TW, as although
> > they are the same language, they use different scripts.  I have no idea
> > what script the current zh translations are, is there an easy way to
> > review them?  I'll do that and put them into correct category.
> > 
> > Also please make sure DDTP/DDTSS do not accept ambiguous zh translations
> > anymore.
> Can you please check http://ddtp.debian.net/debian/dists/sid/main/i18n/
> and merge zh translation manually with either zh_CN or zh_TW (via mail
> interface), we cannot do this without knowledge of the language and
> don't want to drop all 22 translations.

Of course, I wasn't suggesting dropping them.  I've now downloaded the
Translation-zh file and found that they are all simplified (zh_CN)
translations, so they should be merged into zh_CN.  I'll do this via the
email interface in the coming days (I need to learn how to use DDTP
first), and it's safe to stop accepting zh translations now.

> But are you really sure that it is not possible to convert a common
> Chinese translation into zh_CN AND zh_TW?

I'm so glad that you brought this up again.  I was reading the thread
"Re: DDTP - please activate support for pt" yesterday and found that you
had mentioned Chinese translation there.  I wanted to reply, only to
realize your mail was from February.

> Please note that this is done
> by the Debian website, there is only a single translation but multiple
> output encodings!?

I know that for the website translation, zh_CN and zh_TW pages are
generated from a single source file.  However, it's not exactly a single
translation: the source wml file supports the syntax
"[CN:foo][HKTW:bar]", so that the generated page uses "foo" in the zh_CN
html and "bar" in the zh_TW html.  It's quite a maintenance hassle.
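
To make the tag mechanism concrete, here is a minimal Python sketch of
how such alternatives could be expanded.  It is only an illustration:
the real website build uses wml, not this code, and the function name
render, the regular expression, and the sample sentence are all my own
assumptions.

```python
import re

# Hypothetical pattern modeled on the "[CN:foo][HKTW:bar]" syntax above.
# Assumption: tag bodies never contain a "]" character.
TAG = re.compile(r"\[(CN|HKTW):([^\]]*)\]")

def render(source: str, variant: str) -> str:
    """Keep the body of every tag that matches `variant`, drop the rest."""
    return TAG.sub(
        lambda m: m.group(2) if m.group(1) == variant else "",
        source,
    )

# Sample text (made up): "software" is usually written differently
# in the two locales, so the translator supplies both alternatives.
source = "Debian [CN:软件][HKTW:軟體] packages"

print(render(source, "CN"))    # Debian 软件 packages
print(render(source, "HKTW"))  # Debian 軟體 packages
```

Every place where the two variants differ needs such a pair of tags in
the source, which is where the maintenance cost comes from.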

Also, the difference between the scripts is not only one of encoding
(both zh_CN and zh_TW translations can use, and prefer, UTF-8 nowadays,
at least in the open source world).  Encoding is not even the main part
of the conversion.  The website's wml source is usually written in zh_CN
or zh_TW, depending on the preference of the main translator (as wml
doesn't support UTF-8, IIRC), and the conversion to the other script is
then done by a third-party program (iconv is not really good enough for
this task).  I think the tools in the zh-autoconvert package are
currently used.  The result, if not touched up by a translator of the
other script (by adding more [CN:foo][HKTW:bar] alternative tags), will
read awkwardly most of the time, and will sometimes even be confusing.
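
A small Python sketch of why this is not an encoding problem.  The
one-entry character table below is an assumption made up for the
example; it is not the data used by zh-autoconvert or any other real
tool.

```python
simplified = "软件"  # "software" as usually written in zh_CN

# Re-encoding changes the bytes, but the characters stay simplified.
as_utf8 = simplified.encode("utf-8")
as_gb = simplified.encode("gb18030")
assert as_utf8 != as_gb
assert as_utf8.decode("utf-8") == as_gb.decode("gb18030") == simplified

# Illustrative character-level table (hypothetical, one entry only).
CHAR_MAP = {"软": "軟"}

def convert_chars(text: str) -> str:
    """Replace each character by its traditional form, if one is listed."""
    return "".join(CHAR_MAP.get(ch, ch) for ch in text)

char_level = convert_chars(simplified)
print(char_level)  # 軟件

# The usual zh_TW term differs in vocabulary, not only in script.
usual_zh_tw = "軟體"
print(char_level == usual_zh_tw)  # False
```

A per-character mapping gets the script right, but the word itself can
still be the wrong one for the target audience, which is the kind of
result a translator has to touch up by hand.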

So you see, generating both zh_CN and zh_TW translations from a single
source is not really ideal.  IMHO the maintenance hassle, as well as
the suboptimal results, is one of the reasons that the Chinese website
translations have been stagnant in recent years.

> Could someone please explain this? Why waste time for two
> encodings/scripts if one is sufficient?

So in short, it's not an encoding issue.  I only know English and
Chinese, but I suspect the difference is on par with that between nn
(Norwegian Nynorsk) and nb (Norwegian Bokmål).  A single translation is
not possible.  A single source is possible, but inconvenient and
suboptimal.
Both zh_CN translators and zh_TW ones (though I can't speak for them)
would prefer separate translations.

Hope this makes things a bit clearer.

