
Re: UTF-8, CJK and file size



Drew Parsons wrote:

> Being sensitive to potential accusations of Western
> imperialism, my question is whether this file size
> increase is something that Asian computer users have
> strong feelings about?  Will it be a major stumbling
> block hindering the acceptance of the UTF-8 encoding?
> Or is it a non-issue?

I am not very knowledgeable about Kanji or other East Asian
writing systems, so I could be way off base.  However, I
would not think it matters.

As I understand it, their character sets cannot be reduced
out of context to fewer than 256 unique symbols that cover
the vast majority of everyday language usage, so one byte
per character cannot be enough.  Were they to develop a
hardware/software platform from the ground up to maximize
the efficiency of "text" acquisition, processing, and
rendering, each character would still need more bits, and
the data files would still be dramatically more dense than
ASCII equivalents.
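
To put rough numbers on the per-character cost, here is a
minimal sketch in Python.  The function name encoded_sizes and
the sample strings are my own illustration, not anything from
this thread, and Shift_JIS is included only as one example of
a legacy two-byte encoding.

```python
# Count how many bytes the same text occupies in several encodings.
def encoded_sizes(text, encodings=("utf-8", "utf-16-le", "shift_jis")):
    return {enc: len(text.encode(enc)) for enc in encodings}

japanese = "日本語"    # three kanji
english = "Japanese"   # eight ASCII letters

print(encoded_sizes(japanese))  # {'utf-8': 9, 'utf-16-le': 6, 'shift_jis': 6}
print(encoded_sizes(english))   # {'utf-8': 8, 'utf-16-le': 16, 'shift_jis': 8}
```

For this sample, each kanji takes three bytes in UTF-8 against
two in the other encodings, while the three-kanji word still
comes out close in size to its eight-letter English rendering.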

Some languages are more wordy than others; e.g., I recall
hearing that the French translate "vacuum cleaner" literally
as "machine that cleans by sucking air"--not to pick on them,
but to point out that every language has its peculiar needs
and thus file space requirements.

It's not a matter of imperialism at all.  This density may
be an asset, not a liability.  They may be able to express
thoughts Westerners cannot effectively put into words.  The
apparent weakness may be a strengthening influence not
unlike left-handedness.

Your sensitivity is excellent and commendable, but I don't
think this is a significant matter.


