Bug#99324: Default charset should be UTF-8
On 30-May-01, 22:25 (CDT), Cesar Eduardo Barros <firstname.lastname@example.org> wrote:
> > - Making sure everything works with UTF-8 charset
On Fri, Jun 01, 2001 at 01:38:32PM -0500, Steve Greenland wrote:
> Does this mean, for example, that cron and crontab would have to be
> recoded to support wide or multibyte characters?
"works with UTF-8" doesn't mean the same thing as "recoded to support
wide or multibyte characters".
For programs with no relevant text manipulation facilities (cron and
crontab), it's sufficient that UTF-8 is not mutilated. [UTF-8 is
designed, remember, to be represented as a sequence of 8-bit bytes.]
[For example, the Linux kernel currently defaults to UTF-8 -- which
means almost nothing in most contexts.]
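
To illustrate what "not mutilated" means, here is a minimal sketch in
Python. It is not cron's actual code; the sample line and the
split_fields helper are hypothetical. It splits a crontab-style line on
ASCII whitespace while treating the line purely as bytes, and the UTF-8
text in the command comes out untouched, because no byte of a multibyte
UTF-8 sequence can be mistaken for an ASCII space.

```python
# Hypothetical crontab-style line with non-ASCII text, held as raw bytes.
line = "0 5 * * * echo 'café ñandú'\n".encode("utf-8")


def split_fields(raw: bytes, count: int = 5):
    """Split off the first `count` whitespace-separated fields, byte-wise.

    No decoding happens here: the function never needs to know which
    character set the bytes are in.
    """
    parts = raw.rstrip(b"\n").split(None, count)
    return parts[:count], parts[count]


fields, command = split_fields(line)

# The time fields are plain ASCII; the command still decodes as valid UTF-8.
print(fields)
print(command.decode("utf-8"))
```

A program written this way needs no wide-character support at all; it
only has to avoid stripping or rewriting bytes above 0x7F.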
> If so, I object to making this a requirement.
I also object to any "requirements" which break existing software.
I don't see that this has much to do with what's being discussed here.
> I also wonder about the performance impact, and the size impact
> (although I understand that UTF-8 uses single byte for ascii equivalent,
> so that shouldn't be much, right?)
Correct. If someone puts UTF-8 into an error message or onto a command
line, that message or line will be different from what it was before.
That is all.
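
As a rough check of the size question, the sketch below (Python; the
sample strings are made up for illustration) compares encoded lengths.
Pure ASCII text is byte-for-byte identical in UTF-8, and an accented
Latin-1 character costs one extra byte.

```python
# Hypothetical sample strings, chosen only for illustration.
ascii_text = "0 5 * * * /usr/bin/backup"
accented = "café"

# ASCII text encodes to exactly the same bytes in ASCII and in UTF-8.
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# 'é' takes one byte in ISO-8859-1 and two bytes in UTF-8.
utf8_len = len(accented.encode("utf-8"))
latin1_len = len(accented.encode("iso-8859-1"))

print(same_bytes, utf8_len, latin1_len)
```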
The trick isn't UTF-8. The trick is incompatible character sets (a
problem we already have with national language character sets).
The actual pain in making UTF-8 the default character set would be
felt by those people who rely on the current default (which in some
contexts is ISO-8859-1, while in other contexts the character set is
explicitly chosen, so there really isn't a default to worry about).
Any other problem, we already have.
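
The mismatch problem can be sketched in a few lines of Python (the
sample word is arbitrary). The same text yields different bytes under
ISO-8859-1 and UTF-8, and reading one as the other either garbles the
text or fails outright. That happens between any two incompatible
character sets, not just these two.

```python
text = "café"  # arbitrary sample word

utf8_bytes = text.encode("utf-8")         # b'caf\xc3\xa9'
latin1_bytes = text.encode("iso-8859-1")  # b'caf\xe9'

# UTF-8 bytes read as ISO-8859-1: every byte maps to some character,
# so the result is silently garbled rather than rejected.
misread = utf8_bytes.decode("iso-8859-1")

# ISO-8859-1 bytes read as UTF-8: a lone 0xE9 is not a valid sequence,
# so a strict decoder raises an error.
try:
    latin1_bytes.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True

print(repr(misread), strict_decode_failed)
```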