Re: default character encoding for everything in debian



On Mon, 10 Aug 2009 13:45:40 +0200
Siggy Brentrup <debian@psycho.i21k.de> wrote:

> On Mon, Aug 10, 2009 at 13:09 +0200, Thomas Koch wrote:
> > Hi,
> > 
> > I've an issue, that I forgot to set the character encoding of
> > tomcat to utf-8 after reinstalling a server.
> > Now, before I report a wishlist(?) bug to tomcat, I want to ask
> > (and invite to discuss) shouldn't utf8 be the default character set
> > everywhere? So when installing a package from Debian I can assume
> > that where a character encoding can be set, it's set to utf8.
> > MySQL would be another example, which to my knowledge uses isoXYZ
> > as default character encoding.
> 
> While utf-8 covers the broadest set of character glyphs possible, it
> suffers from size as well as performance penalties. Characters no
> longer are guaranteed to fit in a byte, how do you define
> strlen(utf8_string) &c pp.  All these issues have been solved but not
> for free.
> 
> There are a lot of users out there that are not willing to pay the
> price for increased generality.

Don't you mean s/users/programmers/? As a user I don't see what price I
pay. I only see advantages in having a consistent encoding. Which,
btw., doesn't have to be UTF-8. In an ideal world every programme would
adhere to LC_CTYPE. But if the encoding has to be configured then I
would also prefer UTF-8 as the default.
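
To make "adhere to LC_CTYPE" a bit more concrete, here is a minimal sketch in
C of how a programme could ask the environment for its encoding instead of
hard-coding one. setlocale() and nl_langinfo(CODESET) are standard POSIX
calls; the helper names user_codeset() and codeset_is_utf8() are made up for
this illustration, and the exact codeset string returned (e.g. "UTF-8") may
differ between systems.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: name of the character encoding the user asked for. */
const char *user_codeset(void)
{
    /* "" means: take the setting from LC_ALL / LC_CTYPE / LANG. */
    if (setlocale(LC_CTYPE, "") == NULL)
        setlocale(LC_CTYPE, "C");   /* environment names an unknown locale */

    return nl_langinfo(CODESET);
}

/* Hypothetical helper: does the codeset name denote UTF-8?
 * Assumption: the system spells it exactly "UTF-8", as glibc does. */
int codeset_is_utf8(const char *codeset)
{
    assert(codeset != NULL);
    return strcmp(codeset, "UTF-8") == 0;
}
```

A programme that calls user_codeset() once at startup and then uses the
locale-aware functions (mbstowcs() and friends) would follow whatever the
user configured, UTF-8 or not.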

Of course, for the programmer there might be a price to pay. And if
he's not willing to pay it, he can't be forced, anyway.

Or do you mean the user pays the price, because if the encoding is set
to UTF-8 then performance would suffer? In that case, I'd love to see
some real life numbers. I doubt the difference would be noticeable. 
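
This is no substitute for real measurements, but as a rough sketch of what
the strlen() question amounts to: strlen() keeps counting bytes, and counting
characters in UTF-8 takes one extra linear pass. The helper name
utf8_codepoints() is made up for this illustration, and it assumes the input
is already valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of code points in a valid UTF-8 string.
 * Every code point has exactly one byte that is not a continuation
 * byte (continuation bytes look like 10xxxxxx), so count those. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

For plain ASCII such as "hello", strlen() and utf8_codepoints() agree, so the
encoding costs no extra bytes there. For "gr\xc3\xbc\xc3\x9f" "e" (the word
"grüße"), strlen() reports 7 bytes while utf8_codepoints() reports 5
characters.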

Cheers,
harry
