Re: default character encoding for everything in debian

On Mon, Aug 10, 2009 at 01:45:40PM +0200, Siggy Brentrup wrote:
> On Mon, Aug 10, 2009 at 13:09 +0200, Thomas Koch wrote:
> > Hi,
> > 
> > I ran into an issue: I forgot to set the character encoding of tomcat to
> > utf-8 after reinstalling a server.
> > Now, before I report a wishlist(?) bug to tomcat, I want to ask (and invite to 
> > discuss) shouldn't utf8 be the default character set everywhere? So when 
> > installing a package from Debian I can assume that where a character encoding 
> > can be set, it's set to utf8.
> > MySQL would be another example, which to my knowledge uses isoXYZ as default 
> > character encoding.
> While utf-8 covers the broadest set of characters possible, it
> suffers from size as well as performance penalties. Characters are no
> longer guaranteed to fit in a byte, so how do you define
> strlen(utf8_string), etc.?  All these issues have been solved, but not
> for free.

Of course there's a penalty for certain operations, such as counting or
indexing by character.  But for mostly-ASCII text, UTF-8 is about as
compact as an encoding that covers all of Unicode is going to get.
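
To put rough numbers on that, here is a minimal Python sketch.  The sample
strings and the sizes() helper are made up for illustration and are not
from any Debian package.  It compares the number of code points with the
encoded size in UTF-8, UTF-16 and UTF-32, which is also the gap behind the
strlen() question above:

```python
# Hypothetical sample strings: plain ASCII, German with two non-ASCII
# letters (written as escapes so the source stays ASCII), and Japanese.
SAMPLES = ["Debian", "Gr\u00f6\u00dfe", "\u65e5\u672c\u8a9e"]


def sizes(text):
    """Return (code points, UTF-8 bytes, UTF-16 bytes, UTF-32 bytes)."""
    return (
        len(text),                      # what a character count would report
        len(text.encode("utf-8")),      # what a byte-oriented strlen() would report
        len(text.encode("utf-16-le")),  # "-le" variants omit the byte-order mark
        len(text.encode("utf-32-le")),
    )


for sample in SAMPLES:
    print(sample, sizes(sample))
```

For ASCII the UTF-8 byte count equals the character count, and accented
Latin text grows only slightly.  CJK text is the case where UTF-8 comes
out larger than UTF-16, so "compact" here is a statement about typical
mostly-ASCII data, not every input.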

> There are a lot of users out there that are not willing to pay the price
> for increased generality.

These users will need to change their character encoding to something else.
But the Debian default should remain UTF-8.  Those not willing to accept the
flexibility/performance tradeoff are the exception, and will need to
customise their environment accordingly.


