
Re: [Kind of OT] Why's this look like gibberish to me?



Hi,

On Sat, Apr 26, 2008 at 03:57:19PM +0200, s. keeling wrote:
> Dotan Cohen <dotancohen@gmail.com>:
> > 
> >  Why are you against switching to UTF-8? Disk space? There really is
> >  no other disadvantage, and even the disk space argument doesn't
> >  count for much unless your drive is mostly uncompressed text files.
> 
> Why would I be _for_ switching?  I'm a unilingual Anglophile.  utf-8
> would gain me nothing.  I'm glad utf-8 (et al) finally exists for
> those of you who can use it or need it.  However, it's irrelevant
> here.  I only know English, and can puzzle out some words in other
> related western European languages.
> 
> I'd guess my $HOME probably is mostly uncompressed text, source and
> documentation.

If you use only ASCII characters in a UTF-8 encoded text file, it is
exactly the same as an ASCII file, byte for byte, in both size and
content.

Only when the text contains non-ASCII ("alien") characters does UTF-8
use its special multi-byte sequences.
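
A quick way to check this yourself is the minimal Python sketch below.
The sample strings are only my own illustrations, not anything from your
files:

```python
# Pure ASCII text: the ASCII and UTF-8 encodings give identical bytes.
ascii_text = "plain English text"
ascii_bytes = ascii_text.encode("ascii")
utf8_bytes = ascii_text.encode("utf-8")
print(ascii_bytes == utf8_bytes)        # same bytes, same size

# Text with two non-ASCII characters (i-diaeresis and e-acute):
# only those two characters grow to multi-byte sequences.
mixed_text = "na\u00efve caf\u00e9"
mixed_bytes = mixed_text.encode("utf-8")
print(len(mixed_text), "characters ->", len(mixed_bytes), "bytes")
```

So a $HOME full of English source and documentation would not grow by a
single byte.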

If you are worried that support for non-ASCII characters bloats data
size, UTF-8 encoded data is not the cause.  It is the fixed-width
encodings such as UCS-4, used to represent text inside programs, that
tend to bloat an application's memory consumption.  This memory cost is
incurred even if you use the "C" locale environment.
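
To put a rough number on the fixed-width cost, here is a small Python
sketch.  It assumes Python's "utf-32-le" codec as a stand-in for UCS-4
(for ordinary characters the two give the same 4 bytes per character),
and the sample string is again just an illustration:

```python
text = "plain English text"

# Variable-width encoding: 1 byte per ASCII character.
utf8_size = len(text.encode("utf-8"))

# Fixed-width encoding: 4 bytes per character, even for plain ASCII.
# "utf-32-le" is used so that no byte-order mark is prepended.
ucs4_size = len(text.encode("utf-32-le"))

print(utf8_size, "bytes as UTF-8")
print(ucs4_size, "bytes as UCS-4/UTF-32")
```

The factor of four applies to ASCII-only text too, which is why the cost
does not depend on which locale you run in.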

http://en.wikipedia.org/wiki/Universal_Character_Set
http://people.debian.org/~osamu/pub/getwiki/html/ch09.en.html#thelocale

Osamu

