
Re: accented chars. shown as question marks in black diamonds in mozilla



Florian Kulzer wrote:


What I meant was this: Your utf-8 setup (combined with using the proper
fonts) is able to encode and display umlauts, accented characters,
characters for Slavic languages, Scandinavian, Russian, Greek, (some)
Asian characters, etc. This is in contrast to, say, someone using an
iso-8859-1 locale who cannot display many of these "foreign" characters.
(Unless s/he uses an application which can work around the limitations
of the system's encoding, for example LaTeX.)

The problem is that a webpage has to tell your browser which encoding it
uses to transmit the characters. If the browser has to guess, things can
go wrong. In your case iceape guessed that the page was encoded in utf-8,
which goes wrong for many characters outside the standard us-ascii set.
Once you told your browser that the page was in iso-8859-1 it could
decode it properly. The root of the problem is that the character "é"
(the accented e) exists in both utf-8 and iso-8859-1 but is stored as
different bytes in the two encodings: the single byte 0xE9 in
iso-8859-1, and the two bytes 0xC3 0xA9 in utf-8.
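
The mismatch described above can be reproduced in a few lines of Python.
This is only a sketch: the word "café" is an assumed stand-in for the
text on the page, and Python's decoder stands in for the browser.

```python
# Assumption: "café" stands in for the accented text on the real page.
text = "café"

# The same character is stored as different bytes in the two encodings.
latin1_bytes = text.encode("iso-8859-1")  # é becomes the single byte 0xE9
utf8_bytes = text.encode("utf-8")         # é becomes the two bytes 0xC3 0xA9

# The page was sent as iso-8859-1, but the reader guesses utf-8.
# A lone 0xE9 is not valid utf-8, so it is replaced by U+FFFD,
# which many fonts draw as a question mark in a black diamond.
wrong_guess = latin1_bytes.decode("utf-8", errors="replace")

# Telling the reader the correct encoding recovers the text.
right_guess = latin1_bytes.decode("iso-8859-1")

print(latin1_bytes, utf8_bytes)
print(repr(wrong_guess), repr(right_guess))
```

Choosing iso-8859-1 by hand in the browser's encoding menu corresponds
to the second decode call.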

Ah, that makes complete sense!


I was assuming they should have used UTF-8 along with the language tags around that word. I might be mistaken though.

This might work if they encoded that word in utf-8. Since they decided
to use iso-8859-1 throughout the document, they could simply have
included

<meta http-equiv="CONTENT-TYPE" content="text/html; charset=iso-8859-1">

in the <head> section of the HTML.
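
As a rough illustration of how a reader of the page could make use of
such a declaration, here is a sketch using Python's standard
html.parser module. The names MetaCharsetParser and declared_charset and
the sample page are made up for this example; real browsers follow a
more elaborate procedure, and a charset given in the HTTP Content-Type
header takes precedence over the meta tag.

```python
from html.parser import HTMLParser


class MetaCharsetParser(HTMLParser):
    """Hypothetical helper: remembers the charset from the first
    <meta http-equiv="Content-Type" ...> tag it sees."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = dict(attrs)  # attribute names arrive lowercased
        if (attr.get("http-equiv") or "").lower() != "content-type":
            return
        # content looks like "text/html; charset=iso-8859-1"
        for part in (attr.get("content") or "").split(";"):
            key, _, value = part.strip().partition("=")
            if key.strip().lower() == "charset" and value:
                self.charset = value.strip().lower()


def declared_charset(raw: bytes):
    """Return the charset named in the page's meta tag, or None."""
    parser = MetaCharsetParser()
    # The markup itself is plain ASCII, so a byte-preserving decode
    # (iso-8859-1 maps every byte to a character) is enough to scan it.
    parser.feed(raw.decode("iso-8859-1"))
    return parser.charset


# Made-up sample page, sent as iso-8859-1 (note the raw 0xE9 byte).
page = (
    b"<html><head>"
    b'<meta http-equiv="CONTENT-TYPE" content="text/html; charset=iso-8859-1">'
    b"</head><body>caf\xe9</body></html>"
)

charset = declared_charset(page)
# Fall back to a utf-8 guess only when nothing was declared.
decoded = page.decode(charset or "utf-8", errors="replace")
print(charset, decoded)
```

With the declaration present no guessing is needed; without it, the
fallback guess here would show the replacement character again.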

Thanks for your excellent explanation.
regards,
->HS





