Nick Demou wrote:
<disclaimer>ROUGH EXPLANATIONS</> When one writes text in a text editor, the editor must store it on disk as a series of numbers (for example, ABC becomes 65,66,67). This is called encoding the text. When your browser renders that text on the screen, it must convert the series of numbers back to glyphs of letters (for example, 65,66,67 is presented as ABC). This is called decoding. For this to work, the two programs (text editor and browser) must agree on the same conversion rules (for example A<->65, B<->66, ...).
I am familiar with the above.
This is where everything gets messed up, because there is more than one possible set of encoding rules, and there is a web server, a database server, a lot of programmers and sysadmins, and heaven knows what else in between the two programs. You, the user, must then try a few possible encodings and see what works. Not too difficult: just use the view->encoding menu. Still, it is annoying.
Right.
In the case of this page, the text is really encoded as ISO-8859-1 (as you can find out by manually selecting this encoding, after which everything displays properly), but the HTML code reports that its text is encoded as UTF-8 (as you can see in the first lines of the HTML source: content="text/html; charset=utf-8"; you can view the source with menu->view->page source). So it's a problem that only time.com can solve properly.
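The mismatch can be reproduced in a few lines of Python. This is only a sketch: the sample word is an assumption chosen because it contains a non-ASCII letter, not text taken from the page in question.

```python
# Hypothetical sample text with one non-ASCII character (e with acute accent).
original = "caf\u00e9"

# The page's bytes are really ISO-8859-1: the accented letter is the single byte 0xE9.
stored = original.encode("iso-8859-1")

# The browser trusts the declared charset (UTF-8). A lone 0xE9 is not valid
# UTF-8, so the character is replaced with U+FFFD.
wrong = stored.decode("utf-8", errors="replace")

# Manually selecting ISO-8859-1 applies the rule that was actually used.
right = stored.decode("iso-8859-1")

print(repr(wrong), repr(right))
```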
For a moment, pretend that I am the person responsible for doing that (HTML programmer or HTML editor or whatever). What would I do to resolve this?
My guess: use an HTML editor that supports UTF-8? Then the tag in the web page, content="text/html; charset=utf-8", would specify the encoding, the editor would save the characters in the proper encoding, and my UTF-8-enabled browser should show the characters exactly as they were typed(?)
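One way to make that guess concrete, as a sketch only: whatever tool is used, the stored bytes have to end up as real UTF-8 so that they match the declared charset. The helper name and the sample bytes below are hypothetical. The other option would be to leave the bytes alone and change the declaration to ISO-8859-1.

```python
def reencode_latin1_to_utf8(data: bytes) -> bytes:
    """Convert bytes that are really ISO-8859-1 into UTF-8 bytes (hypothetical helper)."""
    return data.decode("iso-8859-1").encode("utf-8")


# Hypothetical page content, stored as ISO-8859-1.
page = "caf\u00e9".encode("iso-8859-1")

fixed = reencode_latin1_to_utf8(page)

# A browser decoding the fixed bytes as UTF-8 now gets the text as typed.
print(fixed.decode("utf-8"))
```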
->HS