
Re: Windows character sets and Linux



* Paul M Foster (paulf@quillandmouse.com) [031201 20:21]:
> I've often seen webpages where certain characters (primarily things like 
> apostrophes, quotes and such) show as '?' under Linux. I believe this is 
> a problem with character sets in Windows versus Linux. I'm assuming that 
> if I include the proper "locale" in Linux, this problem will go away. 
> Does anyone know how to solve this, and what the character set is which 
> Windows uses (in American English)?

Most often this is because the author of that web page is sending you
garbage.  Windows uses "codepages", and the one most commonly used is
1252, sometimes written as "windows-1252" or "CP1252".  This codepage
coincides with ASCII for bytes 0x00-0x7F and with ISO-8859-1 for bytes
0xA0-0xFF, but not for 0x80-0x9F: ISO-8859-1 puts control characters
there, while windows-1252 puts printable characters there, including
the curly quotes and apostrophes that Windows editors like to insert.
So when you get a page whose Content-Type claims ISO-8859-1 with
windows-1252 apostrophes in it, they render as the garbage that they
are.  Basically, the problem is on their side -- I'd call it a problem
with the web site files themselves; the site maintainer should either
fix the encoding of the files or properly tell their web server what
encoding those files use so that it can send an accurate Content-Type.
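
To make the mismatch concrete, here is a small Python sketch (my own
illustration, with made-up sample bytes, not anything taken from a real
site).  It decodes the same bytes once as windows-1252 and once as
ISO-8859-1; byte 0x92 is where windows-1252 keeps its curly apostrophe.

```python
# Hypothetical sample: "It's" as saved with a windows-1252 curly
# apostrophe, which is stored as the single byte 0x92.
raw = b"It\x92s"

# Decoded the way the author's editor meant it.
as_cp1252 = raw.decode("cp1252")

# Decoded the way a "charset=ISO-8859-1" Content-Type tells the
# browser to do it.
as_latin1 = raw.decode("latin-1")

# ascii() escapes non-ASCII characters so the difference is visible
# on any terminal.
print(ascii(as_cp1252))  # 'It\u2019s'  (RIGHT SINGLE QUOTATION MARK)
print(ascii(as_latin1))  # 'It\x92s'    (an unprintable C1 control)
```

The second result is a control character with no glyph, which is why a
browser that believes the header has nothing sensible to draw there.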

In the meantime, since I wouldn't hold my breath waiting for all of
those sites to be fixed, Mozilla has a "Character Coding" menu item on
its View menu which allows you to manually select a character coding to
override the automatic selection (which usually just means whatever the
web server claims it is).  Try selecting "Western (windows-1252)" and
those apostrophes should render correctly.
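
If you have already saved such a page as text, roughly the same
override can be done by hand.  The sketch below is only an assumption
about how one might do it in Python: the function name
redecode_as_cp1252 is made up for illustration, and it assumes the text
was decoded as ISO-8859-1 in the first place.

```python
def redecode_as_cp1252(text: str) -> str:
    """Reinterpret text wrongly decoded as ISO-8859-1 as windows-1252.

    Hypothetical helper, not part of any library.
    """
    # ISO-8859-1 maps code points 0-255 one-to-one onto bytes, so
    # encoding with it recovers the bytes the server actually sent.
    raw = text.encode("latin-1")
    # A few byte values (0x81, for one) are undefined in Python's
    # cp1252 codec; errors="replace" turns them into U+FFFD instead
    # of raising UnicodeDecodeError.
    return raw.decode("cp1252", errors="replace")


garbled = "Don\x92t panic"  # made-up sample text
fixed = redecode_as_cp1252(garbled)
print(ascii(fixed))  # 'Don\u2019t panic'
```

Note that the encode step raises UnicodeEncodeError if the text holds
any character above U+00FF, which would mean it was not decoded as
ISO-8859-1 to begin with.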

good times,
Vineet
-- 
http://www.doorstop.net/
-- 
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."  --Benjamin Franklin


