
Re: Konqueror, UTF-8



Adrian 'Dagurashibanipal' von Bidder wrote:

> You may want to put a screenshot online somewhere - konqueror 3.1.1-1 with 
> lucida as default font seems to display the greek name just fine. Also, the 
> greek debian.org homepage seems to display fine.

The Debian Greek page looks fine to me too, but it's not encoded in UTF-8;
it's in ISO-8859-7. It seems like an encoding problem, not a font problem.
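
To illustrate why the encoding matters independently of the font, here is
a minimal Python sketch. The sample word is my own choice for
illustration, not text from the page in question. The same Greek text is
a different byte sequence in ISO-8859-7 and in UTF-8, and decoding one as
the other produces replacement characters no matter which font is used.

```python
# Hypothetical sample text, chosen only for illustration.
sample = "Ελληνικά"

iso_bytes = sample.encode("iso-8859-7")   # one byte per Greek letter
utf8_bytes = sample.encode("utf-8")       # two bytes per Greek letter

print(iso_bytes.hex(" "))
print(utf8_bytes.hex(" "))

# Decoding with the matching codec round-trips cleanly.
assert iso_bytes.decode("iso-8859-7") == sample
assert utf8_bytes.decode("utf-8") == sample

# Decoding ISO-8859-7 bytes as UTF-8 fails; with errors="replace" the
# decoder substitutes U+FFFD for the bytes it cannot interpret.
wrong = iso_bytes.decode("utf-8", errors="replace")
print(wrong)
```

This only shows one way replacement characters can arise; it does not
establish that this is what Konqueror is doing.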

> Another thing that might serve as a hint to the enlightened: if you copy-paste 
> from the konqueror window to some other unicode-enabled window (kwrite? 
> Mozilla mail composer?) do these chars appear?

Cut-and-paste from Konqueror to KWrite gives the same messed-up Greek that
Konqueror itself shows. Cut-and-paste from Mozilla Firebird gives the
correct Greek. Saving the pasted text as UTF-8 and hexdumping it shows
that each messed-up character is encoded as the UTF-8 bytes for U+FFFD,
the Unicode replacement character:

00000000: 46 72 2E 20  EF BF BD CE  BB CE B7 CE  B8 CE AD EF  Fr. ............
00000010: BF BD 2E 2E  CF 89 0A 46  72 2E 20 CE  91 CE BB CE  .......Fr. .....
00000020: B7 CE B8 CE  AD CF 85 CF  89 0A                     ..........

The first line (up to the newline at offset 0x16) is from Konqueror; the
rest is from Firebird. The UTF-8 sequence EF BF BD encodes U+FFFD, the
Unicode replacement character, which a decoder substitutes for input it
cannot interpret.
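
As a cross-check, the bytes from the hexdump can be decoded with a short
Python sketch. The variable names are mine; the bytes are copied from the
dump above.

```python
import unicodedata

# Bytes copied from the hexdump (ASCII column omitted).
dump = bytes.fromhex(
    "46 72 2E 20 EF BF BD CE BB CE B7 CE B8 CE AD EF"
    "BF BD 2E 2E CF 89 0A 46 72 2E 20 CE 91 CE BB CE"
    "B7 CE B8 CE AD CF 85 CF 89 0A"
)

# The first newline separates the Konqueror paste from the Firebird paste.
newline_offset = dump.index(b"\n")
konqueror_line, firebird_line = dump.decode("utf-8").split("\n")[:2]

print(hex(newline_offset))
print(konqueror_line)
print(firebird_line)

# List every code point in the Konqueror line with its Unicode name.
for ch in konqueror_line:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")
```

The whole dump is valid UTF-8, so the damage is not at the byte level:
the Konqueror line contains U+FFFD as a properly encoded character, which
means the original characters were already lost before the text reached
the clipboard.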

So this suggests that either Konqueror is mangling the page's text, or
the server is sending it something different from what it sends to
Firebird, which I suppose is possible based on request headers or
something similar. I've tried using Konqueror's "lie about browser
identity" feature to pretend to be Mozilla or IE, but it makes no
difference.
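
One way to test the server side without involving either browser would be
to fetch the page twice with different User-Agent headers and compare the
raw bytes. The following is only a sketch: the User-Agent strings are
made up for illustration, and the URL has to be supplied by whoever runs
it.

```python
import hashlib
import urllib.request

# Hypothetical User-Agent strings, for illustration only.
USER_AGENTS = {
    "konqueror": "Mozilla/5.0 (compatible; Konqueror/3.1; Linux)",
    "firebird": "Mozilla/5.0 (X11; U; Linux i686; en-US) Gecko Firebird",
}


def build_request(url, user_agent):
    """Build a GET request that announces the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_summary(url, user_agent):
    """Return (Content-Type header, SHA-256 of the raw body) for one fetch."""
    with urllib.request.urlopen(build_request(url, user_agent)) as resp:
        body = resp.read()
        return resp.headers.get("Content-Type"), hashlib.sha256(body).hexdigest()


def compare(url):
    """Fetch the URL once per User-Agent and report whether the results match."""
    results = {name: fetch_summary(url, ua) for name, ua in USER_AGENTS.items()}
    for name, (ctype, digest) in results.items():
        print(name, ctype, digest)
    return len(set(results.values())) == 1
```

If compare() returns True for the page, the server is sending the same
bytes and the same Content-Type regardless of User-Agent, which would
point back at the browser. Other request headers, such as Accept-Charset,
could still differ between the real browsers and are not covered here.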

Thanks,

Craig
