
Re: [gopher] CAPS capability: ServerDefaultCharset



On 01/03/2015 12:39 PM, Nuno Silva wrote:
Improperly rendered UTF-8 will easily become unreadable[1], which is my
main problem when mixing encodings. By "unreadable" I mean that you
can't get the meaning of the text.

Yes, but again, I had in mind people who *already* use UTF-8 in gopherspace, not a mass conversion of existing content. In this situation, such a CAPS setting can only help and does no harm (worst case: the gopher client ignores CAPS and renders the content exactly as it does today).
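To illustrate the "can only help" argument, here is a minimal sketch of how a client might honour such a setting. It assumes the caps file is a list of key=value lines with # comments, and that the key is spelled ServerDefaultCharset as in the subject of this thread; the function names and the ISO-8859-1 client default are hypothetical.

```python
def parse_caps(text):
    """Collect key=value pairs from a caps-style body (assumed format)."""
    caps = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without a key=value shape.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        caps[key.strip()] = value.strip()
    return caps


def pick_charset(caps_text, client_default="iso-8859-1"):
    """Use the server's declared charset if present, else the client's own."""
    if caps_text is None:  # server has no caps file, or the client ignores it
        return client_default
    return parse_caps(caps_text).get("ServerDefaultCharset", client_default)
```

A server that declares nothing, or a client that never looks, ends up on the same code path as today, which is the worst case described above.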

Several languages require characters that are not part of ASCII,
including Finnish, Spanish, French and Portuguese.

And Polish, and many others. But these are "soft" problems: at least the Latin characters come out right, so the text can still be read. Now try to read any Cyrillic-based language (Ukrainian, Russian, Bulgarian...): there, *every* character is scrambled.
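The difference between the "soft" and the "hard" case is easy to reproduce. The sketch below mis-decodes UTF-8 bytes as ISO-8859-1 and measures how many letters survive; the helper names and the two sample phrases are mine, chosen only for illustration.

```python
def misrender(text):
    """Show UTF-8 bytes the way a client assuming ISO-8859-1 would."""
    return text.encode("utf-8").decode("iso-8859-1")


def readable_fraction(text):
    """Fraction of letters that survive, i.e. the ones that are plain ASCII."""
    letters = [c for c in text if c.isalpha()]
    intact = [c for c in letters if c.isascii()]
    return len(intact) / len(letters)


polish = "Dzień dobry"
russian = "Добрый день"

print(misrender(polish), readable_fraction(polish))
print(misrender(russian), readable_fraction(russian))
```

In the Polish phrase only the single accented letter is damaged, while the Russian phrase keeps no letter intact.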

Are there any gopher clients that try to autodetect whether the text is
utf8 or ISO-8859?

None that I know of.

(IF that's even possible without false positives - I guess it's easier with ISO-8859-1...)

On the contrary, UTF-8 is much easier to identify, since its multi-byte sequences follow clearly defined bit patterns. Detecting any 8-bit charset is a mess, as it requires statistical analysis of the content.
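As a rough sketch of what checking those bit patterns means: every lead byte announces how many continuation bytes (10xxxxxx) must follow, and text in an 8-bit charset almost never satisfies that by accident. The function names and the ISO-8859-1 fallback are assumptions for the example, and this structural check is looser than a full validator (it does not reject every overlong or surrogate encoding).

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if every byte fits the UTF-8 lead/continuation pattern."""
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:               # 0xxxxxxx: plain ASCII
            follow = 0
        elif 0xC2 <= b <= 0xDF:    # 110xxxxx: one continuation byte
            follow = 1
        elif 0xE0 <= b <= 0xEF:    # 1110xxxx: two continuation bytes
            follow = 2
        elif 0xF0 <= b <= 0xF4:    # 11110xxx: three continuation bytes
            follow = 3
        else:                      # stray continuation or invalid lead byte
            return False
        if i + follow >= n:        # sequence truncated at end of data
            return False
        for k in range(1, follow + 1):
            if data[i + k] & 0xC0 != 0x80:   # must be 10xxxxxx
                return False
        i += follow + 1
    return True


def guess_charset(data: bytes, fallback: str = "iso-8859-1") -> str:
    """Pure ASCII needs no guess; otherwise prefer UTF-8 when the bytes fit."""
    if all(b < 0x80 for b in data):
        return "ascii"
    return "utf-8" if looks_like_utf8(data) else fallback
```

False positives remain possible: the ISO-8859-1 bytes for "Ã©" are also a valid UTF-8 sequence. In ordinary text such combinations are rare, though, whereas telling ISO-8859-1 from ISO-8859-2 gets no help from the byte structure at all.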

Mateusz




