Re: lists.debian.org de-localization (Re: automatically-generated ISO-8859-1 characters in multibyte webpages)

>>>>> "Marco" == Marco d'Itri <md@Linux.IT> writes:

    Marco> It would be *MUCH* better to just refuse these
    Marco> messages. Most of them are spam anyway.  At least in my
    Marco> country (and in all western europe, I think) raw latin-1
    Marco> characters in headers are never found outside of spam
    Marco> messages.

He did say "Russian."  On xemacs-users-ru, which is dedicated to
Russian-language posts, about half the users use RFC 2047 encoded-words,
and the rest are split evenly between ASCII-only and 8-bit Cyrillic.
"Raw Cyrillic in headers" is used by some of the more sophisticated
users, too, surprisingly enough.

This is a fairly small sample (about 100 subscribers, 25 regular
posters).  However, the Russian spam I've seen (isn't it funny how you
can identify spam even though you can't read the language it's written
in?) invariably fails either the addressee tests (implicit, too many),
the known spam software test, or the HTML-only test.  So (FWIW) I've
disabled the 8-bit test and so far the Russian subscribers are happy.

I will also say I've seen a fair amount of dumbquotes from MS-encumbered
posters, and the occasional accented Latin character from French and
German posters (those are quite rare, though not nonexistent).

    Marco> /^Subject: .*[^[:print:]]{8}/   REJECT Your mailer is not \
    Marco> RFC 2047 compliant

If you're going to do that, 8 is probably too many: SPC is not an
8-bit character, so every word break ends a run, and only 8 or more
consecutive 8-bit bytes would match.  I find 3 works well.  Also, the
reason should be failure to comply with RFC 2822.  AFAIK 2047 does not
prohibit 8-bit characters; it simply provides a mechanism to encode
them in environments where they are prohibited.
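
To illustrate why the run length matters, here is a minimal Python
sketch.  It is not the Postfix check itself: the function name
has_raw_8bit_run and its min_run parameter are made up for this example,
and it only looks for bytes >= 0x80, whereas [^[:print:]] would also
catch control characters.  The sample Subject line is likewise invented.

```python
import re
from email.header import Header


def has_raw_8bit_run(header_line: bytes, min_run: int = 3) -> bool:
    """Return True if header_line contains min_run or more consecutive
    bytes with the high bit set (>= 0x80)."""
    pattern = re.compile(rb"[\x80-\xff]{%d,}" % min_run)
    return pattern.search(header_line) is not None


# Raw KOI8-R Cyrillic: each letter is a single 8-bit byte, but the space
# between the words is plain ASCII, so the longest run is 6 bytes.
raw = "Subject: Привет мир".encode("koi8-r")

# The same text as an RFC 2047 encoded-word is pure ASCII.
encoded = ("Subject: " + Header("Привет мир", "koi8-r").encode()).encode("ascii")

print(has_raw_8bit_run(raw, min_run=8))  # False: no run reaches 8 bytes
print(has_raw_8bit_run(raw, min_run=3))  # True: "Привет" is a run of 6
print(has_raw_8bit_run(encoded))         # False: encoded-words are ASCII
```

With a threshold of 8 the raw header slips through because the space
splits it into runs of 6 and 3; a threshold of 3 catches it, while the
encoded-word form passes either way.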

Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
