Re: On some broken Japanese mails

On July 16, 2006 at 2:00PM +0900,
cwryu (at debian.org) wrote:

> Japanese mails have always been more correctly encoded than ones in
> other languages.  But recently I have received several Japanese spams
> that appear broken in evolution and thunderbird.  These mails have
> RFC 2047 headers labeled as ISO-2022-JP, but the actual contents are
> in Shift_JIS.  For example:
> =?iso-2022-jp?B?glKCT5VigsWPgJT1iq6XuQ==?=  
> ("echo glKCT5VigsWPgJT1iq6XuQ== | mimencode -u -b" to decode it)
> I wonder (1) what program generates this wrongly encoded header.  Or
> (if the program is just an ignorable spam bot)

I think a header like this almost certainly (99%) indicates spam,
probably sent by a poorly written script.  Most Japanese mailers
don't generate such headers.

> (2) whether the common mail
> readers (outlook or some web mails) read this header correctly.  If
> most Japanese users can read this header through some workaround in
> their mail reader, it may be worth patching evolution or thunderbird.

At least the EZweb mailer (on KDDI mobile phones) can read this header.
It seems to auto-detect ISO-2022-JP, Shift_JIS and US-ASCII, but
other encodings, such as UTF-8, ISO-8859-1 and EUC-JP, are unsupported.
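
As an illustration of that kind of fallback, here is a minimal Python
sketch.  It is not how EZweb or any other mailer actually works; the
helper name decode_b_word and the fallback order are my own assumptions.
It base64-decodes a single B-encoded word, tries the declared charset
first, and then falls back to Shift_JIS:

```python
import base64
import re

# Hypothetical helper for illustration; not taken from any mailer in this thread.
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[Bb]\?([^?]*)\?=")

def decode_b_word(word, fallbacks=("iso-2022-jp", "shift_jis")):
    """Return (text, charset_used) for a single B-encoded RFC 2047 word."""
    match = ENCODED_WORD.fullmatch(word.strip())
    if match is None:
        raise ValueError("not a B-encoded RFC 2047 word")
    declared, payload = match.group(1).lower(), match.group(2)
    raw = base64.b64decode(payload)
    # Try the declared charset first, then each fallback in order.
    for charset in (declared, *fallbacks):
        try:
            return raw.decode(charset), charset
        except (UnicodeDecodeError, LookupError):
            continue
    raise ValueError("payload matches none of the candidate charsets")

# The header quoted above: labeled ISO-2022-JP, but the bytes are 8-bit.
text, used = decode_b_word("=?iso-2022-jp?B?glKCT5VigsWPgJT1iq6XuQ==?=")
print(used, text)
```

Since ISO-2022-JP is a 7-bit encoding, the strict decode fails on the
8-bit payload and the Shift_JIS fallback is what succeeds here.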

My favorite mailers, Mew and Wanderlust, don't support this header,
but I don't find that inconvenient.  To investigate a spam message, I
can fix the header by hand.

For the message body, Mew supports auto-detection, and a different
charset name or language name can also be specified.  E.g. `C-u C-c C-l
shift_jis RET' decodes the message body as Shift_JIS even if
charset="ISO-2022-JP" is declared.  `C-c C-l Japanese RET' selects
auto-detection that prefers ISO-2022-JP, EUC-JP and Shift_JIS to other
encodings.
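
The idea of a forced charset versus ordered auto-detection can be
sketched in Python as follows.  This is only an illustration by trial
decoding, not Mew's implementation; the function name decode_body and
its parameters are hypothetical.

```python
# Preference order mentioned above; trial decoding is a rough heuristic.
JAPANESE_ORDER = ("iso-2022-jp", "euc-jp", "shift_jis")

def decode_body(raw, override=None, candidates=JAPANESE_ORDER):
    """Decode body bytes, returning (text, charset_used).

    override   -- charset forced by the user; the declared label is ignored
    candidates -- charsets tried in order when no override is given
    """
    if override is not None:
        return raw.decode(override), override
    for charset in candidates:
        try:
            return raw.decode(charset), charset
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate charsets fit")

body = "日本語のテスト".encode("shift_jis")
print(decode_body(body))                        # auto-detection
print(decode_body(body, override="shift_jis"))  # forced charset
```

Note that the EUC-JP and Shift_JIS byte ranges overlap, so trial
decoding can misidentify some short inputs; a real mailer would need
something more careful.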

BTW, before MIME was defined, most Japanese mailers used ISO-2022-JP
without a MIME charset name.  So auto-detection of ISO-2022-JP is
necessary to read old Japanese mails that follow this de facto
standard.
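
For such unlabeled messages, one simple check is to look for the
ISO-2022-JP escape sequences that switch into a Japanese character set.
A rough Python sketch, with hypothetical function names and an assumed
US-ASCII fallback:

```python
# Escape sequences that switch ISO-2022-JP into JIS X 0208 (1978 and 1983).
KANJI_IN = (b"\x1b$@", b"\x1b$B")

def looks_like_iso2022jp(raw):
    """Guess whether unlabeled bytes are ISO-2022-JP."""
    if any(byte >= 0x80 for byte in raw):
        return False  # ISO-2022-JP is a 7-bit encoding
    return any(seq in raw for seq in KANJI_IN)

def decode_unlabeled(raw):
    """Decode a body that carries no charset label at all."""
    charset = "iso-2022-jp" if looks_like_iso2022jp(raw) else "us-ascii"
    return raw.decode(charset, errors="replace")

sample = "こんにちは".encode("iso-2022-jp")
print(looks_like_iso2022jp(sample), decode_unlabeled(sample))
```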

Tatsuya Kinoshita
