
"Sitemap" webpage



Hi,

I found that some items on the Japanese version of the "Sitemap" page
are broken.

   http://www.debian.org/sitemap.ja.html

I looked into this problem and found the cause.  However, before
explaining it, I have to explain the encoding used for
Japanese web pages.

Japanese web pages (wml sources) are written in the ISO-2022-JP
encoding.  In this encoding, each Japanese character (from the JIS X 0208
character set) is represented as a pair of bytes in the range
0x21 - 0x7e, i.e., exactly the same byte values as ASCII.  (The
ISO-2022-JP encoding consists of the ASCII and JIS X 0208 character
sets.)  ASCII characters and Japanese (JIS X 0208) characters are
distinguished by escape sequences.
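
To illustrate the encoding described above, here is a minimal Python
sketch.  It assumes Python's standard "iso2022_jp" codec; the helper
name double_byte_payload is hypothetical and only for illustration.
It extracts the byte pair that sits between the two escape sequences
for a single hiragana character.

```python
# Escape sequences used by ISO-2022-JP to switch character sets.
ESC_TO_JISX0208 = b"\x1b$B"   # switch to JIS X 0208
ESC_TO_ASCII = b"\x1b(B"      # switch back to ASCII


def double_byte_payload(text: str) -> bytes:
    """Return the JIS X 0208 bytes between the first pair of escapes.

    Hypothetical helper for illustration; assumes `text` contains
    at least one JIS X 0208 character.
    """
    encoded = text.encode("iso2022_jp")
    start = encoded.index(ESC_TO_JISX0208) + len(ESC_TO_JISX0208)
    end = encoded.index(ESC_TO_ASCII, start)
    return encoded[start:end]


payload = double_byte_payload("あ")
print(payload.hex())  # prints 2422
```

Both bytes fall in the range 0x21 - 0x7e, and the second one, 0x22,
has the same value as the ASCII double quote.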

When a Japanese wml page has a Japanese title
(in the #use wml::debian::template title="xxxx" line) containing
a Japanese character that has 0x22 (DOUBLE QUOTE) as one byte
of its pair, a problem occurs.  It seems that the wml
parser mistakes the accidental 0x22 for a quote character.
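
As a rough way to find titles that could trigger this, the following
Python sketch lists the characters of a string whose ISO-2022-JP
encoding contains the byte 0x22.  It assumes Python's standard
"iso2022_jp" codec; the function name risky_characters is
hypothetical, and the check is independent of wml itself.

```python
def risky_characters(title: str) -> list:
    """List non-ASCII characters whose ISO-2022-JP bytes include 0x22.

    Hypothetical helper for illustration only.
    """
    risky = []
    for ch in title:
        try:
            encoded = ch.encode("iso2022_jp")
        except UnicodeEncodeError:
            continue  # character not representable in ISO-2022-JP
        # Escaped characters start with ESC (0x1b); a plain ASCII '"'
        # does not, so a genuine quote character is not reported.
        if encoded.startswith(b"\x1b") and 0x22 in encoded:
            risky.append(ch)
    return risky


print(risky_characters("あい"))  # prints ['あ']
```

Running such a check over the title strings of the Japanese wml
sources would give a list of candidate pages to inspect.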

The following pages are examples that suffer from this problem.

   webwml/japanese/related_links.wml
   webwml/japanese/ports/index.wml
   webwml/japanese/ports/mips/index.wml

Does anyone have an idea how to solve this problem?

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/


