
Re: "Sitemap" webpage



Hi,

Thank you again for submitting your fix.  Now the page is good.

At Sat, 7 Jul 2001 09:29:02 +0200 (CEST),
peter karlsson <peter@softwolves.pp.se> wrote:

> and those were not matched properly. However, I seem to have missed a
> quotation mark missing in the regexp, it should read:
> 
>    $title =~ s/^#use .* title="(.+?)(" .*$|"$|"\e.*$)/$1/;
>                                               ^

Imagine an ISO-2022-JP string that has a JIS X 0208 part followed by
an ASCII part.  When the JIS X 0208 part ends with the byte 0x22, that
byte and the escape sequence after it match "\e, so the regexp ends
the title in the middle of a character and thus fails.
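
To make the failure concrete, here is a small sketch.  It is a port of
the quoted Perl regexp to Python's re module (which has no \e, so ESC
is written \x1b), not the sitemap script itself.  The header line, the
template name and the NOHEADER attribute are made-up sample data.  The
title is HIRAGANA LETTER A, whose JIS X 0208 code is 0x2422, followed
by an ASCII part.

```python
import re

# The regexp from the quoted message, ported to a Python bytes pattern.
WITH_ESC = re.compile(rb'^#use .* title="(.+?)(" .*$|"$|"\x1b.*$)')
# The same regexp with the "\e alternative removed.
WITHOUT_ESC = re.compile(rb'^#use .* title="(.+?)(" .*$|"$)')

# Hypothetical title: U+3042 (JIS X 0208 0x2422, second byte 0x22)
# followed by an ASCII part.
title = "\u3042 abc".encode("iso2022_jp")

# Hypothetical header line built around that title.
line = b'#use wml::debian::template title="' + title + b'" NOHEADER="yes"'

# With the "\e alternative, the 0x22 byte inside the character is taken
# as the closing quote because the switch back to ASCII follows it.
truncated = WITH_ESC.sub(rb"\1", line)

# Without it, the real closing quote (followed by a space) is found.
intact = WITHOUT_ESC.sub(rb"\1", line)

print(truncated)
print(intact)
```

With the "\e alternative the result stops after the first byte of the
two-byte character; without it the whole title comes back.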

Rather, I think the Japanese pages like

>    title="<switch to 0208>DBCS<switch to 0201>"<switch to ASCII><space>

are not appropriate, even though they are not illegal.  I imagine such
pages were generated by buggy editors or by inappropriate use of
editors.  Thus, I think the Right Thing is to remove "\e from the
regexp.  However, even though I can fix these pages now, if such pages
can be generated accidentally in the future, the sitemap page could
break again.  Thus, I think your way is appropriate, though not
strictly correct.
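
The other side of the trade-off can be sketched the same way.  Again
this is a Python port of the regexp (ESC written as \x1b), and the
header line is made-up sample data shaped like the quoted example: a
JIS X 0208 title, a switch to JIS X 0201 Roman, the closing quote, and
only then the switch to ASCII.

```python
import re

# The regexp from the quoted message, ported to a Python bytes pattern.
WITH_ESC = re.compile(rb'^#use .* title="(.+?)(" .*$|"$|"\x1b.*$)')
# The same regexp with the "\e alternative removed.
WITHOUT_ESC = re.compile(rb'^#use .* title="(.+?)(" .*$|"$)')

# Hypothetical title: U+65E5 U+672C in JIS X 0208.  The encoder ends
# with ESC ( B (ASCII); replace that with ESC ( J (JIS X 0201 Roman)
# to imitate the pages described above.
title = "\u65e5\u672c".encode("iso2022_jp")[:-3] + b"\x1b(J"

# The closing quote is followed by ESC ( B, not by a space.
line = b'#use wml::debian::template title="' + title + b'"\x1b(B NOHEADER="yes"'

# With the "\e alternative, the closing quote is recognized.
good = WITH_ESC.sub(rb"\1", line)

# Without it, no quote before the end of the line qualifies, so the
# match runs on to the last quote and drags the attributes along.
overrun = WITHOUT_ESC.sub(rb"\1", line)

print(good)
print(overrun)
```

So a page of this shape is handled only when the "\e alternative is
present, which is why keeping it protects the sitemap against such
pages even though it mishandles titles like the one described earlier.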

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/


