
Re: "Sitemap" webpage



Hi,

At Fri, 6 Jul 2001 17:21:34 +0200,
Josip Rodin <joy@cibalia.gkvk.hr> wrote:

>>   my $title = `egrep '^#use .* title=' $page `; chomp $title;
>>   $title =~ s/^#use .* title="([^"]+)".*$/$1/;

> I suppose we could just change that regexp to match everything after the
> opening double quote up to the next space, and strip off the ending quote.

Nice idea.

One refinement: some titles may contain whitespace.  The end of
the title should therefore be detected as a double quote byte
immediately followed by a whitespace or linefeed byte.
ISO-2022-JP text cannot end in the JIS X 0208 shift state (where
0x22 may appear as part of a Japanese character), so such a 0x22
cannot be the last byte of the title string.  Moreover, since the
byte range of JIS X 0208 is 0x21 - 0x7e, a 0x22 byte followed by
a whitespace byte cannot occur inside the JIS X 0208 part of an
ISO-2022-JP string.
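
As a rough sketch of what I mean (untested against the real
sitemap.wml; the helper name extract_title is made up for
illustration, and it takes the "#use" line as a string instead of
calling egrep), the match could stop at the first double quote that
is followed by whitespace or the end of the line:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of sitemap.wml: pull the title out of
# a "#use ... title="..."" line.  The title ends at the first double
# quote that is followed by whitespace or by the end of the line, so a
# 0x22 byte inside a JIS X 0208 sequence is not taken as the end.
sub extract_title {
    my ($line) = @_;
    chomp $line;
    if ($line =~ /^#use .* title="(.+?)"(?:\s|$)/) {
        return $1;
    }
    return;    # no title attribute found
}

# A title containing whitespace, followed by another attribute.
my $plain = '#use wml::debian::template title="Debian Site Map" NOHEADER="yes"';
print extract_title($plain), "\n";    # prints "Debian Site Map"

# An ISO-2022-JP title: ESC $ B, three two-byte characters (the second
# one is 0x21 0x22, so it contains a 0x22 byte), then ESC ( B.
my $jp      = "\e\$B\x25\x35\x21\x22\x25\x24\e(B";
my $jp_line = qq{#use wml::debian::template title="$jp"};
print "whole ISO-2022-JP title kept\n" if extract_title($jp_line) eq $jp;
```

With the current [^"]+ pattern the second title would be cut at
the 0x22 byte inside the Japanese text.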

Could a CVS committer please implement this in
webwml/english/sitemap.wml?

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/


