Re: "Sitemap" webpage
At Fri, 6 Jul 2001 17:21:34 +0200,
Josip Rodin <email@example.com> wrote:
>> my $title = `egrep '^#use .* title=' $page `; chomp $title;
>> $title =~ s/^#use .* title="([^"]+)".*$/$1/;
> I suppose we could just change that regexp to match everything after the
> opening double quote up to the next space, and strip off the ending quote.
One refinement: some titles may contain whitespace, so a space alone
cannot mark the end of the title. Instead, the end of the title should
be detected as two consecutive bytes: a double quote (0x22) followed by
a whitespace or linefeed byte.
This is safe for ISO-2022-JP. A string cannot end in the JIS X 0208
shift state (where 0x22 may appear as part of a Japanese character),
so such a 0x22 cannot appear at the end of the title string. Moreover,
since the byte range of JIS X 0208 is 0x21 - 0x7e, a 0x22 byte followed
by a whitespace byte cannot occur inside Japanese text in an
ISO-2022-JP string.
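
To make the idea concrete, here is a rough sketch of how the match could
look. This is only an illustration, not a tested patch for the real
script: the helper name extract_title and the sample "#use" lines are
made up for the example. It treats the closing quote as the first 0x22
byte that is followed by whitespace or the end of the line.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): pull the title out
# of a "#use ... title=" line. The closing quote is the first 0x22 byte
# followed by whitespace or end of line, not simply the first 0x22 byte.
sub extract_title {
    my ($line) = @_;
    chomp $line;
    return $1 if $line =~ /^#use .* title="(.*?)"(?=\s|$)/;
    return;    # no title attribute found
}

# Hiragana "a" in ISO-2022-JP: ESC $ B, then 0x24 0x22, then ESC ( B.
# Its second byte is 0x22, which is what cuts the [^"]+ pattern short.
my $hiragana_a = "\x1b\x24\x42\x24\x22\x1b\x28\x42";

# Illustrative input lines (assumed, not taken from the real pages).
for my $line (
    qq{#use wml::debian::template title="Debian Sitemap" NOHEADER="yes"\n},
    qq{#use wml::debian::template title="$hiragana_a"\n},
) {
    my $title = extract_title($line);
    printf "title is %d bytes long\n", length $title;
}
```

With the [^"]+ pattern the second title would be cut off at the 0x22
inside the Japanese character. Here that byte is followed by ESC, not by
whitespace, so the match continues to the real closing quote.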
Could some CVS committer please implement this to
Tomohiro KUBOTA <firstname.lastname@example.org>
"Introduction to I18N" http://www.debian.org/doc/manuals/intro-i18n/