Re: Test search engine
In article <20000908164305.A6425@eye-net.com.au>
csmall@eye-net.com.au writes:
>> For the double-byte character guys there's some bad news: it apparently
>> doesn't handle these yet.
I have not checked umdsearch, but it would need a "word segmentation"
step for some languages (such as Japanese).
Some languages put no spaces between words, so word boundaries are
not marked explicitly in the text.
kakasi and chasen can segment Japanese words. I don't know about other
languages...
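
To illustrate the problem, here is a minimal Python sketch. It is only a
crude heuristic that cuts the text wherever the script changes (kanji,
hiragana, katakana, other). It is not how kakasi or chasen work; as far
as I know those rely on dictionaries. The names script_of and
naive_segment are made up for this example.

```python
import unicodedata


def script_of(ch):
    """Return a coarse script label for one character (illustrative only)."""
    name = unicodedata.name(ch, "")
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("KATAKANA"):
        return "katakana"
    if name.startswith("CJK UNIFIED"):
        return "kanji"
    if ch.isspace():
        return "space"
    return "other"


def naive_segment(text):
    """Split text wherever the script label changes; whitespace is dropped."""
    tokens = []
    current = ""
    current_script = None
    for ch in text:
        s = script_of(ch)
        # A change of script closes the token collected so far.
        if s != current_script and current:
            tokens.append(current)
            current = ""
        if s != "space":
            current += ch
        current_script = s
    if current:
        tokens.append(current)
    return tokens


sentence = "私は日本語を話します"  # "I speak Japanese"
print(sentence.split())         # whitespace splitting yields a single token
print(naive_segment(sentence))  # script changes give a rough segmentation
```

Splitting on whitespace returns the whole sentence as one "word", which
is useless for indexing. The script-change heuristic does better on this
sentence, but it cannot split a run of kanji or a run of hiragana that
contains several words, so a real indexer would still need a proper
segmenter.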
If a multilingual word segmentation tool becomes available, an i18n
search engine could be built.
--
NOKUBI Takatsugu
E-mail: knok@daionet.gr.jp
knok@debian.or.jp (Debian-JP)