Re: glimpse sucks.



In article <20000920132900.A12463@cibalia.gkvk.hr>
joy@cibalia.gkvk.hr writes:

>> > > This is a showstopper, as we have 2 old and 2 new Chinese lists in the
>> > > archive. :/ It would be very bastardish to leave those lists with Glimpse.
>> > > Then again, I'm not sure Glimpse works fine with those lists, being so old,
>> > > made in rather i18n-deprived times... Can someone confirm that searching
>> > > -chinese-* lists produces correct output, please? (Anthony?)
>> > 
>> > I've talked to the upstream people and while they are keen, they have no
>> > idea how to implement it.  The main problem is none of their programmers
>> > live in a place with dual-byte character sets.
>> 
>> Maybe they can get help from the people who developed namazu... just a hint.

Did you call me? :-)

Hmm... I don't know much about Chinese. I think it is hard to determine
word boundaries in Chinese, so a word segmentation tool is needed to
process it (like kakasi or chasen for Japanese). I looked at the
output of "apt-cache search chinese", but there seems to be no such
tool...
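
To show what I mean by word segmentation, here is a rough Python sketch
of greedy longest-match segmentation. This is only an illustration: the
function name, the tiny dictionary and the max_len limit are all my own
assumptions, and it is not how kakasi or chasen actually work inside.

```python
def segment(text, dictionary, max_len=4):
    """Greedy forward longest-match segmentation (illustrative only)."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character
        # when nothing in the dictionary matches at this position.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy dictionary; a real tool needs a large lexicon.
TOY_DICT = {"中文", "搜索", "邮件", "列表"}

print(segment("中文邮件列表搜索", TOY_DICT))
```

Without a good dictionary this falls back to single characters, which
is why a dedicated tool is needed.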

There is another solution: the "letter indexing approach". However,
that approach is more difficult to implement than the "word indexing
approach". It would probably be hard to implement in Glimpse.
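
As a rough idea of letter indexing, here is a minimal Python sketch
that indexes overlapping character pairs (bigrams) instead of words, so
no word boundaries are needed. The function names and the sample
documents are hypothetical, and this is not code from Glimpse or
namazu; it only shows the principle.

```python
from collections import defaultdict


def build_bigram_index(docs):
    """Map each overlapping character pair to the ids of docs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for i in range(len(text) - 1):
            index[text[i:i + 2]].add(doc_id)
    return index


def search(index, docs, query):
    """Return the sorted ids of docs that contain query as a substring."""
    if len(query) < 2:
        # Too short to form a bigram: fall back to a plain scan.
        return sorted(d for d, t in docs.items() if query in t)
    grams = [query[i:i + 2] for i in range(len(query) - 1)]
    # Docs must contain every bigram of the query...
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # ...but the bigrams may be scattered, so verify against the text.
    return sorted(d for d in candidates if query in docs[d])


# Hypothetical sample documents.
DOCS = {1: "中文邮件列表", 2: "日本語の検索", 3: "邮件搜索"}
INDEX = build_bigram_index(DOCS)
print(search(INDEX, DOCS, "邮件"))
```

The index grows much larger than a word index and every hit needs the
verification step, which is part of why it is harder to implement.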
-- 
NOKUBI Takatsugu
E-mail: knok@daionet.gr.jp
	knok@debian.or.jp (Debian-JP)


