
Re: Search engine for documentation indexing?



Ian Zimmerman writes:
> 3. xapian-omega.  This seems to be the one modern apps are migrating to,
> I heard of the Gnus mail/newsreader acquiring a xapian based search
> function.  But, out of the box it cannot index gzipped files (and most
> documents in /usr/share/doc other than HTML pages are gzipped), and
> there doesn't seem to be a way to add a user-defined filter either
> to compensate for this (swish-e has user filters).

Automatically uncompressing gzipped files for indexing isn't hard to
do, but what can you link to for them in the search results?  Of the
four web browsers I just tried, only w3m showed the contents of
file:///usr/share/doc/coreutils/README.gz rather than downloading
it for me.  The same seems to be true of
http://localhost/doc/coreutils/README.gz.
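
To illustrate the uncompressing half of that, here is a minimal sketch in
Python of reading a document's text for indexing whether or not it is
gzipped. It is not how omega does it: the function name read_doc_text is
made up for this example, and it assumes the files are text that decodes
as UTF-8 (bad bytes are replaced rather than rejected).

```python
import gzip
from pathlib import Path


def read_doc_text(path, encoding="utf-8"):
    """Return the text of a file, decompressing it first if it ends in .gz.

    Hypothetical helper for illustration only; not part of Xapian or omega.
    """
    path = Path(path)
    # gzip.open accepts the same text-mode arguments as the built-in open.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding=encoding, errors="replace") as fh:
        return fh.read()
```

For example, read_doc_text("/usr/share/doc/coreutils/README.gz") would
return the README as a string ready to hand to an indexer. The question of
what URL to show in the results is untouched by this.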

Currently you have to modify the source to add support for new file
formats, but there is at least a detailed FAQ entry which leads you
through how to do that:

http://trac.xapian.org/wiki/FAQ/OmegaNewFileFormat

This really should be possible via a configuration file, but nobody's
got around to sorting that out yet.

But as others have said, recoll is probably a better choice for a
Xapian-based solution for a desktop situation anyway.

Cheers,
    Olly

