On Tuesday 02 June 2009 12:32:47 Michael Poole wrote:
> Jérôme Pouiller writes:
> > It is naive to think matching algorithm iterates on all items until
> > it find the correct one. At least, algorithm use a sorted index
> > with a dichotomy search.
> > Nevertheless, your idea is interesting. But you should implement a
> > function to match the nearest string in a set of strings. Take a
> > look in spell checking libraries to have an idea how to implement
> > it.
> Isn't this optimization premature? I would say: Package the library,
> implement the fuzzy matching, and if it is too slow for people to
> like the case where they misspell a package name, *then* optimize for
> run-time. I would rather have the fuzzy matching sooner than have it
> shave a few milliseconds off the display time for a correction.
In another thread, Adeodato Simó wrote:
> I can't see how it'd work here, at least without the help of some
> on-disk structure, since we're talking about a space of 25,000
Naive search of matching string under a set of 25,000 strings is
something like 2000 times slower (maybe far more) than a correct
algorithm. Adeodato thinks it is not usable. I said we shouldn't reject
idea because of performances and I suggested a better algorithm should
Jérôme Pouiller (jezz AT sysmic DOT org)