[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: fstrcmp



On Tuesday 02 June 2009 12:32:47 Michael Poole wrote:
> Jérôme Pouiller writes:
[...]
> > It is naive to think matching algorithm iterates on all items until
> > it find the correct one. At least, algorithm use a sorted index
> > with a dichotomy search.
> >
> > Nevertheless, your idea is interesting. But you should implement a
> > function to match the nearest string in a set of strings. Take a
> > look in spell checking libraries to have an idea how to implement
> > it.
>
> Isn't this optimization premature?  I would say: Package the library,
> implement the fuzzy matching, and if it is too slow for people to
> like the case where they misspell a package name, *then* optimize for
> run-time.  I would rather have the fuzzy matching sooner than have it
> shave a few milliseconds off the display time for a correction.

In another thread, Adeodato Simó wrote:
> I can't see how it'd work here, at least without the help of some
> on-disk structure, since we're talking about a space of 25,000
> packages.

Naive search of matching string under a set of 25,000 strings is 
something like 2000 times slower (maybe far more) than a correct 
algorithm. Adeodato thinks it is not usable. I said we shouldn't reject 
idea because of performances and I suggested a better algorithm should 
be used.


-- 
Jérôme Pouiller (jezz AT sysmic DOT org)


Reply to: