Re: fstrcmp

To: debian-devel@lists.debian.org
Subject: Re: fstrcmp
From: Jérôme Pouiller <jezz@sysmic.org>
Date: Tue, 2 Jun 2009 14:32:06 +0200
Message-id: <200906021432.06846.jezz@sysmic.org>
In-reply-to: <87ab4rylgw.fsf@graviton.dyn.troilus.org>
References: <1243734577.27145.6.camel@hawk> <200906021124.59303.jezz@sysmic.org> <87ab4rylgw.fsf@graviton.dyn.troilus.org>

On Tuesday 02 June 2009 12:32:47 Michael Poole wrote:
> Jérôme Pouiller writes:
[...]
> > It is naive to think the matching algorithm iterates over all items
> > until it finds the correct one. At the least, the algorithm uses a
> > sorted index with a dichotomic search.
> >
> > Nevertheless, your idea is interesting. But you should implement a
> > function to match the nearest string in a set of strings. Take a
> > look at spell-checking libraries to get an idea of how to implement
> > it.
>
> Isn't this optimization premature?  I would say: Package the library,
> implement the fuzzy matching, and if it is too slow for people to
> like the case where they misspell a package name, *then* optimize for
> run-time.  I would rather have the fuzzy matching sooner than have it
> shave a few milliseconds off the display time for a correction.

In another thread, Adeodato Simó wrote:
> I can't see how it'd work here, at least without the help of some
> on-disk structure, since we're talking about a space of 25,000
> packages.

A naive search for the closest match in a set of 25,000 strings is
something like 2000 times slower (maybe far more) than a proper
algorithm. Adeodato thinks that makes it unusable. I said we shouldn't
reject the idea because of performance, and I suggested that a better
algorithm should be used.
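
To make "a better algorithm" a bit more concrete, here is a minimal
sketch of one well-known option, a BK-tree (Burkhard-Keller tree) over
Levenshtein distance. It is an illustration only: it is not how fstrcmp
or any Debian tool works, the choice of Levenshtein as the metric is an
assumption, and the class, function, and package list below are made up
for the example. The point is that the triangle inequality lets a query
skip whole subtrees instead of comparing against every string.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


class BKTree:
    """Hypothetical BK-tree: each edge is labelled with the distance
    between the child word and its parent word."""

    def __init__(self, words):
        self.root = None  # a node is a (word, {distance: child_node}) pair
        for word in words:
            self.add(word)

    def add(self, word):
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = levenshtein(word, node[0])
            if d == 0:
                return  # word is already indexed
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child

    def search(self, query, max_distance):
        """Return sorted (distance, word) pairs within max_distance of query."""
        results = []
        stack = [self.root] if self.root else []
        while stack:
            word, children = stack.pop()
            d = levenshtein(query, word)
            if d <= max_distance:
                results.append((d, word))
            # Triangle inequality: only subtrees whose edge label lies in
            # [d - max_distance, d + max_distance] can contain a match.
            for edge, child in children.items():
                if d - max_distance <= edge <= d + max_distance:
                    stack.append(child)
        return sorted(results)


# Hypothetical package names, for illustration only.
packages = ["apt", "aptitude", "bash", "coreutils", "dpkg", "okular"]
tree = BKTree(packages)
print(tree.search("aptitide", 2))  # [(1, 'aptitude')]
```

The tree is built once (it could live in an on-disk cache, as Adeodato
suggests) and each lookup then visits only the branches that can still
contain a candidate within the allowed distance. How much this saves on
the real package list would have to be measured.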


-- 
Jérôme Pouiller (jezz AT sysmic DOT org)
