[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#206416: dpkg package hash table insufficient



On Wed, Aug 20, 2003 at 09:09:50PM +0200, Falk Hueffner wrote:
> > Attached is a patch which resizes the hashtable to 8191 chains of
> > buckets. This is the largest prime less than 8192 and offers an
> > improvement (over common things like dpkg -p or dpkg -l) of between
> > 20 and 30 percent on the runtime. An option would be to reduce the
> > table to only 4093 chains, but that reduces the gains to between 20
> > and 25 percent.
> There is no reason to make the number of buckets a prime number unless
> either your hash function really sucks or you use double hashing. In
> fact, it is a bad idea to do so, since modulo powers of two is a lot
> faster than general modulo.

Let's have a look...

*hauls out...
 (a) Practical Algorithms for Programmers, Binstock & Rex
 (b) The Art of Computer Programming Vol 3. "Sorting and Searching", Knuth
 (c) Introduction to algorithms, Cormen, Leiserson & Rivest

and opens them up*

(a) Page 90, Section 'Performance Issues' in the Hashing chapter

 Bullet point two:

 o Be sure that the number of slots in each hash table is a prime number

(a) goes on to say:

 Common sense tells us that the maximum dispersal of remainders occurs
 when we divide by a prime number. Practice bears out this assumption.
 The worst possible division is to divide by a multiple of 2 since this
 serves only to mask off bits in the number being divided.

(b) Page 516

 ... such considerations suggest we choose m to be a prime number ...

(c) Page 228 "The division method"

  When using the division method, we usually avoid certain values of m.
  For example, m should not be a power of 2...
  ...good values for m are primes, not too close to exact powers of 2.


So, ultimately, I agree that 8191 isn't the ideal number, perhaps we
want to drop a few back from there and consider 8161 instead.

Regards,

Daniel.
 
-- 
Daniel Silverstone                               http://www.digital-scurf.org/
Hostmaster, Webmaster, and Chief Code Wibbler          Digital-Scurf Unlimited
GPG Public key available from keyring.debian.org               KeyId: 20687895
You are deeply attached to your friends and acquaintances.



Reply to: