*To*: Daniel Silverstone <dsilvers@digital-scurf.org>, 206416@bugs.debian.org*Cc*: Falk Hueffner <falk.hueffner@student.uni-tuebingen.de>*Subject*: Bug#206416: dpkg package hash table insufficient*From*: Anthony Towns <aj@azure.humbug.org.au>*Date*: Thu, 21 Aug 2003 14:28:10 +1000*Message-id*: <20030821042810.GB11317@azure.humbug.org.au>*Reply-to*: Anthony Towns <aj@azure.humbug.org.au>, 206416@bugs.debian.org*In-reply-to*: <20030820210407.GA23066@digital-scurf.org>*References*: <20030820165617.GA20912@digital-scurf.org> <87d6f0w3gh.fsf@student.uni-tuebingen.de> <20030820210407.GA23066@digital-scurf.org>

On Wed, Aug 20, 2003 at 10:04:07PM +0100, Daniel Silverstone wrote: > Let's have a look... > *hauls out... > (c) Introduction to algorithms, Cormen, Leiserson & Rivest > and opens them up* > (c) Page 228 "The division method" > When using the division method, we usually avoid certain values of m. > For example, m should not be a power of 2... > ...good values for m are primes, not too close to exact powers of 2. Reading further: --- "The multiplication method" The /multiplication method/ for creating hash functions operates in two steps. First, we multiply the key /k/ by a constant /A/ in the range 0 < A < 1 and extract the fractional part of /kA/. Then, we multiply this value by /m/ and take the floor of the result. In short, the hash function is: h(k) = |_ m(kA mod 1) _| where "hA mod 1" means the fractional part of kA, that is kA - |_ kA _|. An advantage of the multiplication method is that the value of /m/ is not critical. We typically choose it to be a power of 2 -- m = 2^p for some integer /p/ -- since we can then easily implement the function on most computers as follows. [...] --- Apparently Knuth suggests A = (sqrt(5) - 1) / 2 ~= 0.6180339887..., so if the word size is w (32 bits, eg), then 0 <= k < 2^w, then s * 2^w + r = k |_ A * 2^w _| (note 0 <= s,r < 2^w), and the hash value is the p most significant bits of r. For comparison, britney uses: long strhash(const char *x, unsigned char pow) { unsigned long i = 0; while (*x) { i = (i * 39 + *x) % (1UL << pow); x++; } return i; } to generate an index into an array of 2^pow elements from package names. It works pretty well. pow is generally 14 (ie, a hash size of 16,384). Cheers, aj -- Anthony Towns <aj@humbug.org.au> <http://azure.humbug.org.au/~aj/> I don't speak for anyone save myself. GPG signed mail preferred. ``Is this some kind of psych test? Am I getting paid for this?''

**Attachment:
pgpmqErt6P1Ts.pgp**

**References**:**Bug#206416: dpkg package hash table insufficient***From:*Daniel Silverstone <dsilvers@digital-scurf.org>

**Bug#206416: dpkg package hash table insufficient***From:*Falk Hueffner <falk.hueffner@student.uni-tuebingen.de>

**Bug#206416: dpkg package hash table insufficient***From:*Daniel Silverstone <dsilvers@digital-scurf.org>

- Prev by Date:
**Bug#206500: dpkg: mouse within xterm - undesirable** - Next by Date:
**RE:Re: Approved** - Previous by thread:
**Bug#206416: dpkg package hash table insufficient** - Next by thread:
**Bug#206416: dpkg package hash table insufficient** - Index(es):