[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#206416: dpkg package hash table insufficient



On Wed, Aug 20, 2003 at 10:04:07PM +0100, Daniel Silverstone wrote:
> Let's have a look...
> *hauls out...
>  (c) Introduction to algorithms, Cormen, Leiserson & Rivest
> and opens them up*
> (c) Page 228 "The division method"
>   When using the division method, we usually avoid certain values of m.
>   For example, m should not be a power of 2...
>   ...good values for m are primes, not too close to exact powers of 2.

Reading further:

---

"The multiplication method"

The /multiplication method/ for creating hash functions operates in two
steps. First, we multiply the key /k/ by a constant /A/ in the range 0 < A
< 1 and extract the fractional part of /kA/. Then, we multiply this value
by /m/ and take the floor of the result. In short, the hash function is:

	h(k) = |_ m(kA mod 1) _|

where "hA mod 1" means the fractional part of kA, that is kA - |_ kA _|.

An advantage of the multiplication method is that the value of /m/ is not
critical. We typically choose it to be a power of 2 -- m = 2^p for some integer
/p/ -- since we can then easily implement the function on most computers as
follows. [...]

---

Apparently Knuth suggests A = (sqrt(5) - 1) / 2 ~= 0.6180339887...,
so if the word size is w (32 bits, eg), then 0 <= k < 2^w, then

	s * 2^w + r = k |_ A * 2^w _|

(note 0 <= s,r < 2^w), and the hash value is the p most significant
bits of r.

For comparison, britney uses:

long strhash(const char *x, unsigned char pow) {
    unsigned long i = 0;
    while (*x) {
        i = (i * 39 + *x) % (1UL << pow);
        x++;
    }
    return i;
}

to generate an index into an array of 2^pow elements from package
names. It works pretty well. pow is generally 14 (ie, a hash size
of 16,384).

Cheers,
aj

-- 
Anthony Towns <aj@humbug.org.au> <http://azure.humbug.org.au/~aj/>
I don't speak for anyone save myself. GPG signed mail preferred.

       ``Is this some kind of psych test?
                      Am I getting paid for this?''

Attachment: pgpWS4iIkZYwy.pgp
Description: PGP signature


Reply to: