[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#206416: dpkg package hash table insufficient



Package: dpkg
Version: 1.10.10
Severity: normal
Tags: patch

Hi,

I have been working on getting dpkg to run a bit faster, and one of the
big time-wasters in it was the package hash table. Originally written to
cope with a suite of 300 or so packages, 128 buckets seemed okay and the
naive hash function was sufficient.

With a suite of 13,000 plus packages, the hash table was utterly
inadequate.

Attached is a patch which resizes the hashtable to 8191 chains of
buckets. This is the largest prime less than 8192 and offers an
improvement (over common things like dpkg -p or dpkg -l) of between 20
and 30 percent on the runtime. An option would be to reduce the table to
only 4093 chains, but that reduces the gains to between 20 and 25
percent.

The patch also replaces the naive hash implementation with an
implementation of FNV (Fowler/Noll/Vo) which is a good low-collision
hash function, giving nice even spreads across the hash chains.

The patch was developed against version 1.10.10 of dpkg but since
lib/database.c has hardly changed in goodness knows how many releases,
it should cleanly apply to CVS head or a 1.10.10 branch.

Regards,

Daniel

-- 
Daniel Silverstone                               http://www.digital-scurf.org/
Hostmaster, Webmaster, and Chief Code Wibbler          Digital-Scurf Unlimited
GPG Public key available from keyring.debian.org               KeyId: 20687895
You will remember something that you should not have forgotten.

Attachment: database.patch.gz
Description: Binary data


Reply to: