Package: dpkg Version: 1.10.10 Severity: normal Tags: patch Hi, I have been working on getting dpkg to run a bit faster, and one of the big time-wasters in it was the package hash table. Originally written to cope with a suite of 300 or so packages, 128 buckets seemed okay and the naive hash function was sufficient. With a suite of 13,000 plus packages, the hash table was utterly inadequate. Attached is a patch which resizes the hashtable to 8191 chains of buckets. This is the largest prime less than 8192 and offers an improvement (over common things like dpkg -p or dpkg -l) of between 20 and 30 percent on the runtime. An option would be to reduce the table to only 4093 chains, but that reduces the gains to between 20 and 25 percent. The patch also replaces the naive hash implementation with an implementation of FNV (Fowler/Noll/Vo) which is a good low-collision hash function, giving nice even spreads across the hash chains. The patch was developed against version 1.10.10 of dpkg but since lib/database.c has hardly changed in goodness knows how many releases, it should cleanly apply to CVS head or a 1.10.10 branch. Regards, Daniel -- Daniel Silverstone http://www.digital-scurf.org/ Hostmaster, Webmaster, and Chief Code Wibbler Digital-Scurf Unlimited GPG Public key available from keyring.debian.org KeyId: 20687895 You will remember something that you should not have forgotten.
Attachment:
database.patch.gz
Description: Binary data