[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#206416: dpkg package hash table insufficient



severity 206416 important
thanks

On Wed, 20 Aug 2003, Daniel Silverstone wrote:

> I have been working on getting dpkg to run a bit faster, and one of the
> big time-wasters in it was the package hash table. Originally written to
> cope with a suite of 300 or so packages, 128 buckets seemed okay and the
> naive hash function was sufficient.
>
> With a suite of 13,000 plus packages, the hash table was utterly
> inadequate.
>
> Attached is a patch which resizes the hashtable to 8191 chains of
> buckets. This is the largest prime less than 8192 and offers an
> improvement (over common things like dpkg -p or dpkg -l) of between 20
> and 30 percent on the runtime. An option would be to reduce the table to
> only 4093 chains, but that reduces the gains to between 20 and 25
> percent.
>
> The patch also replaces the naive hash implementation with an
> implementation of FNV (Fowler/Noll/Vo) which is a good low-collision
> hash function, giving nice even spreads across the hash chains.
>
> The patch was developed against version 1.10.10 of dpkg but since
> lib/database.c has hardly changed in goodness knows how many releases,
> it should cleanly apply to CVS head or a 1.10.10 branch.

Hmm.  After reading this thread, my interest is quite high.  I'll check this
patch out this weekend, and if my tests pass, I'll approve it for inclusing
into 1.10.11.

Since you've done so much work, maybe you could do the same to the file
database?

ps: increased to important, to move it up in the list.




Reply to: