[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: Plan C (was Re: Plan B for fixing 5.8.2 binary API)



Nicholas Clark wrote:

Plan C is to add a per hash flag to say if the flag is using a non-standard
hash function. This defaults to off (ie standard hash function).
So normally nothing extra happens (except for 1 extra flag check)

Any hash that is flagged as non-standard hashing throws away any passed in
pre-computed hash, and rehashes using the custom function. (which will be
the same for all custom hashes - the algorithm as of 5.8.1 with the random
seed). All this rehashing happens entirely in hv.o, so we have plan A
behaviour there.

hsplit is modified to count the length of the longest linked lists as it
splits a hash. If it finds that the longest is over some threshold (eg 50%
of all hash values are in one list after splitting) then the data is
pathological, and *that hash* switches strategy.
hsplit sets non-standard hashing, turns off shared hash keys, and iterates
over the entire hash, recalculating hash values (using the "custom"
function).

I'm probably being dense, but what does switching behaviour on the fly gain us? Why not just use the 2nd strategy *all* the time, and avoid the need for a flag check?

Me no understand...

--
Alan Burlison
--



Reply to: