
Re: MD5 collisions found - alternative?



On Thu, Aug 26, 2004 at 01:04:21AM +0200, Almut Behrens wrote:
> ...and I think somewhere in between lie hashing functions like crc32,
> as used for detecting transmission errors, for example.  Those are
> not cryptographic, but possess a sufficiently large output space, so we
> can expect few random collisions for most practical purposes.

I wouldn't call a CRC a hash code, although I suppose you can use it
that way. It is really an error-detecting (and, in some variants,
error-correcting) code that does have the ability, in a sense, to go
backwards. It stands for Cyclic Redundancy Check. Such codes are
redundant data included with a transmission, not a hash of it. Some of
them allow correction of multiple bit errors; typically you can detect
one more bit of error than you can correct.
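To make the "redundant data included with the transmission" point
concrete, here is a rough sketch in Python using the standard zlib
module; the send/receive helpers are just made-up names for
illustration, not any real protocol:

import zlib

def send(payload: bytes) -> bytes:
    # Append the 4-byte CRC-32 of the payload; the checksum travels
    # alongside the data rather than replacing it.
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def receive(frame: bytes) -> bytes:
    payload, crc = frame[:-4], int.from_bytes(frame[-4:], "big")
    if zlib.crc32(payload) != crc:
        raise ValueError("CRC mismatch: transmission error detected")
    return payload

frame = send(b"hello, world")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]   # flip one bit "in transit"
receive(frame)        # fine
# receive(corrupted)  # would raise ValueError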
 
> Right, but I believe that a uniform mapping _also_ is a desirable
> property (besides speed, of course) of hashing functions as used to
> compute table lookup indices -- as this property assures that the data
> storage locations will be spread as evenly as possible across the
> available buckets, which in turn minimizes the time spent on resolving
> collisions (on average).  And, as practical considerations in this case
> always enforce a rather small output space (i.e. number of buckets)
> we're certainly expecting collisions here. [1]

Well, in theory, yes. In practice you usually aren't much fussed about
some variance in bucket utilization unless you're working on something
with a real need for speed. For example, I would bet there are some
really, really good uniform hash functions used inside of gcc.
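For what it's worth, here is a quick way to eyeball bucket utilization
in Python. The djb2-style string hash below is just an illustrative
toy; I'm not claiming it is what gcc actually uses:

from collections import Counter

def djb2(s: str) -> int:
    # Classic multiply-by-33 string hash, masked to 32 bits.
    h = 5381
    for ch in s:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return h

NBUCKETS = 64
names = ["symbol_%d" % i for i in range(10000)]
counts = Counter(djb2(name) % NBUCKETS for name in names)

print("average per bucket:", len(names) / NBUCKETS)
print("fullest bucket:    ", max(counts.values()))
print("emptiest bucket:   ", min(counts.values()))

Some buckets end up fuller than others, and for most table lookups
that is perfectly tolerable.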
 
> > 	* randomness. Input strings which differ by 1 bit in any
> > 	  position generate hash keys a random distance apart
> 
> I'd add:
>   * huge size of the output space (with its upper limit corresponding
>     to the number of bits of the hash value).  The probability of
>     accidentally finding a collision is of course directly related to
>     the size of the output space (assuming a uniform mapping).

Yes, can't argue there. That is where the basic difference between a
typical hash function and a cryptographic hash lies. You want small
keyspaces and very simple functions to generate lookup keys, whereas
you don't much care about the function overhead for a cryptographic
hash, as you tend not to compute it nearly as often.
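As a quick illustration of the randomness point quoted above, the
following Python snippet (standard hashlib module) flips one input bit
and counts how many output bits change; for a decent cryptographic
hash it should be roughly half of them:

import hashlib

def bit_diff(a: bytes, b: bytes) -> int:
    # Number of bit positions in which the two digests differ.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg1 = b"The quick brown fox"
msg2 = bytes([msg1[0] ^ 0x01]) + msg1[1:]   # differs from msg1 by one bit

for name in ("md5", "sha256"):
    h1 = hashlib.new(name, msg1).digest()
    h2 = hashlib.new(name, msg2).digest()
    print("%s: %d of %d output bits differ"
          % (name, bit_diff(h1, h2), len(h1) * 8))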
 
> If anyone knows of any other requirements, please feel free to chime
> in... Well, OTOH, this would probably be getting a little off-topic for
> debian-security (especially the debian aspect).

-- 
------------------------------------------------------
   Dale Amon     amon@islandone.org    +44-7802-188325
       International linux systems consultancy
     Hardware & software system design, security
    and networking, systems programming and Admin
	      "Have Laptop, Will Travel"
------------------------------------------------------
