
Re: Warning Linux Mint Website Hacked and ISOs replaced with Backdoored Operating System



On Tue, 23 Feb 2016, David Wright wrote:
> 1) I do what fdupes does, ie identify files (in a benevolent
>    environment) using the MD5 signature to detect duplicate
>    contents.

MD5 alone can be somewhat dangerous even in benevolent environments: if the
data sets are large enough or you are just unlucky, you are going to hit a
collision and corrupt or lose data on dedup sooner or later.

At least use data-size + hash.  But even that won't save you from
collisions... the "full fix" is to use the hash (or size + hash) as a screen
to detect possible matches: when it matches, compare the two data sets to
ensure they're really equal before you trigger the dedup.
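
The screen-then-verify idea above can be sketched roughly as follows.  This
is only an illustration in Python, not what fdupes does internally; the
names file_digest and find_duplicates are made up for the example, and MD5
is used purely as a screening hash.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks; the digest is only a screen, not proof of equality."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(paths):
    """Return groups of paths whose contents are byte-for-byte identical."""
    # Step 1: screen on (size, hash) to find *possible* matches.
    candidates = defaultdict(list)
    for p in paths:
        candidates[(os.path.getsize(p), file_digest(p))].append(p)

    # Step 2: confirm every candidate with a full content comparison,
    # so a hash collision can never be mistaken for a duplicate.
    groups = []
    for same_key in candidates.values():
        confirmed = []
        for p in same_key:
            for group in confirmed:
                if filecmp.cmp(group[0], p, shallow=False):
                    group.append(p)
                    break
            else:
                confirmed.append([p])
        groups.extend(g for g in confirmed if len(g) > 1)
    return groups
```

Only the groups returned by find_duplicates would then be handed to the
actual dedup step.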

I am not going to bother with the detail that you need to ensure one of the
data sets can't/didn't change under you between the comparison and the dedup
getting committed to storage.
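
For what it is worth, one way to at least narrow that window is sketched
below, assuming the dedup is done by hard-linking on a POSIX filesystem.
The function name dedup_by_hardlink and the ".dedup-tmp" suffix are
hypothetical, and this does not fully close the race: a writer can still
sneak in between the last check and the rename.

```python
import filecmp
import os


def _sig(path):
    """Cheap change detector: inode, size and modification time."""
    st = os.stat(path)
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def dedup_by_hardlink(keep, dup):
    """Replace dup with a hard link to keep if both are verified identical."""
    if os.path.samefile(keep, dup):
        return True  # already the same file, nothing to do

    before = (_sig(keep), _sig(dup))
    if not filecmp.cmp(keep, dup, shallow=False):
        return False  # the screen matched but the contents differ
    if (_sig(keep), _sig(dup)) != before:
        return False  # something changed while we were comparing

    # Link under a temporary name first, then rename over dup, so dup
    # is never missing even if we are interrupted half way.
    tmp = dup + ".dedup-tmp"  # hypothetical temporary name
    os.link(keep, tmp)
    os.replace(tmp, dup)
    return True
```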

> 2) In view of your statement that faster hashes exist, I would
>    like to explore replacing my use of MD5 by such a hash.

Any wide-enough hash will do if you use it just for screening, where you
don't care about any security properties of the hash.  And at that point,
you might as well use a wide-enough CRC (ensure it is vectorizable and get
the compiler to vectorize it!) if it proves to be faster than crypto
hashes...
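
As a rough illustration of a CRC used purely as a screen, here is a sketch
using zlib.crc32 from the Python standard library.  Note the assumptions:
CRC-32 is only 32 bits wide, which may well not count as "wide enough" for
large data sets, so the sketch pairs it with the file size and still
expects a full comparison afterwards.  The names file_crc32 and screen_key
are made up for the example.

```python
import os
import zlib


def file_crc32(path, chunk_size=1 << 20):
    """Compute the CRC-32 of a file incrementally, one chunk at a time."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc


def screen_key(path):
    """(size, CRC-32) key for finding possible matches only.

    Equal keys mean "maybe equal": compare the data before deduplicating.
    """
    return (os.path.getsize(path), file_crc32(path))
```

Whether this actually beats a crypto hash depends on the implementation
and the hardware, so measure before switching.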

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

