I'm not a Debian developer, just a curious onlooker who hasn't seen all of these messages, so I could be completely off base with my understanding of how things work. But it was my understanding that the bundled MD5 inside a .deb file isn't there for security; it's just there to make sure the packages arrived in one piece and weren't corrupted, and for that purpose it's still perfectly adequate. The "security", or validity of the packages' origin, is ensured by the digital signature on the packages or repos. A malicious package forged to match a desired MD5 would still fail a digital signature check.
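As a rough illustration of that distinction, here is a minimal Python sketch of a checksum used purely as a corruption check. The file content and the helper names (md5_hex, arrived_intact) are made up for the example, and this is not a claim about how dpkg itself is implemented.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()


def arrived_intact(data: bytes, expected_md5: str) -> bool:
    """Integrity check only: True if data still matches the recorded checksum."""
    return md5_hex(data) == expected_md5


# Hypothetical file content standing in for something shipped in a package.
original = b"#!/bin/sh\necho hello\n"
recorded = md5_hex(original)

# Simulate accidental corruption by flipping one bit of the first byte.
corrupted = bytes([original[0] ^ 0x01]) + original[1:]

print(arrived_intact(original, recorded))
print(arrived_intact(corrupted, recorded))
```

Note what the check does not tell you: anyone able to replace the file could replace the recorded checksum along with it, so a match says nothing about where the data came from. That part has to come from a signature over the package or the repository metadata.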
On 2024-11-07 21:30:26 -0500 (-0500), Jeffrey Walton wrote:
On 2024-11-07 16:45:54 -0500 (-0500), David Campbell wrote:
[...]
dpkg currently uses MD5 to verify packages, but MD5 is considered
insecure, why not switch to SHA256 (and also update lintian)?
[...]
MD5 is considered insecure to collision attacks, but mounting one
would require that the creator of the original file intentionally
pick content that can hash to the same value as some malicious
content (and even that is nontrivial, but let's set that aside for
the moment).
I think Marc Stevens' work on Chosen-Prefix Collisions is of
interest. A chosen-prefix collision on MD5 currently costs around
2^39 operations, which is well within reach of adversaries.
[...]
Yes, which is the "even that is nontrivial" bit to which I alluded,
wherein I meant a single party constructing two functional Debian
packages which hash to the same MD5 checksum, one of which is
malicious. There may be some tricks that can be played based on
common sections created by some archive implementations and padding
with arbitrary offsets, but when you introduce compression into the
mix I have a feeling it trends toward impractical. (The example with
two X.509 certs is sort of a special case which takes advantage of
nuances of the format itself.)
An attacker constructing anything functional with the same checksum
as an existing package published by someone else is another matter
entirely, and what I expect the typical user mistakenly imagines
when they see MD5 hashes and have a knee-jerk reaction based on the
contextless warnings they've been bombarded with for years about the
insecurity of the algorithm.
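To put rough numbers on the gap between those two scenarios, the sketch below compares the ~2^39 figure quoted above for a chosen-prefix collision with the 2^128 work of a generic brute-force second-preimage search against a 128-bit hash. The throughput (ASSUMED_RATE) is an arbitrary assumption chosen for illustration, not a measurement, and the two work units are treated as comparable for the sake of the estimate.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Assumed, purely illustrative throughput in MD5 evaluations per second.
ASSUMED_RATE = 1e9


def seconds_needed(log2_work: int, rate: float = ASSUMED_RATE) -> float:
    """Time in seconds to perform 2**log2_work evaluations at the given rate."""
    return 2 ** log2_work / rate


# Attacker controls both files (chosen-prefix collision, figure from the thread).
collision = seconds_needed(39)

# Attacker must match someone else's existing file (generic second preimage).
second_preimage = seconds_needed(128)

print(f"chosen-prefix collision (~2^39): {collision / 60:.1f} minutes")
print(f"generic second preimage (2^128): {second_preimage / SECONDS_PER_YEAR:.2e} years")
```

Whatever rate is plugged in, the ratio between the two stays at 2^89, which is the point: the first is a practical attack for someone who authors both files, while the second remains out of reach by brute force.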
Probably the most convincing reason to replace such uses of MD5 is
that we collectively get to stop wasting time answering this same
question over and over and over...