
Re: [PATCH] libdpkg: Use OpenSSL for hashing.



Hi!

On Sun, 2023-07-02 at 11:44:26 +0200, Sebastian Andrzej Siewior wrote:
> On 2023-07-01 13:16:17 [+0200], Guillem Jover wrote:
> > On Sat, 2023-07-01 at 00:03:53 +0200, Sebastian Andrzej Siewior wrote:
> > > Would it be acceptable to switch it?
> > 
> > I gave the reasoning for switching from the embedded MD5
> > implementation to libmd at
> > <https://lists.debian.org/debian-devel/2022/07/msg00045.html>, and I
> > think it still holds. (This would also imply pulling OpenSSL or any
> > other crypto library into the current essential-set, which are not
> > small compared to the minimal libmd library.)
> 
> I can't comment on that. libssl is pretty much part of every
> installation. libgcrypt is part of every debootstrap due to gpg.
> The essential-set seems to be something different and looking at libmd's
> size of 60KiB vs libssl 6MiB it is hard to argue for libssl :)
> As I said, can't comment on that but thanks for the background.

The more important issue, which I mentioned in the linked post, is that
AFAIK these crypto libraries can end up refusing to provide
implementations for things that have been disabled due to some policy
(for security reasons on unsafe algos, or stuff like FIPS), which is
not really what is desired here.
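
To illustrate the refusal behaviour outside of libdpkg: Python's hashlib,
when built against OpenSSL, can raise ValueError for MD5 on a
policy-restricted system. This is only a sketch and not dpkg code; it
assumes Python 3.9 or later (for the usedforsecurity parameter), and the
helper name integrity_md5 is made up for the example.

```python
import hashlib


def integrity_md5(data: bytes) -> str:
    """Return the MD5 hex digest of data, for integrity use only."""
    try:
        # On a policy-restricted build (for example FIPS mode) the
        # library may refuse to provide MD5 and raise ValueError.
        h = hashlib.md5(data)
    except ValueError:
        # Declaring the non-security use asks the backend to provide
        # the algorithm anyway, where the build supports doing so.
        h = hashlib.md5(data, usedforsecurity=False)
    return h.hexdigest()
```

A plain checksum consumer has to know about and opt out of the policy,
which is the kind of dependency on library configuration that a small
dedicated digest library avoids.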

> It popped up on my side due to popping in perf testing ;)

I've pondered adding assembler-optimized versions of some of
these functions to libmd, but the complexity didn't seem worth it.
But I think this could be revisited.

> > This is supposedly documented in the deb-md5sums(5) man page, and
> > perhaps should be made more clear in the man page documenting the
> > «dpkg -V» option, so I'm happy to try to clarify these.
> 
> ach. Years ago I used something different for it. Good to know that dpkg
> itself supports it.

I've queued the attached patch, hoping that might help a bit with the
docs.

> > With the fsys metadata work, it will be easier to add new digests, but
> > that implies an increase in db size or control members in .deb, so I'm
> > not sure whether it's really worth it. There are people that want to
> > also include per-file signatures (such as IMA stuff) in the mtree
> > metadata in the .deb, so that would cover the security side of things,
> > but that would go counter to reproducibility, so I'm not seeing that
> > happening easily, and I expect there will probably be concerns about
> > lock-in and similar.
> 
> Oh I see. So based on what I read, it is just a checksum kind of thing
> so xxh128 would be a perfect replacement. But I do understand that you
> need to maintain things and adding an additional digest means adding and
> keeping the older one for compatibility reasons.

Yes. It's good though that there are xxhsum CLI tools so that these can
easily interoperate. If adding something new though, an advantage of
using a SHA-2 variant, for example, is that those tools are provided
in coreutils which is also part of the essential set.

> Since it popped on my perf testing, do we need to verify the md5sum
> during installation? I tried installation of firefox (since it is a
> little big) with libmd, openssl and then telling dpkg to just do nothing
> and compare the runtime. I didn't do that because the installation
> process involved man-db taking some time and I was worried that it might
> fiddle with the results.

You should be able to disable man-db processing with:

  $ echo 'man-db man-db/auto-update boolean false' | debconf-set-selections

On mechanical disks I'd expect fsync()/sync_file_range()/etc. to
dominate; on SSDs I'd expect the decompression to dominate. But
I might be wrong, so if you feel like digging into this and getting
some numbers, that might shed some light. And perhaps adding
asm-optimized variants to libmd might just do it?
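
A rough way to get a first impression of decompression versus digest
cost is an in-memory comparison like the one below. This is a sketch
under several assumptions: it uses Python's lzma and hashlib rather than
dpkg's own code paths, it runs on synthetic data, and it does not
measure fsync() or any other I/O at all. The names time_stage and
compare_stages are hypothetical.

```python
import hashlib
import lzma
import time


def time_stage(func, *args) -> float:
    """Return the wall-clock seconds taken by one call to func(*args)."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


def compare_stages(payload: bytes) -> dict[str, float]:
    """Time xz decompression and MD5 hashing of the same payload."""
    compressed = lzma.compress(payload)

    def md5_digest(data: bytes) -> bytes:
        return hashlib.md5(data, usedforsecurity=False).digest()

    return {
        "decompress": time_stage(lzma.decompress, compressed),
        "md5": time_stage(md5_digest, payload),
    }


if __name__ == "__main__":
    # Synthetic, highly compressible 1 MiB sample; a real test should
    # use the data.tar member of an actual .deb instead.
    sample = bytes(range(256)) * 4096
    for stage, seconds in compare_stages(sample).items():
        print(f"{stage}: {seconds:.4f}s")
```

Numbers from this say nothing about a real installation; profiling dpkg
itself with the man-db trigger disabled remains the meaningful
measurement.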

> Then I tried "dpkg-deb -x" but didn't see md5
> in log so it seems to be skipped.

Yeah, you should think of «dpkg-deb -x» as being closer to tar than to
dpkg in that sense: it will skip many of the actions and checks done
by the latter.

> If adding a fast replacement is difficult could we skip doing the md5
> check during installation?

Adding new digests should not be hard; it's more a question of whether
it's worth it: increased db space usage, potential false sense of
security, what it pulls in, availability, etc. For multiarch
refcounting we must compute the digests to check whether the files are
the same. For unpacking, if there is no md5sums file we should compute
them too, to have the digests around. We have also computed the digests
for all other files on unpacking even when there was an md5sums file,
because sometimes packages contained incomplete or out-of-sync
information in their md5sums files. :/
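
The incomplete or out-of-sync cases can be sketched as a comparison
between the shipped md5sums entries and digests computed from the
unpacked contents. This is not how dpkg implements it; it is a minimal
illustration that assumes each md5sums line is a hex digest, whitespace,
and a pathname, and that file contents are available in memory. The
function name audit_md5sums and the report keys are hypothetical.

```python
import hashlib


def audit_md5sums(md5sums_text: str, files: dict[str, bytes]) -> dict[str, list[str]]:
    """Compare shipped md5sums entries against the actual file contents."""
    shipped = {}
    for line in md5sums_text.splitlines():
        if not line.strip():
            continue
        # Assumed layout: "<hex digest>  <pathname>".
        digest, path = line.split(None, 1)
        shipped[path] = digest.lower()

    report = {"missing": [], "mismatch": [], "stale": []}
    for path, content in sorted(files.items()):
        actual = hashlib.md5(content, usedforsecurity=False).hexdigest()
        if path not in shipped:
            # File is in the package but has no md5sums entry.
            report["missing"].append(path)
        elif shipped[path] != actual:
            # Entry exists but does not match the unpacked content.
            report["mismatch"].append(path)
    # Entries that refer to files the package does not contain.
    report["stale"] = sorted(set(shipped) - set(files))
    return report
```

Any non-empty list in the report is a reason not to trust the shipped
file alone, which is why computing the digests at unpack time is the
safer default.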

Thanks,
Guillem
From 03c1cfe858b83db8f02344f095ff4966c1313891 Mon Sep 17 00:00:00 2001
From: Guillem Jover <guillem@debian.org>
Date: Mon, 3 Jul 2023 23:18:13 +0200
Subject: [PATCH] man: Clarify that the md5sums checks as integrity and not
 security checks

This is hinted at in the text and in the deb-md5sums(5) man page, but
it's worth making this point very clear lest someone incorrectly
assume they can verify their system is free from attacks or malicious
modifications.

Prompted-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
---
 man/dpkg.pod | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/man/dpkg.pod b/man/dpkg.pod
index 20112591c..b7d056fab 100644
--- a/man/dpkg.pod
+++ b/man/dpkg.pod
@@ -338,6 +338,8 @@ of the file contents against the stored value in the files database.
 It will only get checked
 if the database contains the file md5sum. To check for any missing
 metadata in the database, the B<--audit> command can be used.
+This is only an integrity check and should not be considered as any
+kind of security verification.
 
 The output format is selectable with the B<--verify-format>
 option, which by default uses the B<rpm> format, but that might
@@ -1083,6 +1085,8 @@ information available.
 =item 3 ‘B<5>’
 
 The digest check failed, which means the file contents have changed.
+This is only an integrity check and should not be considered as any
+kind of security verification.
 
 =item 4-9 ‘B<?>’
 
-- 
2.40.1

