On 12/31/20 6:03 AM, David Kalnischkies wrote: > On Wed, Dec 30, 2020 at 09:32:32PM -0500, Demi M. Obenour wrote: >> That is true. Nevertheless, if we are going to put in the work to >> confine the methods, we should also make sure they cannot escape >> their confinement. > > The confinement can never be perfect given that we always have to let > data escape to actually work with it. Its also fatal to go into > discussion with an attitude of demanding perfectism as nothing else > registers as improvement over the status quo. > > Is APT perfect? Certainly not. Was it a good idea to not run the http > method as root? It surely was work, but we hope it was an improvement > even if not bulletproof and hence not to your liking. Can we do more? > Yes, if time and energy permits. Indeed. Running the HTTP method not as root was absolutely an improvement, and perfection is not possible. >> 1. libapt-pkg spawns its methods as normal. >> >> 2. When libapt-pkg requests that a file be downloaded, it creates >> a socketpair and passes one end to the method using SCM_RIGHTS. >> libapt-pkg reads from the other end in a background thread. >> >> 3. The method downloads the file as normal, but instead of writing it >> to a file, it writes it to the file descriptor it received above. >> libapt-pkg reads this data, hashes it, and writes it to a temporary >> file. > > You fail to mention who is (e.g. in the case of https) creating the > connection to the server, keeping it alive, handling handshakes (https), > reading/running proxy configuration and pipelining(http1.1). The method would be responsible for all of that. > The later can actually cause the method to change the file it is working > with if it detects a server has messed up pipelining. I presume > a potential http2 can do that on purpose due to their streams. I was not aware of that. > Other things to consider are partial downloads (one optional input file), > pdiff patches (potentially infinite amount of input files) and methods > who do not produce a verifiable stream of data as output (e.g. gpgv). > > They all produce countless different messages accompanying the raw data > though, like progress reports and error messages – both based at least > tangibly if not directly on (untrusted) data from the server. > > You also fail to mention how you handle e.g. redirects, which was the > start of this whole CVE message passing ordeal. > > >> 4. The method returns its response and closes the file descriptor. > > Oh, so you are passing a message from the method to libapt. How is that > message secured given it is data coming from a source (= the method) you > do not trust? I am assuming that the message parser in libapt is trusted. >> 5. libapt-pkg reads the response. It reads and closes the file >> descriptor, and checks the hash against the signed manifest. > > The size of unprocessed data in a pipe is not infinite. You make it > sound like you could download a 4 GB deb to an open file descriptor and > leave it there until the process is done. In reality you have to > constantly process that data (= calculate hash) and store on disk long > before the process is done. 100% agree. libapt would presumably do this on a background thread. > (Also, just so we be clear: There is no "the" hash. APT supports > multiple, the involved methods support multiple, the repository > supports multiple, and these (at least) three sets hopefully have > a non-empty intersection.) Poor phrasing on my part. >> If the file being downloaded is a diff, then the hash of the diff >> is checked. > > As said, e.g. the method "store" takes e.g. a Contents.xz (which passed > hashes check), extracts it (= subject to e.g. an extraction bomb) and > produces a Contents.lz4. The result can not be verified. The > intermediate product (= the uncompressed data) only this method sees can. > Or are you saying that because the file passed hashes once, its > absolutely no longer hostile (I mentioned extraction bomb for a reason) > and no mistakes could have been made? Correct. Once a strong hash of the file has been verified, I assume that it can be trusted. So I would run “store” as a separate user than the untrusted “http” method. Could mistakes have been made? Absolutely. But I don’t consider such mistakes to be any more likely than they are now. > Similar, why only check the hash of the diff if we can and should also > check the hash of the file it produced (potentially as an intermediate > as its on disk storage is potentially again a compressed file). > > With that train of thought we can just use https and be done with it. > > >> This is safe. If the method returns malicious data, libapt-pkg will >> detect it. The hashing and writing can be performed in a background >> thread, so there is very little performance penalty. > > God forbid that someone runs `sed -i -e 's#background thread#method#'` on > this sentence. The word "thread" doesn't make it magically more secure, > actually less as a thread has a lot of shared context. What I meant is that the hashing and writing (which I consider to be trusted) should happen in a separate process than downloading files from the Internet. The code that does the hashing is in the trusted computing base anyway, and the likelyhood of a vulnerability is (IMO) very small. Sincerely, Demi Obenour
Attachment:
OpenPGP_signature
Description: OpenPGP digital signature