[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#977994: apt: Output from sandboxed methods should not be trusted



On 12/31/20 6:03 AM, David Kalnischkies wrote:
> On Wed, Dec 30, 2020 at 09:32:32PM -0500, Demi M. Obenour wrote:
>> That is true.  Nevertheless, if we are going to put in the work to
>> confine the methods, we should also make sure they cannot escape
>> their confinement.
> 
> The confinement can never be perfect given that we always have to let
> data escape to actually work with it. Its also fatal to go into
> discussion with an attitude of demanding perfectism as nothing else
> registers as improvement over the status quo.
> 
> Is APT perfect? Certainly not. Was it a good idea to not run the http
> method as root? It surely was work, but we hope it was an improvement
> even if not bulletproof and hence not to your liking. Can we do more?
> Yes, if time and energy permits.

Indeed.  Running the HTTP method not as root was absolutely an
improvement, and perfection is not possible.

>> 1. libapt-pkg spawns its methods as normal.
>>
>> 2. When libapt-pkg requests that a file be downloaded, it creates
>>    a socketpair and passes one end to the method using SCM_RIGHTS.
>>    libapt-pkg reads from the other end in a background thread.
>>
>> 3. The method downloads the file as normal, but instead of writing it
>>    to a file, it writes it to the file descriptor it received above.
>>    libapt-pkg reads this data, hashes it, and writes it to a temporary
>>    file.
> 
> You fail to mention who is (e.g. in the case of https) creating the
> connection to the server, keeping it alive, handling handshakes (https),
> reading/running proxy configuration and pipelining(http1.1).

The method would be responsible for all of that.

> The later can actually cause the method to change the file it is working
> with if it detects a server has messed up pipelining. I presume
> a potential http2 can do that on purpose due to their streams.

I was not aware of that.  

> Other things to consider are partial downloads (one optional input file),
> pdiff patches (potentially infinite amount of input files) and methods
> who do not produce a verifiable stream of data as output (e.g. gpgv).
> 
> They all produce countless different messages accompanying the raw data
> though, like progress reports and error messages – both based at least
> tangibly if not directly on (untrusted) data from the server.
> 
> You also fail to mention how you handle e.g. redirects, which was the
> start of this whole CVE message passing ordeal.
> 
> 
>> 4. The method returns its response and closes the file descriptor.
> 
> Oh, so you are passing a message from the method to libapt. How is that
> message secured given it is data coming from a source (= the method) you
> do not trust?

I am assuming that the message parser in libapt is trusted.

>> 5. libapt-pkg reads the response.  It reads and closes the file
>>    descriptor, and checks the hash against the signed manifest.
> 
> The size of unprocessed data in a pipe is not infinite. You make it
> sound like you could download a 4 GB deb to an open file descriptor and
> leave it there until the process is done. In reality you have to
> constantly process that data (= calculate hash) and store on disk long
> before the process is done.

100% agree.  libapt would presumably do this on a background thread.

> (Also, just so we be clear: There is no "the" hash. APT supports
>  multiple, the involved methods support multiple, the repository
>  supports multiple, and these (at least) three sets hopefully have
>  a non-empty intersection.)

Poor phrasing on my part.

>>    If the file being downloaded is a diff, then the hash of the diff
>>    is checked.
> 
> As said, e.g. the method "store" takes e.g. a Contents.xz (which passed
> hashes check), extracts it (= subject to e.g. an extraction bomb) and
> produces a Contents.lz4. The result can not be verified. The
> intermediate product (= the uncompressed data) only this method sees can.
> Or are you saying that because the file passed hashes once, its
> absolutely no longer hostile (I mentioned extraction bomb for a reason)
> and no mistakes could have been made?

Correct.  Once a strong hash of the file has been verified, I assume
that it can be trusted.  So I would run “store” as a separate
user than the untrusted “http” method.

Could mistakes have been made?  Absolutely.  But I don’t consider
such mistakes to be any more likely than they are now.

> Similar, why only check the hash of the diff if we can and should also
> check the hash of the file it produced (potentially as an intermediate
> as its on disk storage is potentially again a compressed file).
> 
> With that train of thought we can just use https and be done with it.
> 
> 
>> This is safe.  If the method returns malicious data, libapt-pkg will
>> detect it.  The hashing and writing can be performed in a background
>> thread, so there is very little performance penalty.
> 
> God forbid that someone runs `sed -i -e 's#background thread#method#'` on
> this sentence. The word "thread" doesn't make it magically more secure,
> actually less as a thread has a lot of shared context.

What I meant is that the hashing and writing (which I consider to be
trusted) should happen in a separate process than downloading files
from the Internet.  The code that does the hashing is in the trusted
computing base anyway, and the likelyhood of a vulnerability is (IMO)
very small.

Sincerely,

Demi Obenour

Attachment: OpenPGP_signature
Description: OpenPGP digital signature


Reply to: