Re: Handling of entropy during boot

To: "Theodore Y. Ts'o" <tytso@mit.edu>
Cc: debian-devel@lists.debian.org
Subject: Re: Handling of entropy during boot
From: Stefan Fritsch <sf@sfritsch.de>
Date: Tue, 8 Jan 2019 10:41:55 +0100 (CET)
Message-id: <alpine.DEB.2.20.1901081007340.3795@manul.sfritsch.de>
In-reply-to: <20181224000517.GA5531@mit.edu>
References: <5877331.emXNK5WogZ@k> <20181218191158.GA8974@mit.edu> <1869958.NYsBfqgiqi@k> <20181224000517.GA5531@mit.edu>

On Sun, 23 Dec 2018, Theodore Y. Ts'o wrote:

> On Sun, Dec 23, 2018 at 05:52:31PM +0100, Stefan Fritsch wrote:
> > I think some other questions should be considered first. Did Debian protect 
> > from these attacks in the past? The answer is clearly no. Now, should we break 
> > the systems of those people who keep their random-seed file secret and don't 
> > clone their OS image, in order to offer some protection to other people? This 
> > is really what we need to answer first, and in my opinion, we should try very 
> > hard not to break the systems of those users. And I see no other way than to 
> > credit the random seed file with entropy.
> 
> I don't think this line of reasoning is valid.  Suppose there was a
> horrific security hole, such that 10% of publicly available SSH
> hosts had insecurely shared public keys such that they were vulnerable
> to being guessed[1].  Clearly, in the past (before we knew about such a
> vulnerability) we did not protect those systems against this attack.
> Does this mean we shouldn't in the future?  I don't think it so
> follows!

If the security issue only affects a small percentage of the installations 
and fixing it means breaking many other installations, then there has to 
be a discussion about whether we really want to fix the issue or whether 
a "don't do that" note in the documentation is the better choice.

> [1] Mining your p's and q's: Widespread Weak Keys in Network Devices.
> https://factorable.net

> There is a balancing test that has to go on here.  And quite frankly
> Raspberry Pis are extremely problematic devices from a security
> perspective.  They use a coarse-grained clock, so it's very hard to
> get good entropy out of timing events, and the hardware that they
> have on them is such that there aren't many events that we can use to
> generate entropy in the first place.

Raspberry Pis were only an example. There are also other systems, including 
old x86 systems, that don't have a HWRNG. Also, there are probably a load 
of x86 VMs that emulate an older CPU due to libvirt misconfiguration and 
don't expose the rdrand cpuid bit. Will the Linux kernel try to detect 
rdrand by catching the UD exception, or does it trust the cpuid bit?
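
To see whether a given guest is affected, one can at least check what the 
kernel reports. A minimal sketch in Python, assuming an x86 Linux system 
where /proc/cpuinfo has a "flags" line; the function name is only for 
illustration:

```python
def cpu_has_flag(cpuinfo_text, flag="rdrand"):
    """Return True if any 'flags' line in the cpuinfo text lists the flag."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only x86 calls this field "flags"; other architectures use other
        # names, so this check is x86-specific.
        if sep and key.strip() == "flags" and flag in value.split():
            return True
    return False


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print("rdrand visible:", cpu_has_flag(f.read()))
```

This only shows the cpuid bit as seen by the guest. It does not answer 
whether the instruction would work if the bit is hidden.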

> I'm not sure that it's a great idea to weaken *all* Debian systems to
> the security of Raspberry Pis, including x86 servers and laptops, just
> because one platform has crappy hardware with respect to getting
> secure random numbers.

Systems that don't suffer from blocking on entropy because they have other 
sources of entropy (hwrng, ...) won't have their security reduced because 
the good entropy will still be added to the pool, regardless of the seed 
file being credited or not.

> So perhaps the right answer is we have one default value for certain
> architectures, or maybe classes of devices (e.g., a server-class ARM64
> device is very different from a IOT-style ARM platform).
> 
> > 
> > One could also make it harder for an attacker to regenerate key material from 
> > a system where he knows the seed file. For example, if there is a RTC one could 
> > put the boot time and all serial numbers / MAC addresses that one can find into 
> > an expensive function like PBKDF2 or bcrypt and feed the result to the random 
> > seed. This way, even if the attacker has an approximate knowledge of most of 
> > that information, he would still need to spend quite a bit of computing power 
> > to get all the possible random seeds that could be used.
> 
> We mix things like serial numbers and MAC addresses into the random
> pool already.  Unfortunately, if the attacker can snoop the
> random-seed file, it's likely he or she can simply obtain the MAC
> addresses or serial numbers of the device.

Including the boot time would help if this was done with sufficient 
granularity, but the boot time can probably leak through things like TCP 
timestamps, too. Making it more expensive for an attacker to try all 
possible values may still be a good idea.
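
To make the idea concrete, here is a minimal sketch in Python. The function 
names, the iteration count and the choice of salt are my assumptions, not 
an existing tool. Writing to /dev/urandom only mixes the data into the 
pool; it does not credit any entropy.

```python
import hashlib


def stretch_boot_identifiers(boot_time_ns, identifiers, salt, iterations=200_000):
    """Derive 32 bytes from the boot time and device identifiers.

    PBKDF2-HMAC-SHA256 makes every guess of (boot time, identifiers)
    cost the attacker `iterations` HMAC evaluations.
    """
    material = boot_time_ns.to_bytes(8, "big")
    for ident in identifiers:
        raw = ident.encode("utf-8")
        # Length-prefix each field so that different splits cannot collide.
        material += len(raw).to_bytes(2, "big") + raw
    return hashlib.pbkdf2_hmac("sha256", material, salt, iterations)


def mix_into_pool(data, path="/dev/urandom"):
    """Mix data into the kernel pool without crediting entropy."""
    with open(path, "wb") as dev:
        dev.write(data)
```

The salt could be the content of the seed file, and the boot time could 
come from time.time_ns() if there is an RTC. The iteration count would 
have to be tuned for slow hardware.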

> > If the number of rounds in the function depends on timing, like do
> > as many rounds as possible in 1 second, things like the load of the
> > VM host and the temperature of the CPU will also play a role in the
> > result. A sha sum of dmesg would probably also help, because it
> > contains a lot of timings that also depend on the load of the VM
> > host.
> 
> We are already mixing timing information into the entropy pool, and to
> the extent that there is randomness there, it is credited
> appropriately.  The problem is that the Raspberry Pi doesn't have a
> fine-grained clock, and there is a lot less entropy from timing events
> than most people might suppose.
> 
> As I said, though; it's one thing for this to be added to the entropy
> pool.  It's quite another for it to be reflected in the random seed
> file.  Today, if the system was booted a year ago, the random seed
> file will not have been updated for the past 12 months.  The last time
> it would have been updated is shortly after the system was first
> booted.  This is *terrible* if you want to assume that we should give
> full credit to the random-seed file --- because entropy means, "not
> known to the adversary".  The adversary can have access to it,
> including, say, when ethernet interrupts may have caused timing events.
> Because the Raspberry Pi only keeps time to 100Hz granularity, and an
> outside attacker can look at the external timing of packets on the
> network, assuming that the timing of network interrupts is actually
> contributing entropy is.... not clear.

That's definitely a problem. The seed file should probably be re-written 
every few hours.
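
A periodic refresh could look roughly like this sketch in Python. The 
function name and the 512 byte default are assumptions; the path would be 
whatever seed file the system already uses, and the call could be 
triggered from a timer or cron job.

```python
import os
import tempfile


def refresh_seed_file(path, nbytes=512):
    """Atomically replace the seed file with fresh output from the kernel RNG."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates the file with mode 0600, so the seed stays private.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(os.urandom(nbytes))
            f.flush()
            os.fsync(f.fileno())
        # rename is atomic: a crash leaves either the old or the new seed.
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
```

The temporary file is created in the same directory so that the rename 
does not cross a file system boundary.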

>
> I understand that having Raspberry Pis take a long time to boot
> because they don't have entropy is frustrating.  But is silently
> assuming they have entropy, when someone really determined could reverse
> engineer the state of the pool, a preferable alternative?

If the system never boots completely because systemd kills daemons 
with a timeout, the system may be bricked if it does not have a serial 
console. That is definitely much worse than some danger of someone 
stealing the random seed file. We as Debian should ensure that missing 
entropy will not result in the admin not being able to log into the 
system. If this means that protection from attackers who sniff the 
seed file has to wait for buster+1, then so be it.

> If someone is using it to prototype an IOT device (remember: the 'S' in
> IOT stands for security), maybe it's fine, since IOT devices are
> generally wide open to security problems anyway, so what's one more?
> Just don't put them on *my* home network.  :-)
> 
> But is that *really* the best answer for Debian?   My opinion is "no"....
> 
> At least, let's please not make the security for x86 servers and
> desktops worse just to please Raspberry Pi IOT developers....

This really is not only about IOT devices, but also about plenty of old 
x86 servers.

So, how could we go forward from here? Maybe we could limit the wait for 
entropy to some reasonable value (1 minute? 5 minutes?). This could be 
done by creating a program that does a blocking getrandom but with a 
timeout. If the timeout expires and the seed file has been read 
successfully before, it would then credit the entropy read from the seed 
file. This program would be added as a systemd unit so that services that 
need entropy can depend on it and don't get killed with a timeout. Is this 
a reasonable approach? Or do you (or anyone else) have any better 
suggestions?
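
A rough sketch of such a program in Python follows. The function names, the 
polling interval and the amount of entropy credited are assumptions. The 
RNDADDENTROPY value is the one for x86 and ARM and may differ on other 
architectures. Crediting needs CAP_SYS_ADMIN, and the ioctl mixes the seed 
in again while crediting it, which should be harmless.

```python
import fcntl
import os
import struct
import time

# _IOW('R', 0x03, int[2]) from <linux/random.h>. This number is valid on
# x86 and ARM; other architectures may encode ioctls differently.
RNDADDENTROPY = 0x40085203


def wait_for_rng(timeout_s, poll_s=0.5):
    """Return True once the kernel RNG is initialized, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Raises BlockingIOError while the RNG is not yet initialized.
            os.getrandom(1, os.GRND_NONBLOCK)
            return True
        except BlockingIOError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_s)


def pack_pool_info(seed, entropy_bits):
    """Build struct rand_pool_info { int entropy_count; int buf_size; buf[] }."""
    return struct.pack("ii", entropy_bits, len(seed)) + seed


def credit_seed(seed, entropy_bits, device="/dev/random"):
    """Mix the seed into the pool and credit entropy_bits to it."""
    fd = os.open(device, os.O_WRONLY)
    try:
        fcntl.ioctl(fd, RNDADDENTROPY, pack_pool_info(seed, entropy_bits))
    finally:
        os.close(fd)


def ensure_entropy(seed_path, timeout_s=60.0, entropy_bits=256):
    """Wait for the RNG; on timeout, credit the seed file as a fallback."""
    if wait_for_rng(timeout_s):
        return "initialized"
    with open(seed_path, "rb") as f:
        credit_seed(f.read(), entropy_bits)
    return "credited-seed"
```

For simplicity the sketch reads the seed file itself instead of checking 
that an earlier boot step has loaded it. How many bits to credit is a 
policy decision and exactly the point under discussion.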
