
speeding up installs



Hi!
Package install times are quite nasty.  For example, a full GUI install can
take[1] 1.5 hours on spinning rust, and several minutes even on a RAID 0 of
Optane disks -- you can't get much faster without operating entirely in
memory[2].

There are two massive recent improvements:
* eatmydata helps a lot, and an install eaten by a power loss is no real
  loss -- you just start over
* mmdebstrap can speed up the debootstrap stage by a factor of 3-6
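
For concreteness, a rough sketch of combining the two -- wrapping mmdebstrap
in eatmydata is my own wiring, and the suite/target below are placeholders,
not anything d-i does today:

  # A minimal sketch, not d-i integration: build an apt-capable buster chroot
  # with mmdebstrap, with fsync() neutered by eatmydata for the whole run.
  import subprocess

  TARGET = "/target"           # hypothetical destination

  subprocess.run(
      ["eatmydata",            # LD_PRELOAD shim turning fsync() into a no-op
       "mmdebstrap",
       "--variant=apt",        # minimal apt-capable chroot; tasks come later
       "buster", TARGET],
      check=True)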

But d-i doesn't use either, the debootstrap stage is small compared to
installing the actual tasks/build-deps afterwards, and there are at least a
couple of orders of magnitude of speedup still possible.

But you'd say: I have a fast NVMe disk, a slow network, and don't install
more often than once every few years.  Then yeah, you don't need a d-i
speedup.  I care about two use cases:
* boxes with HDDs or SD cards
* datacenter VMs, buildds

It seems that the hard minimum is around 1 second on modern hardware.  You
can't unpack the full set of .xz debs faster than 0.75s even assuming full
parallelism -- which is unattainable anyway, as firefox.deb alone takes 3s
(with the other CPU cores loaded).  We can cheat here by splitting or
repacking such big .debs with zstd.  But that's not all -- the .debs have to
get fetched from the network first (750MB in a second calls for a 10Gb link
at minimum).  And then everything still has to be configured.
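
A sketch of that repacking cheat -- assuming a dpkg-deb new enough to know
-Zzstd on the machine doing the repacking, and an unpacker on the other end
that accepts the result (stock buster's dpkg would not):

  # Hedged sketch: repack one big .deb from xz to zstd so it decompresses
  # faster.  The filenames are just the firefox example from above.
  import subprocess, tempfile

  def repack_zstd(deb_in, deb_out):
      with tempfile.TemporaryDirectory() as tree:
          # --raw-extract unpacks the file tree plus DEBIAN/ control files
          subprocess.run(["dpkg-deb", "--raw-extract", deb_in, tree],
                         check=True)
          # rebuild with the much cheaper-to-decompress zstd
          subprocess.run(["dpkg-deb", "-Zzstd", "--build", tree, deb_out],
                         check=True)

  repack_zstd("firefox.deb", "firefox_zstd.deb")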

Writeout is not a problem: the full test install is 2.5GB unpacked, and you
don't need to persist that immediately.  A desktop install can write out
while the user answers installer questions, a buildd doesn't care about
durability, and a VM can do useful stuff while the disk is writing.

You don't need that big a CPU: while I benchmarked the preliminary code on a
-j64 2990WX, the actual utilization was less than 1/4 despite the task being
highly parallel.  This CPU chokes on its limited memory bandwidth (64MB of
L3 cache hardly counts for big xz decompression); you want something with
fewer, faster cores.

And it's the slowest machines, like 4-core boards, that actually benefit the
most in absolute terms.  No, there's no such thing as a 1-way machine that
can install a modern distro anymore[3]: the oldest machine I own, a non-NX
Pentium 4, is already -j2; and when, 3 years ago, I needed the cheapest
possible box with USB, local storage and ethernet, it came with 4 cores and
512MB RAM.  Non-SMP is dead and buried; forget about ever optimizing for it.

Test set: buster's task-xfce-desktop.  That's 750MB of .debs, 2.5GB result.


So let's see if we can approach this theoretical limit.

So far I've come up with the following:

* let's not care about power loss during install.  So no fsyncs, and no
  writing a single byte that's going to be overwritten later.  Do a global
  sync() only when entering grub-install.
* almost all Pre-Depends and preinsts care only about upgrades; on a
  clean-slate install you can ignore them and at most fix things up later
* dpkg-divert can be a problem, but going yolo seems to work for me so far
  (not sure if all cases can be fixed up after the fact -- dash can)
* being able to unpack in parallel also means you don't need to care about
  order: unpacking can start before apt has finished downloading.  This is
  awesome when your mirror has a slower link than that 10Gb...  We can
  install package X the moment apt has fetched it, even though it's still
  downloading packages Y and Z.
  (Nb: what's a good way to know apt is done?  I screen-scrape
  -oDebug::pkgAcquire looking for "Dequeuing", which is a nasty hack -- see
  the sketch right after this list.)
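
Here's roughly what that hack looks like, as a sketch only: the debug line
format is not a stable interface, "unpack" below is nothing but dpkg-deb -x
into the target (no control files, maintainer scripts or dpkg database), and
the package list is assumed to be the pre-resolved closure, since apt-get
download doesn't recurse:

  import os, subprocess
  from concurrent.futures import ThreadPoolExecutor

  TARGET = "/target"                      # hypothetical chroot
  DEB_DIR = "/var/cache/install-debs"     # hypothetical download dir
  PACKAGES = ["task-xfce-desktop"]        # should be the full closure

  def unpack(deb):
      # filesystem tree only; configure is handled later
      subprocess.run(["dpkg-deb", "-x", deb, TARGET], check=True)

  pool = ThreadPoolExecutor(max_workers=os.cpu_count())
  apt = subprocess.Popen(
      ["apt-get", "download", "-oDebug::pkgAcquire=1", *PACKAGES],
      cwd=DEB_DIR, stderr=subprocess.PIPE, text=True)

  seen = set()
  for line in apt.stderr:                 # apt's debug chatter
      if "Dequeuing" not in line:
          continue
      # last token is the URI; apt names the file after its basename
      # (modulo URL-encoding, e.g. the epoch colon) -- best effort only
      name = os.path.basename(line.split()[-1])
      path = os.path.join(DEB_DIR, name)
      if name.endswith(".deb") and path not in seen and os.path.exists(path):
          seen.add(path)
          pool.submit(unpack, path)

  apt.wait()
  pool.shutdown(wait=True)                # this is the "apt is done" point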

The above is all nice and dandy, but I don't know how to do the configure
step right.  It seems that at least some triggers can be parallelized.
man-db is by a large margin the biggest offender -- it seems to have no
dependencies, so it's great low-hanging fruit.  Somehow it worked for me
even before ldconfig -- that's probably insane though, so ldconfig should go
first.  Both ldconfig and man-db are ordered after all unpacks of unrelated
packages have finished -- is it possible to do them piecewise?

I've hardly looked at the other postinsts yet; I wonder how they can be
elided or fast-tracked.

My dependency graph so far:

apt-update
 |
 +>------------------
 |                   \
apt -s install    apt-cache dumpavail
 |                   /
 +<------------------
 |
stat(if .debs are here)
 |\
 | +----------+----...
 | |          |
 | unpack 1   unpack 2 (.debs that were already on disk)
 | \----------+
 |             \---------\
apt download             |
 Finished 3 -> unpack 3 -+
 Finished 4 -> unpack 4 -+
 Finished 5 -> unpack 5 -+
                         |
                   (unpack complete)
                         |
                     ldconfig
                    /      \
               man-db     write dpkg's status
                  |         |
                  |       dpkg --configure -a (fully serial...)
                  |        /
                  +-------/
                  |
                Done!
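
A sketch of the tail of that graph; TARGET, the chroot calls, and keeping
the pending man-db trigger out of the configure pass are all hand-waved
assumptions here, and the grub device is a placeholder:

  import os, subprocess
  from concurrent.futures import ThreadPoolExecutor

  TARGET = "/target"                      # hypothetical chroot

  def in_target(*cmd):
      subprocess.run(["chroot", TARGET, *cmd], check=True)

  in_target("ldconfig")                   # first, as argued above

  with ThreadPoolExecutor() as pool:
      # nothing depends on man-db, so let it overlap with the (fully
      # serial) configure pass; dpkg's status is assumed written already
      mandb = pool.submit(in_target, "mandb", "-q")
      conf = pool.submit(in_target, "dpkg", "--configure", "-a")
      mandb.result()
      conf.result()

  os.sync()                               # the single global sync()
  in_target("grub-install", "/dev/sda")   # placeholder device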

One other issue is that the whole plan needs to be known before starting: no
running tasksel in-target, no asking whether you want popcon, etc.  But
that'd actually fix another of my gripes about d-i.
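
Getting the plan is the "apt -s install" node above; a sketch of turning its
simulated output into a package list (that output is not a stable API, so
the parsing is best-effort):

  import subprocess

  def planned_packages(*wanted):
      # "Inst <pkg> (<version> ...)" lines from the simulated install
      out = subprocess.run(
          ["apt-get", "-s", "install", "-y", *wanted],
          capture_output=True, text=True, check=True).stdout
      return [line.split()[1]
              for line in out.splitlines()
              if line.startswith("Inst ")]

  print(planned_packages("task-xfce-desktop"))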


So... any comments so far?  Any hints how to cheat the configure step?


Meow!

[1]. Both times on btrfs, which interacts especially badly with the fsync
     spam dpkg does.
[2]. There's a hidden meaning here.
[3]. Counting only stuff you can buy new; heavily embedded systems don't run
     Debian but specially crafted distros.
-- 
⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢠⠒⠀⣿⡁ Ivan was a worldly man: born in St. Petersburg, raised in
⢿⡄⠘⠷⠚⠋⠀ Petrograd, lived most of his life in Leningrad, then returned
⠈⠳⣄⠀⠀⠀⠀ to the city of his birth to die.

