
Re: speeding up installs



On Fri, Jun 07, 2019 at 07:29:49PM +0200, Adam Borowski wrote:

> I care about two use cases:
> * boxes with HDDs or SD cards
> * datacenter VMs, buildds
[...]
>  No, there's no such thing as a 1-way machine that can
> install a modern distro anymore[3]: oldest machine I own, a non-NX Pentium4,
> is already -j2; when 3 years ago I needed the cheapest possible box with
> • USB, • local storage, • ethernet; it had 4 cores and 512MB RAM.   Non-SMP
> is dead and buried, forget about ever optimizing for that.

Non-SMP is very much alive when it comes to VM guests. So if you claim to
care about that use case, please optimize for it as well.

> * let's not care about power loss during install.  So no fsyncs, and no
>   writing a single byte that's going to be overwritten later.  Do a global
>   sync() only when entering grub-install.

With KVM installs, I usually configure the guest to use unsafe I/O, which
has basically the same effect as eatmydata. Once the installation has
succeeded, I switch the I/O mode back to something reliable. This
does make a huge difference in install speed.

> * being able to unpack in parallel also means you don't need to care about
>   order: install can go before apt-download has finished.  This is awesome
>   when your mirror has a slower link than that 10Gb...  We can install
>   package X the moment apt has fetched it even though it's still downloading
>   packages Y and Z.
>   (Nb: what's a good way to know apt is done?  I screen-scrape
>   -oDebug::pkgAcquire looking for "Dequeuing" which is a nasty hack.)

We already know the packages' dependencies before downloading them,
so we can order the downloads such that the packages with the fewest
dependencies are fetched first, and so on. That allows installation to
start safely while other packages are still downloading.
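
As a rough sketch of that ordering: one reading of "fewest dependencies
first" is a topological sort of the dependency graph. The package names
and the depends table below are made up for illustration, and the sketch
relies on graphlib from the Python standard library (3.9 or later). A real
archive contains dependency cycles, which would make TopologicalSorter
raise CycleError, so those would need to be broken up first.

```python
from graphlib import TopologicalSorter


def download_order(depends):
    """Return package names so that each one follows its dependencies.

    `depends` maps a package name to the set of packages it depends on.
    """
    return list(TopologicalSorter(depends).static_order())


# Hypothetical dependency data, not taken from a real archive.
depends = {
    "libc6": set(),
    "libssl": {"libc6"},
    "curl": {"libssl", "libc6"},
    "man-db": {"libc6"},
}

order = download_order(depends)
print(order)
```

Fetching in this order means that by the time a package arrives,
everything it depends on has already been fetched.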

> The above is all nice and dandy, but I don't know how to do configure right. 
> It seems that at least some triggers can be parallelized.  man-db is by a
> large margin the biggest offender -- seems it has no dependencies so it's a
> great low-hanging fruit.  Somehow it worked for me even before ldconfig --
> that's probably insane though, so ldconfig should go first.  Both of
> ldconfig and man-db are ordered after all unpacks of unrelated packages have
> finished -- is it possible to do them piecewise?

It might be interesting to create a bootgraph-like chart of the
installation process, to identify the actual bottlenecks and the
opportunities for parallelization. Maybe we already have such a tool?
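
To make the idea concrete, here is a minimal sketch of such a chart as
plain text. It assumes we somehow have start and end timestamps for each
installation step; the step labels and timings below are invented, and
render_chart is a hypothetical helper, not an existing tool.

```python
def render_chart(events, width=40):
    """Render (label, start, end) tuples as one text bar per step.

    Times are in seconds; bars are scaled so the whole run spans
    `width` characters.
    """
    if not events:
        return []
    t0 = min(start for _, start, _ in events)
    t1 = max(end for _, _, end in events)
    scale = width / (t1 - t0) if t1 > t0 else 0
    label_width = max(len(label) for label, _, _ in events)
    lines = []
    for label, start, end in sorted(events, key=lambda e: e[1]):
        offset = int((start - t0) * scale)
        length = max(1, int((end - start) * scale))
        lines.append(f"{label:<{label_width}} |{' ' * offset}{'#' * length}")
    return lines


# Invented timings, for illustration only.
events = [
    ("unpack libc6", 0, 2),
    ("trigger man-db", 2, 10),
]

chart = render_chart(events, width=10)
print("\n".join(chart))
```

Long bars show where the time goes, and bars that never overlap show
steps that currently run strictly one after another.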

> So... any comments so far?  Any hints how to cheat the configure step?

If two packages don't (reverse-)depend on each other in some way, how
safe is it to configure them in parallel?
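
Finding such pairs is the easy part. A sketch, again with made-up package
data: two packages are candidates if neither can be reached from the other
by following dependencies. This only checks the dependency graph; it says
nothing about maintainer scripts that touch shared state, which is exactly
the part I am unsure about.

```python
def reachable(depends, start):
    """Return every package reachable from `start` via dependencies."""
    seen = set()
    stack = [start]
    while stack:
        pkg = stack.pop()
        for dep in depends.get(pkg, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def unrelated(depends, a, b):
    """True if neither package (transitively) depends on the other."""
    return (
        a != b
        and b not in reachable(depends, a)
        and a not in reachable(depends, b)
    )


# Hypothetical dependency data, not taken from a real archive.
depends = {
    "libc6": set(),
    "libssl": {"libc6"},
    "curl": {"libssl", "libc6"},
    "man-db": {"libc6"},
}

print(unrelated(depends, "curl", "man-db"))  # True
print(unrelated(depends, "curl", "libc6"))   # False
```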

-- 
Met vriendelijke groet / with kind regards,
      Guus Sliepen <guus@debian.org>
