
Re: archive rebuilds wrt Lucas' victory



On Tue, Apr 16, 2013 at 10:22:20PM +0200, Adam Borowski wrote:
> On Tue, Apr 16, 2013 at 11:29:43AM +0200, Goswin von Brederlow wrote:
> > On Mon, Apr 15, 2013 at 12:30:43AM +0200, Adam Borowski wrote:
> > > Too bad, I see what seems to be most of the time being spent in dpkg
> > > installing dependencies -- how could this be avoided?  One of the ideas would
> > > be to reformat as btrfs (+eatmydata) and find some kind of a tree of
> > > packages with similar build-depends, snapshotting nodes of the tree to
> > > quickly reset to a wanted state
> > 
> > I think snapshotting is a good idea there. Or rather forking the
> > filesystem. Say you have 2 packages:
> > 
> > Package: A
> > Build-Depends: X, Y
> > 
> > Package: B
> > Build-Depends: X, Z
> > 
> > You would start with the bare build chroot and install X. Then you
> > create snapshots SA and SB from that. In SA you install Y and in SB
> > you install Z. Now both packages can be built.
> 
> So you would include intermediate states as nodes in the graph as well? 
> Interesting -- this could indeed optimize cases like that, at the cost of
> making the problem a good deal harder algorithmically.
> 
> > BUT:
> > 
> > - Easy with 2 packages. But how do you do that with 30000?
> 
> You mean, an algorithmic challenge?  In our kind of crowd?  That's the fun
> stuff!

Anything exponential will not work, and for O(n^c) the c must be
rather small to still find a solution in time. Fun stuff indeed. After
all, if we spend 4 weeks searching for an optimal solution we might as
well just build everything the way we do now and be done sooner.
 
> If we reduce the problem by two simplifications:
> 
> * can snapshot only before building a package (no intermediate states)

I don't think that is a good simplification. The chance that for two
packages like A and B there is a third package C that only
Build-Depends on X is rather small, and without such a C you wouldn't
get a common node for A and B.

> * the cost of purging a package is same as installing it

That somewhat fixes what the first simplification broke, since now B
can be a child of A by purging Y and installing Z. Still wasteful,
though.

It does make the graph a lot smaller.

> a solution is to find a minimal spanning tree, possibly with a
> constraint on the tree's height.  And with the graph not being malicious,
> I have a hunch the tree would behave nicely, not requiring too many
> snapshots (random graphs tend to produce short trees).

Luckily the minimal spanning tree problem is well researched. :)

Note: the graph would be directed and should have weights according to
the (estimated) complexity of installing/purging a package.
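
Just to make the idea concrete, here is a rough sketch with networkx
(node layout and cost numbers are completely made up; strictly speaking,
for a directed graph we want a minimum spanning arborescence rather
than a plain MST):

import networkx as nx

def pkg_cost(pkg):
    # Placeholder; see the cost factors discussed further down.
    return 1.0

def transition_cost(to_install, to_purge):
    # Made-up rule: purging is assumed to cost half of installing.
    return (sum(pkg_cost(p) for p in to_install)
            + 0.5 * sum(pkg_cost(p) for p in to_purge))

G = nx.DiGraph()
base  = frozenset()                 # bare build chroot
env_a = frozenset({"X", "Y"})       # Build-Depends of A
env_b = frozenset({"X", "Z"})       # Build-Depends of B

for env in (env_a, env_b):
    G.add_edge(base, env, weight=transition_cost(env - base, base - env))
# Going directly from one build environment to the other: purge Y, install Z.
G.add_edge(env_a, env_b, weight=transition_cost(env_b - env_a, env_a - env_b))
G.add_edge(env_b, env_a, weight=transition_cost(env_a - env_b, env_b - env_a))

tree = nx.minimum_spanning_arborescence(G)
print(list(tree.edges()))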
 
> The full problem may take a bit more thinking.
> 
> > - Y and Z may both depend on W. So initialy we should have
> >   installed X and W.
> > - Package C may Build-Conflicts: X but depend on most of the stuff X
> >   depends on. So taking the filesystem with X installed and purging X
> >   will be faster than starting from scratch.
> 
> Ie, edges that purge should have a lesser cost than edges that install.
> 
> > - Doing multiple apt/dpkg runs is more expensive than a combined one.
> >   A single run will save startup time and triggers.
> 
> Again, parameters to the edge cost function.
> 
> > - Could we install packages without running triggers and only trigger
> >   them at the end of each chain? Or somewhere in the middle?
> 
> Could be worth looking at.  Not sure how many triggers can work
> incrementally, and how many rebuild everything every time like man-db.
> Of course, this is moot if we snapshot only before package build.

Even if the trigger is incremental we don't lose anything by delaying
it. It will do more work in that one later run, but not more than the
individual runs put together. Non-incremental triggers, on the other
hand, would add up to a lot more.
 
> > - There will be multiple ways to build the tree. We might install U
> >   first and then V or V first and then U. Also we might have to install
> >   V in multiple branches and V cannot be installed in a common root.
> >   Unless we install V in a common root and then uninstall V again for a
> >   subtree. This probably needs a heuristic for how long installing (or
> >   uninstalling) a package takes. Package size will be a major factor
> >   but postinst scripts can take a long time to run (update-texmf anyone?).
> 
> What about something akin to: log(size) + size/X ?  For smallish packages,
> dpkg churn is the dominant factor, for big ones it's actual I/O and
> registering individual files.

That would be something to tune experimentally I guess. Possible
factors I can see would be:

- fixed cost for apt/dpkg startup time (or cost relative to number of
  installed packages/files, i.e. database size)
- number of dpkg runs needed for a set of packages (multiplier for the first)
- package size in bytes and number of files
- cost for triggers
- cost for preinst/postinst/prerm/postrm scripts

The last two would be harder to get. The rest all comes from the
Packages files.
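
To make that concrete, a toy version of such a cost estimate (every
constant here is invented and would have to be tuned against real dpkg
timings) could look like this:

import math

DPKG_RUN_OVERHEAD = 2.0    # seconds per apt/dpkg invocation (guess)
DB_FACTOR = 0.001          # per already-installed package, i.e. database size
BYTES_PER_SEC = 20e6       # raw unpack throughput (guess)
PER_FILE_COST = 0.002      # registering one file in the dpkg database

def install_cost(pkg, installed_count, dpkg_runs=1):
    """Estimated seconds to install pkg, a dict built from the Packages file."""
    startup = dpkg_runs * (DPKG_RUN_OVERHEAD + DB_FACTOR * installed_count)
    unpack  = pkg["size"] / BYTES_PER_SEC + PER_FILE_COST * pkg["nfiles"]
    # Trigger and maintainer-script costs are the hard part; fall back to a
    # crude guess when there is no measurement for this package.
    scripts = pkg.get("script_cost", math.log1p(pkg["size"]) / 10)
    return startup + unpack + scripts

def purge_cost(pkg, installed_count):
    # Purging is assumed to be cheaper than installing; pure guesswork.
    return 0.5 * install_cost(pkg, installed_count)

print(install_cost({"size": 5_000_000, "nfiles": 300}, installed_count=800))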
 
> > - Build chroots, even as snapshots, take space. You can only have so
> >   many of them in parallel. A depth first traversal would be best there.
> 
> I guess the tree won't be tall enough for disk space to be a concern.

No, the height won't be a problem. Ignoring package purges even a
height of 100 should be trivial. Each node would only add packages and
a new dpkg DB. So the space needed would be roughly |size of leaf| +
100 * |size of dpkg DB|.

But a snapshot should only be deleted once all its children have been
built. If you build packages in dependency order (bootstrapping) then
you would have to keep a lot of snapshots around until you eventually
reach a leaf package and can delete its snapshot again. And those
snapshots will be hugely different.
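
For the depth-first variant, roughly something like this (btrfs is just
one option; the chroot path layout, the Node structure and the
install_packages/build_package helpers are all made up for the sketch,
and error handling is left out):

import subprocess
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    extra_packages: list           # installed on top of the parent state
    source_package: str = ""       # package to build in this state, if any
    children: list = field(default_factory=list)

def snapshot(src, dst):
    # CoW snapshot, essentially free on btrfs.
    subprocess.run(["btrfs", "subvolume", "snapshot", src, dst], check=True)

def delete(subvol):
    subprocess.run(["btrfs", "subvolume", "delete", subvol], check=True)

def install_packages(subvol, packages):
    pass   # placeholder: apt-get install inside the chroot

def build_package(subvol, source_package):
    pass   # placeholder: sbuild/dpkg-buildpackage inside the chroot

def build_tree(node, parent_subvol):
    """Depth first: snapshot, install, build, recurse, then drop the
    snapshot once all children have been built."""
    subvol = "/srv/chroots/" + node.name
    snapshot(parent_subvol, subvol)
    install_packages(subvol, node.extra_packages)
    if node.source_package:
        build_package(subvol, node.source_package)
    for child in node.children:
        build_tree(child, subvol)
    delete(subvol)                 # safe: all children are done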
 
> >   Building packages against locally build packages (instead of the
> >   existing official ones) gives a better test. But that would require a
> >   more width first ordering. Some compromise between the two would be
> >   needed.
> 
> Bootstrapping is a whole new can of worms, and I'm afraid I haven't followed
> Wookey and co closely enough.

Yeah, I wouldn't enforce that. Building against existing packages is
fine for an archive-wide rebuild. But if a package has already been
built locally, using it would also test that the package actually
works. So I would prefer a build order where more local packages can be
used, but that should be balanced against the number of snapshots (and
their size) required to build everything.
 
> > With multiple cores it is better run multiple builds in parallel,
> > given enough ram, than trying to build a single package on multiple
> > cores. Most packages don't support parallel building (even if they
> > could) and some break if you force the issue.
> 
> And others spend a good deal of the time on a single core, even if they
> allow concurrency in a part.  Watching libreoffice, it was at 25%
> utilization more time than at 100%.
> 
> "Given enough ram" is another problem: a vast majority of packages need
> quite little, yet some can be pretty hungry.  I don't think we have
> per-package data about memory needed, do we?  In my case, libreoffice
> -j4 OOMed near the end.

Some builds have an exclude list for HUGE packages because they are
known to swap to death (or take considerably longer than others). So I
guess we could come up with a list of special packages that must not
be built in parallel with others.
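
A very rough way to schedule that (the package names and core count are
invented, and run_build is just a stand-in for the real sbuild call):

from concurrent.futures import ThreadPoolExecutor

NCORES = 4
HUGE = {"libreoffice", "gcc-13", "linux"}   # invented exclude list

def run_build(pkg):
    print("building", pkg)   # placeholder for the actual chroot/sbuild run

def build_all(packages):
    normal = [p for p in packages if p not in HUGE]
    huge   = [p for p in packages if p in HUGE]
    # Normal packages share the machine...
    with ThreadPoolExecutor(max_workers=NCORES) as pool:
        list(pool.map(run_build, normal))
    # ...huge ones get the whole box to themselves, one after another.
    for pkg in huge:
        run_build(pkg)

build_all(["coreutils", "libreoffice", "vim", "mutt"])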
 
> The box has only 2GB, non-extensible.  Someone had a bright idea of shipping
> a kernel with CONFIG_SWAP unset; the source for all needed out-of-tree
> drivers is there, but I need to figure out why that uboot thingy doesn't
> work, without being able to view the screen.

But think of a nice big ARM box with 192 cores and 192 GB of RAM. Now
that would be something to do archive-wide rebuilds on.
 
> > Now if you have multiple builds that are based around the same snapshot
> > then common header files and libraries will be cached only once.  So cache
> > locality (and therefore efficiency) should increase.
> 
> Bad news: both btrfs and lvm (and I'm 99% sure zfs) fail to tell the VM
> system that a page is shared on the disk and thus can be shared in the
> memory as well.  This means, snapshots and even cp --reflink won't utilize
> the similarity.  This limitation seems quite fundamental on lvm; in btrfs
> thanks to rampant layering violations it's fixable but hasn't been fixed
> yet.  Somewhere among vserver patches there's IUNLINK -- a file attribute
> that duplicates hardlinked files when you try to open them for writing,
> that's the only cow I know that lets the page cache work.

Ugh? I would think that the page would be in the block cache,
addressed by device and block offset. In the case of lvm I can see how
you would get the same block twice since you have different devices.
But why btrfs?
 
> One could run builds with exactly the same Build-Depends together, but as
> they would need to run on the same mount, this doesn't seem to be worth the
> hassle:
> grep ^Build-Depends [???]debian_dists_unstable_main_source_Sources|sort|uniq|wc -l
> 15564
> grep ^Package: [???]debian_dists_unstable_main_source_Sources|sort|uniq|wc -l
> 18555

Looks like 1/6th of all packages. I wouldn't call that ignorable.

MfG
	Goswin

