
Re: archive rebuilds wrt Lucas' victory



On Mon, Apr 15, 2013 at 12:30:43AM +0200, Adam Borowski wrote:
> Too bad, I see what seems to be most of the time being spent in dpkg
> installing dependencies -- how could this be avoided?  One idea would
> be to reformat as btrfs (+eatmydata) and find some kind of a tree of
> packages with similar build-depends, snapshotting nodes of the tree to
> quickly reset to a wanted state -- but I guess you guys have some kind
> of a solution already.

I think snapshotting is a good idea there, or rather forking the
filesystem. Say you have two packages:

Package: A
Build-Depends: X, Y

Package: B
Build-Depends: X, Z

You would start with the bare build chroot and install X. Then you
create snapshots SA and SB from that. In SA you install Y and in SB
you install Z. Now both packages can be built.
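
For illustration, a minimal sketch of that fork with btrfs subvolumes
(the paths, package names and the choice of btrfs are assumptions on
my part, not an existing tool):

#!/usr/bin/env python3
# Minimal sketch: fork a btrfs-backed build chroot at the shared
# dependency X, then diverge into per-package snapshots.
# All paths and package names are made up.
import subprocess

BASE = "/srv/chroots/base"   # bare build chroot, a btrfs subvolume

def run(*cmd):
    print("+", " ".join(cmd))
    subprocess.check_call(cmd)

def install(chroot, *pkgs):
    # One combined apt run per node keeps dpkg startup and trigger
    # overhead down.
    run("chroot", chroot, "apt-get", "install", "-y",
        "--no-install-recommends", *pkgs)

def snapshot(src, dst):
    run("btrfs", "subvolume", "snapshot", src, dst)

# Shared prefix: install X once in the base chroot.
install(BASE, "x")

# Fork the filesystem and diverge.
snapshot(BASE, "/srv/chroots/build-a")
snapshot(BASE, "/srv/chroots/build-b")
install("/srv/chroots/build-a", "y")   # A's remaining Build-Depends
install("/srv/chroots/build-b", "z")   # B's remaining Build-Depends
# Both A and B can now be built in their own snapshots.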

BUT:

- Easy with 2 packages. But how do you do that with 30000?
- Y and Z may both depend on W. So initially we should have
  installed X and W.
- Package C may have Build-Conflicts: X but depend on most of the stuff
  X depends on. So taking the filesystem with X installed and purging X
  will be faster than starting from scratch.
- Doing multiple apt/dpkg runs is more expensive than a combined one.
  A single run will save startup time and triggers.
- Could we install packages without running triggers and only trigger
  them at the end of each chain? Or somewhere in the middle?
- There will be multiple ways to build the tree. We might install U
  first and then V, or V first and then U. Also we might have to install
  V in multiple branches because V cannot be installed in a common root,
  unless we install V in a common root and then uninstall it again for a
  subtree. This probably needs a heuristic for how long installing (or
  uninstalling) a package takes (a rough sketch of such a heuristic
  follows after this list). Package size will be a major factor, but
  postinst scripts can take a long time to run (update-texmf anyone?).
- Build chroots, even as snapshots, take space. You can only have so
  many of them in parallel; a depth-first traversal would be best there.
  Building packages against locally built packages (instead of the
  existing official ones) gives a better test, but that would require a
  more breadth-first ordering. Some compromise between the two would be
  needed.
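
As a very rough first cut, the cost heuristic mentioned in the
ordering point above could be little more than installed size plus a
penalty for packages known to have slow maintainer scripts. The rate
and the numbers below are invented placeholders, not measurements:

# Sketch of a per-package install-cost heuristic.  The unpack rate and
# the postinst penalties are invented; real values would have to come
# from timing actual dpkg runs.
SLOW_POSTINST = {"tex-common": 30.0, "texlive-base": 60.0, "man-db": 10.0}

def install_cost(pkg, installed_size_kb):
    # Unpack time scales roughly with installed size ...
    unpack = installed_size_kb / 50000.0       # assume ~50 MB/s
    # ... plus a flat penalty if the postinst is known to be slow.
    return unpack + SLOW_POSTINST.get(pkg, 0.5)

def node_cost(packages, sizes):
    # Estimated cost of installing a set of packages in one apt run.
    return sum(install_cost(p, sizes.get(p, 1000)) for p in packages)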

Note:

With multiple cores it is better to run multiple builds in parallel,
given enough RAM, than to try to build a single package on multiple
cores. Most packages don't support parallel building (even if they
could) and some break if you force the issue. Now if you have multiple
builds that are based on the same snapshot, then common header files
and libraries will be cached only once, so cache locality (and
therefore efficiency) should increase.
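
Running the builds of one subtree side by side, each single-threaded
in its own forked chroot, could then look roughly like this (chroot
paths and source names are again made up):

# Sketch: run several single-threaded builds in parallel, one per
# forked chroot, instead of one package with parallel make.
import subprocess
from concurrent.futures import ThreadPoolExecutor

JOBS = [("foo-1.0", "/srv/chroots/build-foo"),
        ("bar-2.3", "/srv/chroots/build-bar"),
        ("baz-0.9", "/srv/chroots/build-baz")]

def build(job):
    pkg, chroot = job
    # The snapshots share all unmodified files with their parent, so
    # common headers and libraries sit in the page cache only once.
    rc = subprocess.call(["chroot", chroot, "sh", "-c",
                          "cd /build/%s && dpkg-buildpackage -us -uc" % pkg])
    return pkg, rc

with ThreadPoolExecutor(max_workers=len(JOBS)) as pool:
    for pkg, rc in pool.map(build, JOBS):
        print(pkg, "ok" if rc == 0 else "failed (%d)" % rc)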


So I'm looking forward to someone taking up this idea and implementing
an algorithm that will sort sources into a tree structure for
installing and snapshotting / forking the filesystem at each node,
optimized to reduce the number of dpkg runs, the number of snapshots
needed, and how often the same package has to be installed in multiple
branches.
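
To make the shape of the problem a bit more concrete, here is a toy
sketch of one possible (and certainly not optimal) way to grow such a
tree: greedily pick the build dependency shared by the most remaining
sources, give it a node, and recurse. It only illustrates the
structure; the cost heuristic, Build-Conflicts and the snapshot limit
from above are all ignored:

# Toy sketch: sort sources into a tree by shared Build-Depends.
# Each node installs one package on top of everything installed along
# the path from the root; a source is built at the first node where
# all its Build-Depends are satisfied.  Purely illustrative.
from collections import Counter

def build_tree(sources, installed=frozenset()):
    # sources: {source_package: set_of_build_deps}
    buildable = [s for s, deps in sources.items() if deps <= installed]
    pending = {s: deps - installed
               for s, deps in sources.items() if not deps <= installed}
    node = {"install": None, "build": buildable, "children": []}
    if not pending:
        return node

    # Greedy choice: the dependency needed by the most pending sources
    # gets its own child node.
    counts = Counter(d for deps in pending.values() for d in deps)
    dep = counts.most_common(1)[0][0]

    with_dep = {s: d for s, d in pending.items() if dep in d}
    without = {s: d for s, d in pending.items() if dep not in d}

    child = build_tree(with_dep, installed | {dep})
    child["install"] = dep
    node["children"].append(child)

    if without:
        # Sources not needing dep become sibling subtrees of this node.
        rest = build_tree(without, installed)
        node["build"].extend(rest["build"])
        node["children"].extend(rest["children"])
    return node

if __name__ == "__main__":
    demo = {"a": {"x", "y"}, "b": {"x", "z"}, "c": {"w"}}
    print(build_tree(demo))

Weighting the greedy choice by the cost heuristic above instead of
the raw count would be an obvious next refinement.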

Regards,
	Goswin

