Comments on live-build, vmdebootstrap, bootstrap-vz, and live-wrapper



Please note the follow-up to the cloud list.

Hi.

I recently finished putting together a custom image of what amounts to
stretch for work.  Today, this is mostly not a cloud image, although
that's expected to change for our future customer deployments.

For my previous job I did maintain a cloud image using live-build
through the wheezy and jessie cycles.

In this discussion I'm focusing on the needs of custom images.  We've
all agreed that a robust and flexible approach for custom images is an
important goal of our cloud (and presumably live) efforts.
My personal belief is that we should use the same tools for official
images as we recommend for custom images.  Back in the days when I cared
about sets of CDs I always found it frustrating that the debian-cd
packages could make official images, but you were better off with other
approaches for custom cd sets.  There would be unfortunate and undesired
differences introduced because you weren't using debian-cd, but it was
never worth the pain of hacking debian-cd for custom sets.

As a user, I'd like to contrast this significantly with live images,
where we have (so far) used mostly the same tooling for official and
unofficial images.
It feels more polished and the approach is better.

I'd like to get a couple of things out of the way first.

                              live-wrapper

I think live-wrapper is too immature to evaluate.  I tried to evaluate
it as a replacement for live-build, but quickly concluded that it's too
early in the live-wrapper development cycle for me to be able to look
at.  The documentation of live-wrapper 0.3, which I looked at, was very
sparse.  It requires a lot of the directory structure where you run it
without clearly documenting that.  There's some sort of customization
script that is needed, but it's unclear how that works.

Live-wrapper is missing functionality that was critical to me in my use
of live-build.  It does not support live-installer.  It does not support
including d-i and a live system on the same CD.  It didn't appear to
support easy customization of the bootloader configuration.


I think it's great that people are looking at something built on top of
vmdebootstrap.  I hope that effort continues, and when it's more mature
I'd be happy to take another look.

I'd advise taking a careful look at my comments on vmdebootstrap, as I
think several of them may have implications for live-wrapper.

                              bootstrap-vz

Bootstrap-vz is a collection of python for building cloud and VM
images.  I read all the documentation, examined some chunks of the code,
but did not end up using the product to build an image.

Up front I'll say I was disappointed that the code is python2.7 not
python3.  It actually does matter.  As I build a wider collection of
idiomatic python3 within my own domain, it would be valuable to be able
to use that while customizing the images I develop.  Making it
impossible for me to use my code and the bootstrap-vz libraries and
plugins in the same process is architecturally disappointing.

The strength of bootstrap-vz is that it has a plugin architecture, and
supports tweaking aspects of the image for the environment.  There's an
assumption that things will be different between ec2, gce, virtualbox,
kvm, etc.  You have to pick one of these at a fairly basic level, and it
significantly affects the code path that will be used.

There are a bunch of plugins that accomplish things.  The code supports
images with and without partition tables.  It supports images backed by
EBS volumes, raw devices, and even things like vdi for VirtualBox.

Different virtualization layers can plug in plugins.  For example the
kvm layer plugs in support for virtio.  This is both a blessing and a
curse.  Virtio is a great example: VirtualBox supports it too, and it's
unclear why that plugin would be introduced at that level in the code.

Bootstrap-vz failed to meet my needs in two important ways.  First, I
found it too difficult to audit how an image I would create with
bootstrap-vz would differ from what I'd get debootstrapping something by
hand.  That is, I found that I could not trust the changes bootstrap-vz
might introduce for me.  I'd end up needing to read most of the code to
evaluate this, because it seems like a major goal of the design is to
make those sorts of tweaks easy.

Secondly, with some irony, I realized that it would be annoying for me
to add my own customizations to the image.  Bootstrap-vz makes it easy
for bootstrap-vz developers to add new tweaks, but makes it challenging
for custom image builders who don't want to hack on the bootstrap-vz
sources.  I really want something like the vmdebootstrap customization
script where I can run a chunk of my shell or Python code at appropriate
points in the process.  Moreover, even if I did want to write
bootstrap-vz-style tasks and plugins, it looks like getting them
integrated might be a bit tricky.  Bootstrap-vz uses a lot of plugins,
but the plugin consumers seem to know way too much about the plugins
they consume.  Take a look at bootstrapvz/providers/kvm/__init__.py to
see what I'm talking about.  That file knows about the kvm-specific
tasks; it integrates them into the flow, knows about their
configuration, etc.  Instead I'd hoped for an architecture where the
consumer might sometimes be responsible for loading the plugin, but the
integration of the plugin into the process would be accomplished by the
plugin registering itself.  That is, in the mentioned __init__.py file,
the import statements seem fine, but the parent module knowing so much
about the child module seems undesirable in a plugin design.
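
As a rough illustration of the self-registering design described above,
here is a minimal sketch.  Every name in it (TaskRegistry, the
"system_modification" phase, the dict standing in for an image) is
hypothetical and is not taken from bootstrap-vz; it only shows the
provider running a phase without knowing which tasks a plugin added.

```python
# Hypothetical sketch: none of these names exist in bootstrap-vz.
from collections import defaultdict


class TaskRegistry:
    """Maps a build phase name to the tasks registered for it."""

    def __init__(self):
        self._tasks = defaultdict(list)

    def register(self, phase):
        """Decorator a plugin uses to add one of its own tasks to a phase."""
        def decorator(func):
            self._tasks[phase].append(func)
            return func
        return decorator

    def run(self, phase, image):
        """Run every task registered for the phase, in registration order."""
        for task in self._tasks[phase]:
            task(image)
        return image


registry = TaskRegistry()


# --- what would live in a plugin module (for example a virtio plugin) ---
@registry.register("system_modification")
def add_virtio_modules(image):
    # The image is modelled as a plain dict purely for illustration.
    image.setdefault("initramfs_modules", []).extend(["virtio_pci", "virtio_blk"])


# --- what would live in the provider: it runs the phase, nothing more ---
def build(image):
    return registry.run("system_modification", image)
```

With this shape the provider only has to import the plugin module; the
import itself performs the registration, so the parent never names the
child's tasks or configuration.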

While I did not use Bootstrap-vz, I want to call out a bunch of
excellent properties.  It does have a plugin architecture.  It
demonstrates thoughtful enough design that you can integrate new
programmatic bits into a structure, sharing code, but using a somewhat
declarative approach rather than a purely procedural approach.  That is,
the objects and plugins seem used to relatively good effect.  While I
complained about lack of extensibility for out-of-tree users,
extensibility for in-tree developers seems to be a huge strength.

Also, the flexibility is a huge strength.  To the extent we need to
tweak things based on environment, provider and the like, bootstrap-vz
is good.

Finally, in terms of the resulting image, the manifest is purely
declarative.  I chafe at that building custom images, but for official
images that's probably a huge strength.

                               Live Build

Live Build is the most comprehensive image building tool that I've
used.  It has very good documentation compared to the other tools.  It
supports a rich declarative format: I can add additional archives, keys,
packages and the like.  It also supports hooks so I can make
customizations as needed.

It has caching infrastructure.  I found that performance with live-build
was much better than vmdebootstrap.

Live Build has critical functionality that is missing from other
solutions.  I talked about the aspects of that functionality that relate
to live systems under live-wrapper, the only other tool I looked at that
tries to handle live systems directly.  I'll talk about the other
missing functionality under vmdebootstrap.

Unfortunately, Live Build has some significant disadvantages.  It's
written in shell.  It is a triumph of modular shell programming.  I
found that given time, I was able to understand and debug it: given how
complex the shell scripts are, that says something very positive.  I
found that even when some of what it wanted to do wasn't right for my
needs, code reuse was possible.  Between the phase commands, function
libraries and similar, lots of thought was put into modularity.

However, triumph of modular shell programming though it is, it IS ALL
SHELL.  There's only so modular a shell script can be.  There weren't
really objects—and while yes, I know that you can do object-oriented
shell, it doesn't make things better.

Live Build doesn't do a good job at things that aren't live systems.
You can do some things by selecting plain as the root filesystem, and
hdd as the image type, but I seem to recall that doesn't end up
working.  I ended up using only part of the typical chain (lb bootstrap
&& lb chroot rather than lb build) and then writing my own short script
to make the image.

Live Build was starting to reach the maximum complexity of its design.

Live Build had a lot of case statements scattered throughout the code.
If the image type is iso, then we'll go through all the boot loader
options.  If the image type isn't iso, we'll do this other thing.  This
meant that adding a new option involved going and making sure you'd
added all the case statements to deal with all the possibilities.

Live Build didn't have enough of a test suite.  It wasn't really
reliable.  Upgrading was always filled with fear and uncertainty.

Live Build is not currently maintained.

That said, if I had an existing Live Build project, I would not migrate
away from Live Build today.  I'd expect to try and keep it limping along
through Stretch, fighting anyone who tries to remove Live Build,
possibly throwing in some Live Build bug fixing along the way.  I'd plan
for probable migration in the Buster time frame though.

                             vmdebootstrap

vmdebootstrap is relatively limited in what it does, but IT WORKS.  The
first time I ran it, it produced a working image.  That's very unusual
in my experience of image creation tools.


vmdebootstrap is relatively focused in its image creation.  It will loop
mount a msdos-partitioned image that it creates.  It will run
debootstrap with one filesystem and an optional boot partition or UEFI
ESP partition.  It will install a boot loader.

It has a few custom hacks, mostly only for older releases.  For example
you can request serial console, but not for stretch.  You can request
dhcp, although be warned that with stretch you'll get systemd-networkd,
which may surprise you.

All configuration is on the command line.

You can run one customization script during image creation.

There are some huge strengths.  Development is active.  I think all but
one of the bugs I reported was fixed before writing this comparison.
It works.  There is an active culture of regression testing and
continuous integration.

It supports UEFI.  It only does DOS partition tables, not GPT, which is
kind of surprising, but support for DOS partition tables is mandated by
the UEFI spec.  I've had some annoyance with real hardware having to
reconfigure the BIOS to prefer UEFI boot because of this decision, but
it does work.

However, there are some huge defects.

Extensibility is lacking.  
It, like bootstrap-vz, is written in python2.  Another problem makes
that less of an issue than it is for bootstrap-vz: vmdebootstrap is not
particularly extensible procedural code.  There are modules, but there's
not a good object hierarchy, and there are neither particularly reusable
components (outside of some libraries for running commands and the like)
nor are there points at which you could integrate plugins.  So, I'm
much less likely to want to use my own Python libraries in vmdebootstrap
than I am with bootstrap-vz.


You get that one customization point.  If it's at the wrong place for
your needs, guess you'll have to deal somewhere else.  It turns out that
I needed to change the image after update-grub had been run for the last
time.  I want to make sure that each of my installed machines ends up
with a different BTRFS filesystem uuid for the root.  If I ever want to
take a snapshot of one machine and manipulate it on another, that
becomes important.  So, I have written an initramfs-hook that changes
the uuid on the root (you can't do that to a mounted btrfs) and fixes up
/etc/fstab.  I trigger this by a root command line of the form
root=REUUID= instead of root=UUID=.  So, I want to substitute UUID= with
REUUID= in the grub configuration for the first boot.  I definitely do
not want that change to survive a run of update-grub (which I'll arrange
for on first boot).
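
The substitution step can be sketched as below.  This is only an
illustration under assumptions: the function names are made up, and the
path boot/grub/grub.cfg inside the mounted image is assumed.  It edits
the generated grub configuration rather than /etc/default/grub, so the
next update-grub run regenerates the file and drops the change.

```python
import os
import re


def mark_root_for_reuuid(grub_cfg_text):
    """Rewrite root=UUID=<uuid> as root=REUUID=<uuid> in grub.cfg text."""
    # \b keeps us from touching a longer parameter name that merely ends
    # in "root"; text that already says root=REUUID= is left alone.
    return re.sub(r"\broot=UUID=", "root=REUUID=", grub_cfg_text)


def patch_grub_cfg(rootdir):
    """Apply the substitution to the generated grub.cfg of a mounted image.

    The boot/grub/grub.cfg location is an assumption about the image layout.
    """
    path = os.path.join(rootdir, "boot/grub/grub.cfg")
    with open(path) as f:
        text = f.read()
    with open(path, "w") as f:
        f.write(mark_root_for_reuuid(text))
```

This has to run after the last update-grub of the build, which is why a
single earlier customization point does not help.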

So, I ended up writing Python code to mount the image, bypassing
vmdebootstrap's customization entirely.  I considered borrowing code
from filesystem.py.  However, that code depends on the global settings
dictionary that is maintained by the command line script.  Those bits
weren't really reusable.  (And also vmdebootstrap seems to leak loop
devices; I had to write wrappers to clean that up)
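
A cleanup wrapper of the kind mentioned above might look like the
following sketch.  It is not the code referred to in this message and
not vmdebootstrap's own approach; it assumes the util-linux losetup
command and root privileges, and the injectable run parameter exists
only so the logic can be exercised without touching real devices.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def loop_device(image_path, run=subprocess.check_output):
    """Attach image_path to a loop device and always detach it afterwards."""
    # --find picks a free loop device, --show prints its name, and
    # --partscan asks the kernel to expose the image's partitions.
    output = run(["losetup", "--find", "--show", "--partscan", image_path])
    device = output.decode().strip()
    try:
        yield device
    finally:
        # Runs even if the body raised, so the device is not leaked.
        run(["losetup", "--detach", device])
```

Usage would be along the lines of "with loop_device('disk.img') as dev:"
followed by mounting the partitions under dev and doing the
customization inside the block.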

It's telling that live-wrapper, even though it is also Python, doesn't
use vmdebootstrap as a library.  It calls the command line script.
There's no other way you could do it: vmdebootstrap's extensibility
profile isn't good enough.

If you want LVM, too bad.  If you want GPT, too bad.  If you want your
root filesystem labeled, better do that yourself.

Squashfs generation is supported, although you are responsible for
setting up live-boot and live-config.  You may have a bit of fun
integrating that with vmdebootstrap's bootloader logic.  (You might
think that live-wrapper would at least do that for you.  Not yet.  I'm
actually totally fine with vmdebootstrap's decision to push configuring
live-* up a layer, although I think it needs better customization hooks
to make that work well.)

vmdebootstrap doesn't have support for adding keys to Apt's trusted key
store.  It doesn't have support for adding additional archives to your
sources.  Debootstrap requires that all the packages come from one
mirror.  vmdebootstrap doesn't provide a way to install packages from
another mirror later.  Of course you can work around all that—in the
customization hook you're provided even.  However, that's all critical
functionality you expect from a custom image tool.  Some layer needs to
provide that for me if it's really going to be a complete solution for
managing custom images.
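
The workaround in the customization hook could be sketched roughly as
follows.  The helper name, the archive URL and the "internal" label are
made up for illustration, and the sketch assumes the script is handed
the image's root directory; it relies only on apt's standard
sources.list.d and trusted.gpg.d directories.

```python
import os


def add_archive(rootdir, name, sources_line, keyring_bytes=None):
    """Write an extra APT source (and optionally its keyring) into the image."""
    sources_dir = os.path.join(rootdir, "etc/apt/sources.list.d")
    os.makedirs(sources_dir, exist_ok=True)
    list_path = os.path.join(sources_dir, name + ".list")
    with open(list_path, "w") as f:
        f.write(sources_line.rstrip("\n") + "\n")
    if keyring_bytes is not None:
        # trusted.gpg.d expects a binary (not ASCII-armored) keyring here.
        keys_dir = os.path.join(rootdir, "etc/apt/trusted.gpg.d")
        os.makedirs(keys_dir, exist_ok=True)
        with open(os.path.join(keys_dir, name + ".gpg"), "wb") as f:
            f.write(keyring_bytes)
    return list_path


# Example call with a hypothetical archive:
# add_archive(rootdir, "internal",
#             "deb http://apt.example.org/debian stretch main", key_bytes)
```

After that, the hook still has to run apt-get update and install the
extra packages inside the chroot itself, which is the part a complete
tool would be expected to handle.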

SOURCE matters.  If I'm building an image that I plan to distribute to
people, GPL and other licenses require that I distribute source.  “But
Sam, you have the Debian mirror you used.”  And so I do, until it
changes.  I tend to distribute images longer than mirrors stay stable.
Yes, I could use aptly to snapshot my entire mirror.  Yes, I could
coordinate something to figure out what packages make their way into
the image and snapshot only those package sources.  Live Build gives me
this, and at least for custom images, that too is a requirement.

Ultimately, I did produce a reliable tool that calls vmdebootstrap under
the covers and then does a bunch of customizations.  Am I glad I used
vmdebootstrap rather than fighting with Live Build?  Eh.  I learned
something, and I'm not really unhappy with my choices.  But no,
vmdebootstrap is not a clear win today.  Tomorrow??  Now, that is the
question.


A Suggestion

As a user, who would love to see Debian be great, but who has no
political skin in any previous battles, I'd like to offer a suggestion
for something to explore.

I'd like to see some folks explore bootstrap-vz layered on top of
vmdebootstrap.  Debootstrap would be responsible for taking Debian
packages and unpacking and doing initial configuration, just as it
always does.  vmdebootstrap would be responsible for partitioning,
filesystem creation, bootloader installation and running debootstrap.
Bootstrap-vz would be responsible for tweaking the image, for providing
a framework for those tweaks, and for the schema/manifest language to
describe what it is that we want in an image.

It would require compromises and changes all around.

Do EC2 images really need to be built in an EBS volume?  Would it be
good enough to copy them there after, building in a loop file like
everything else until that moment?

vmdebootstrap would need to gain an API and an object model.

We'd probably have to give up some of the tweaks we have, and add
support either for plugins or for some of the more basic tweaks directly
in vmdebootstrap.  As an example, vmdebootstrap would almost certainly
need to support raw images without a partition table.

However, I think a clear interface between vmdebootstrap and
bootstrap-vz would make it easier to audit what was being done to an
image and might improve both products.

I think we should do something.  If there's one thing I hope you take
away from this message, it's that none of the options I explored is
ready—probably not even for Stretch.  Will we have a regression in our
custom image functionality Stretch over Jessie?  Will we have a solution
in time for Stretch?  I don't know the answers to either of those, but I
hope that we can come together in a spirit of cooperation and work on
those questions rather than focusing on parallel, independent paths.
