
Re: Comments on live-build, vmdebootstrap, bootstrap-vz, and live-wrapper



>>>>> "Neil" == Neil Williams <codehelp@debian.org> writes:

    Neil> That's the key point for me - grub sorts itself out *inside*
    Neil> the build environment. It is critical for reproducibility that
    Neil> the tools used to build the image come from inside the
    Neil> image. We need to be able to build the same image on stable as
    Neil> on testing. The only tools scripts like vmdebootstrap should
    Neil> need to use from the host system are those to create the image
    Neil> file, partitions and filesystems. Grub meets this requirement
    Neil> well, UBoot can too (within some limits), other bootloaders
    Neil> commonly fail or have stupid assumptions about special
    Neil> handling of partitions and/or filesystems. Bootloaders do not
    Neil> *need* to be special in terms of installation into an image
    Neil> and it's way beyond time for this to be fixed properly -
    Neil> within the packages which provide those bootloaders in the
    Neil> archive *NOT the build tools*!.

We're in agreement that the boot loader should be installed from within
the image for several image types.

The boot loader configuration is properly something that the image
builder needs to deal with, and vmdebootstrap isn't currently very good
at this:

Serial console support for the boot loader is the most obvious thing
that affects non-live images running on real hardware.  Actually, I
think one of my customers may end up caring about a splash screen or
banner at the boot loader before too long.


For VM images, a well-supported configuration in some environments is
for the kernel and initrd (if there is one) not to live inside the image
at all.  There are some real security advantages to this approach.

For paravirtualized images, the boot loader may or may not live inside
the image.

For live images, the boot loader probably doesn't live inside the
squashfs, and you probably need nontrivial configuration of the
bootloader.




    Neil> qemu-img|dd, a bin-format handler, parted, debootstrap & the
    Neil> mount.*, mkfs.* pairs are all that should be allowed to write
    Neil> to the image from the host system. *Everything* else can and
    Neil> should be done only under chroot.

    Neil> The goal should be that installing a package and possibly
    Neil> calling a tool within that package must be all that is
    Neil> required to fully install a working bootloader into an image,
    Neil> just like any other package.

Packages sometimes also require configuration if you don't want their
default options.

    >> > We'd probably have to give up some of the tweaks we have, and
    >> > add support either for plugins for some of the more basic
    >> > tweaks directly into vmdebootstrap.  As an example,
    >> > vmdebootstrap would almost certainly need to support raw images
    >> > without a partition table.

    Neil> I don't see what benefit that provides.

For EC2 and for VMs backed by LVM, the sysadmin experience is
significantly easier if you don't have a partition table.

    >> > However, I think a clear interface between vmdebootstrap and
    >> > bootstrap-vz would make it easier to audit what was being done to
    >> > an image and might improve both products.
    >> 
    >> I think we should look into pushing customizations and tweaks
    >> into packages. Say ec2-defaults, virtualbox-defaults etc. The key
    >> benefit being that everyone gets same settings regardless of what
    >> tool they used to create the image, and everyone gets new better
    >> settings with apt-get upgrade ( rather than via switching to a
    >> new image).

    Neil> Exactly, push the special snowflake knowledge into the
    Neil> packages to be installed inside the image, not into the build
    Neil> tool.

I'm fairly frustrated and disappointed reading the above.
I would have hoped that you would work to understand others' positions
before jumping to conclusions.
We are not going to make forward progress building an operating system
that is best for our users if we don't take time to understand what we
are saying and understand the needs throughout our community.

When I look at the bootstrap-vz tasks, they tend to fall into three
categories: base image manipulation, package installation, and
configuration.

The base image manipulation tasks are things like "I want losetup not a
raw device" or "run debootstrap."

Many of the plugins just select additional packages to install.
Vmdebootstrap is not good at managing this.  Even for a fairly simple
custom image, I was wishing for more infrastructure in this regard,
simply to keep track of what packages the image needed beyond the base
system (and to deal with limitations in the debootstrap resolver).
I ended up creating what would have been three package-installation
related tasks in bootstrap-vz for my case.
So, even for the tasks that can be done by installing a package, I think
we need more work than vmdebootstrap gives us today.

Then we get to configuration.
First, our policy forbids one package from mucking with another
package's configuration files for good reason.  That is left to the
sysadmin.  In some cases the installer/image builder fills in for the
sysadmin.  There are a number of cases where d-i will configure a
package.  The same is true for bootstrap-vz.  The same is true for
vmdebootstrap customization scripts--even the ones distributed as
examples.  I'm saying that APIs for managing this and for allowing a
collection of configurations to be specified by something like
bootstrap-vz or live-wrapper above vmdebootstrap would significantly
benefit the product.

I am not convinced you can create a policy-compliant ec2-defaults
package.  Even if ec2-defaults could be handled that way, there are
other things that could not be.

One of the most obvious is cloud-init.  You absolutely want to be able
to stuff cloud-init configuration into an image you build, and as
someone customizing for a particular cloud environment, or a sysadmin
customizing for a particular organization, the interface bootstrap-vz
gives for specifying this in a manifest is superior to hacking this into
a vmdebootstrap customization script.

When we get to custom images rather than official images, the need for
tweaks becomes more important.  More infrastructure than one shell
script is essential.  I went through the infrastructure that live-build
provides that I found useful.  I feel frustrated that you simply
dismissed my needs and said "package that."

As a Debian Developer, I actually could go package a bunch of that
stuff.  We actually do have packages in our organization to manage our
apt sources and archive keys.  I assure you that is more trouble than
it's worth if you have fewer machines than we do, and that sort of
customization is important to provide users the tools they need to build
custom images.  Asking users to go package that is not reasonable.

For the image I just produced with vmdebootstrap, I did actually
generate two new packages to package up our initramfs hook, and a couple
of scripts.  However, I was left with the following tasks that are
properly configuration:

* mark the image as never having been booted.  Package installation
  should not by itself trigger resizing partitions and changing the
  filesystem UUID on next boot; that properly needs a configuration step
  to enable.

* Copy in ssh authorized keys.  Packages should not install files into
  user home directories.  Also, packaging the initial set of authorized
  keys is not worth a new package.


* Copy in a new /etc/default/grub to turn on serial console.  A plugin
  that just did the sed substitution would be better, but I was lazy and
  replaced the file.

* Set the timezone.  Again, not proper for a package.

* Configure what locales to generate and call locale-gen.  Not proper
  for another package to modify the locales package's configuration
  files.

* Hack the generated grub.cfg to replace UUID= with REUUID= to trigger
  our initramfs-hook to change the UUID of the filesystem on next boot
  (and then regenerate grub.cfg with the new uuid).  I guess this one
  could have been a script to run in the package containing our
  initramfs-hook, but then I'd need a customization task to run that
  script.


* Run apt-get clean

* Add a user.  I suppose I could have used the vmdebootstrap command
  line option for this one, but only if I wanted a single user.  If I
  needed more than one user, I'd want customization tasks.  Oh, actually
  looking at vmdebootstrap --help, I could not use the command line
  option because I need to put the user in supplemental groups.
  

* Override vmdebootstrap's handling of sources.list.  I want a
  sources.list.d file with a specific name so it will get overwritten by
  our sources.list package when that gets installed by ansible later.  I
  want nothing in sources.list itself.

* Disable PasswordAuthentication for sshd

For one of our images, I also:

* Copy in another image.  (We have one image used to install real
  hardware which gets the target image added to it)

* Muck with btrfs parameters of the image a bit for internal reasons

* Overwrite the hostname (could have been done with the vmdebootstrap
  command line)

* Install packages that wouldn't be handled well by debootstrap


My conclusion from the above is that even if you believe in packaging
what can be packaged, there are configuration tasks that remain.  In
addition, you very quickly get to a point where better abstraction than
run-this-shell-script would be helpful.
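
As an illustration of one item from the list, here is a minimal sketch of
the serial console change done as a substitution instead of replacing
/etc/default/grub wholesale.  The function name and the particular values
(unit 0, 115200 baud, ttyS0) are assumptions chosen for the example, not
what my image actually uses, and the sketch overwrites any existing
GRUB_CMDLINE_LINUX value instead of merging with it.

```python
import re

# Assumed values for illustration only; the serial unit, speed, and
# console device depend on the target hardware.
SERIAL_SETTINGS = {
    "GRUB_TERMINAL": '"console serial"',
    "GRUB_SERIAL_COMMAND": '"serial --unit=0 --speed=115200"',
    "GRUB_CMDLINE_LINUX": '"console=tty0 console=ttyS0,115200n8"',
}


def enable_serial_console(grub_default, settings=SERIAL_SETTINGS):
    """Return the text of /etc/default/grub with the given variables set.

    The first assignment of each variable (commented out or not) is
    replaced in place, later assignments of the same variable are
    dropped, and variables that never appear are appended at the end.
    """
    seen = set()
    out = []
    for line in grub_default.splitlines():
        match = re.match(r"\s*#?\s*([A-Z_]+)=", line)
        name = match.group(1) if match else None
        if name in settings:
            if name not in seen:
                out.append(f"{name}={settings[name]}")
                seen.add(name)
            # A repeated assignment is dropped so it cannot override ours.
            continue
        out.append(line)
    for name, value in settings.items():
        if name not in seen:
            out.append(f"{name}={value}")
    return "\n".join(out) + "\n"
```

A customization task would read the file from the image, pass its text
through this function, write it back, and then run update-grub inside the
chroot.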

I actually think an architecture could work where something looking a
lot like bootstrap-vz takes a manifest containing operations like "Cloud
init should gain the following yaml", and translates that into an
operation like "copy this file into the image."  I think having a small
set of valid operations, managed by an API between that layer and a
layer looking a lot like vmdebootstrap, would improve auditability.  If
you had few
operations beyond add package to install at debootstrap time, copy file,
run command in image, add package to install later, and possibly a few
others, I think you could better balance code reuse for customizations,
auditability, and tools with a simple well-crafted purpose.
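
To make the layering concrete, here is a rough sketch of what such a
translation could look like.  Everything in it is hypothetical: the
manifest keys, the class names, and the drop-in file name are invented
for the example, and neither vmdebootstrap nor bootstrap-vz exposes this
interface today.  The only point is that the upper layer emits a short,
fixed vocabulary of operations that the lower layer can log and audit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddBootstrapPackage:
    """Package to install at debootstrap time."""
    name: str


@dataclass(frozen=True)
class AddLaterPackage:
    """Package to install inside the image after debootstrap."""
    name: str


@dataclass(frozen=True)
class CopyFile:
    """File content to place at a path inside the image."""
    content: str
    dest: str
    mode: int = 0o644


@dataclass(frozen=True)
class RunInImage:
    """Command to run under chroot inside the image."""
    argv: tuple


def translate(manifest):
    """Turn a sysadmin-level manifest into a list of primitive operations.

    The manifest keys used here are assumptions made for this sketch.
    """
    ops = []
    for name in manifest.get("bootstrap_packages", []):
        ops.append(AddBootstrapPackage(name))
    for name in manifest.get("packages", []):
        ops.append(AddLaterPackage(name))
    cloud_init = manifest.get("cloud_init")
    if cloud_init is not None:
        # "Cloud init should gain the following yaml" becomes
        # "install cloud-init" plus "copy this file into the image".
        ops.append(AddLaterPackage("cloud-init"))
        ops.append(CopyFile(cloud_init, "/etc/cloud/cloud.cfg.d/99-local.cfg"))
    if manifest.get("clean_apt_cache", False):
        ops.append(RunInImage(("apt-get", "clean")))
    return ops
```

An auditor then only needs to read the resulting list, for example by
printing each operation returned by translate(manifest), without having
to read arbitrary shell scripts.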

I forget who said that they didn't see a need for extra layering.  If
you buy the bootstrap-vz architecture as it is today, then perhaps not.
If you think something at the lowest layer looking something like
vmdebootstrap is a good thing, I do think a layer above that translating
operations meaningful to a sysadmin into tasks like copy this file is
valuable.

It looks like I'm about to need to refactor what I have and produce
images that can encrypt themselves as soon as they are given a key
(probably on second boot).  Ah for customers with requirements.  Clearly
the encryption itself cannot be done at image generation time else every
install would end up with the same key.  However, it seems like this
will have impacts on the image creation process, and I'm not at all sure
that my approach of running vmdebootstrap followed by some
customizations after will end up continuing to be viable given the new
requirements.

