Please note the follow-up to the cloud list.

Hi. I recently finished putting together a custom image of what amounts to stretch for work. Today, this is mostly not a cloud image, although that's expected to change for our future customer deployments. For my previous job I did maintain a cloud image using live-build through the wheezy and jessie cycles.

In this discussion I'm focusing on the needs of custom images. We've all agreed that a robust and flexible approach for custom images is an important goal of our cloud (and presumably live) efforts.

My personal belief is that we should use the same tools for official images as we recommend for custom images. Back in the days when I cared about sets of CDs I always found it frustrating that the debian-cd packages could make official images, but you were better off with other approaches for custom CD sets. There would be unfortunate and undesired differences introduced because you weren't using debian-cd, but it was never worth the pain of hacking debian-cd for custom sets. As a user, I'd like to contrast this significantly with live images, where we have (so far) used mostly the same tooling for official and unofficial images. It feels more polished and the approach is better.

I'd like to get a couple of things out of the way first.

live-wrapper

I think live-wrapper is too immature to evaluate. I tried to evaluate it as a replacement for live-build, but quickly concluded that it's too early in the live-wrapper development cycle for me to be able to look at. The documentation of live-wrapper 0.3, which I looked at, was very sparse. It places a lot of requirements on the directory structure where you run it without clearly documenting them. There's some sort of customization script that is needed, but it's unclear how that works.

Live-wrapper is missing functionality that was critical to me in my use of live-build. It does not support live-installer. It does not support including d-i and a live system on the same CD.
It didn't appear to support easy customization of the bootloader configuration.

I think it's great that people are looking at something built on top of vmdebootstrap. I hope that effort continues, and when it's more mature I'd be happy to take another look. I'd advise taking a careful look at my comments on vmdebootstrap, as I think several of them may have implications for live-wrapper.

bootstrap-vz

Bootstrap-vz is a collection of Python code for building cloud and VM images. I read all the documentation and examined some chunks of the code, but did not end up using the product to build an image.

Up front I'll say I was disappointed that the code is python2.7 not python3. It actually does matter. As I build a wider collection of idiomatic python3 within my own domain, it would be valuable to be able to use that while customizing the images I develop. Making it impossible for me to use my code and the bootstrap-vz libraries and plugins in the same process is architecturally disappointing.

The strength of bootstrap-vz is that it has a plugin architecture, and supports tweaking aspects of the image for the environment. There's an assumption that things will be different between ec2, gce, virtualbox, kvm, etc. You have to pick one of these at a fairly basic level, and it significantly affects the code path that will be used. There are a bunch of plugins that accomplish things. The code supports images with and without partition tables. It supports images backed by EBS volumes, raw devices, and even things like vdi for VirtualBox. Different virtualization layers can plug in plugins. For example the kvm layer plugs in support for virtio. This is both a blessing and a curse. Virtio is a great example: VirtualBox supports it too, and it's unclear why that plugin would be introduced at that level in the code.

Bootstrap-vz failed to meet my needs in two important ways.
First, I found it too difficult to audit how an image I would create with bootstrap-vz would differ from what I'd get debootstrapping something by hand. That is, I found that I could not trust the changes bootstrap-vz might introduce for me. I'd end up needing to read most of the code to evaluate this, because it seems like a major goal of the design is to make those sorts of tweaks easy.

Secondly, with some irony, I realized that it would be annoying for me to add my own customizations to the image. Bootstrap-vz makes it easy for bootstrap-vz developers to add new tweaks, but makes it challenging for custom image builders who don't want to hack on the bootstrap-vz sources. I really want something like the vmdebootstrap customization script where I can run a chunk of my shell or Python code at appropriate points in the process.

Beyond that, even if I did want to write vmdebootstrap-style tasks and plugins, it looks like getting them integrated might be a bit tricky. Bootstrap-vz uses a lot of plugins, but the plugin consumers seem to know way too much about the plugins they consume. Take a look at bootstrapvz/providers/kvm/__init__.py to see what I'm talking about. That file knows about the kvm-specific tasks; it integrates them into the flow, knows about their configuration, etc. Instead I'd hoped for an architecture where the consumer might sometimes be responsible for loading the plugin, but the integration of the plugin into the process would be accomplished by the plugin registering itself. That is, in the mentioned __init__.py file, the import statements seem fine, but the parent module knowing so much about the child module seems undesirable in a plugin design.

While I did not use Bootstrap-vz, I want to call out a bunch of excellent properties. It does have a plugin architecture.
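
A minimal sketch of the self-registration idea described above. This is not bootstrap-vz's actual API: the TaskRegistry class, the "system-modification" phase name, and the add_virtio_modules task are all hypothetical, chosen only to show a plugin hooking itself into the flow so that the consumer never has to name its tasks.

```python
# Hypothetical sketch; none of these names come from bootstrap-vz.

class TaskRegistry:
    """Maps a build phase name to the tasks registered for it."""

    def __init__(self):
        self._tasks = {}

    def register(self, phase):
        """Decorator a plugin uses to add one of its tasks to a phase."""
        def decorator(task):
            self._tasks.setdefault(phase, []).append(task)
            return task
        return decorator

    def run(self, phase, image):
        """Run every task registered for a phase, in registration order."""
        for task in self._tasks.get(phase, []):
            task(image)
        return image


registry = TaskRegistry()


# What a kvm plugin module might contain: it registers itself on import.
@registry.register("system-modification")
def add_virtio_modules(image):
    image.setdefault("initramfs_modules", []).extend(["virtio_pci", "virtio_blk"])


# What the provider does: run a phase without naming any specific task.
image = registry.run("system-modification", {})
print(image["initramfs_modules"])
```

In a design like this the provider would still import the plugin module, but everything about where the task runs lives with the plugin.
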
Bootstrap-vz demonstrates thoughtful enough design that you can integrate new programmatic bits into a structure, sharing code, but using a somewhat declarative approach rather than a purely procedural approach. That is, the objects and plugins seem used to relatively good effect. While I complained about lack of extensibility for out-of-tree users, extensibility for in-tree developers seems to be a huge strength. Also, the flexibility is a huge strength. To the extent we need to tweak things based on environment, provider and the like, bootstrap-vz is good. Finally, in terms of the resulting image, the manifest is purely declarative. I chafe at that when building custom images, but for official images that's probably a huge strength.

Live Build

Live Build is the most comprehensive image building tool that I've used. It has very good documentation compared to the other tools. It supports a rich declarative format: I can add additional archives, keys, packages and the like. It also supports hooks so I can make customizations as needed. It has caching infrastructure. I found that performance with live-build was much better than vmdebootstrap.

Live Build has critical functionality that is missing from other solutions. I talked about the aspects of that functionality that relate to live systems under live-wrapper, the only other tool I looked at that tries to handle live systems directly. I'll talk about the other missing functionality under vmdebootstrap.

Unfortunately, Live Build has some significant disadvantages.

It's written in shell. It is a triumph of modular shell programming. I found that given time, I was able to understand and debug it: given how complex the shell scripts are, that says something very positive. I found that even when some of what it wanted to do wasn't right for my needs, code reuse was possible. Between the phase commands, function libraries and similar, lots of thought was put into modularity.
However, triumph of modular shell programming though it is, it IS ALL SHELL. There's only so modular a shell script can be. There weren't really objects, and while yes, I know that you can do object-oriented shell, it doesn't make things better.

Live Build doesn't do a good job at things that aren't live systems. You can do some things by selecting plain as the root filesystem, and hdd as the image type, but I seem to recall that doesn't end up working. I ended up using only part of the typical chain (lb bootstrap && lb chroot rather than lb build), then writing my own short script to make the image.

Live Build was starting to reach the maximum complexity of its design. Live Build had a lot of case statements scattered throughout the code. If the image type is iso, then we'll go through all the boot loader options. If the image type isn't iso, we'll do this other thing. This meant that adding a new option involved going and making sure you'd added all the case statements to deal with all the possibilities.

Live Build didn't have enough of a test suite. It wasn't really reliable. Upgrading was always filled with fear and uncertainty.

Live Build is not currently maintained.

That said, if I had an existing Live Build project, I would not migrate away from Live Build today. I'd expect to try and keep it limping along through Stretch, fighting anyone who tries to remove Live Build, possibly throwing in some Live Build bug fixing along the way. I'd plan for probable migration in the Buster time frame though.

vmdebootstrap

vmdebootstrap is relatively limited in what it does, but IT WORKS. The first time I ran it, it produced a working image. That's very unusual in my experience of image creation tools.

vmdebootstrap is relatively focused in its image creation. It will loop mount an msdos-partitioned image that it creates. It will run debootstrap with one filesystem and an optional boot partition or UEFI ESP partition. It will install a boot loader.
It has a few custom hacks, mostly only for older releases. For example you can request serial console, but not for stretch. You can request dhcp, although be warned that with stretch you'll get systemd-networkd, which may surprise you. All configuration is on the command line. You can run one customization script during image creation.

There are some huge strengths. Development is active. I think all but one of the bugs I reported were fixed before writing this comparison. It works. There is an active culture of regression testing and continuous integration.

It supports UEFI. It only does DOS partition tables, not GPT, which is kind of surprising, but support for that is mandated by the UEFI spec. I've had some annoyance with real hardware having to reconfigure the BIOS to prefer UEFI boot because of this decision, but it does work.

However, there are some huge defects.

Extensibility is lacking. It, like bootstrap-vz, is written in python2. Another problem makes that less of an issue than it is for bootstrap-vz: vmdebootstrap is not particularly extensible procedural code. There are modules, but there's not a good object hierarchy, and there are neither particularly reusable components (outside of some libraries for running commands and the like) nor points at which you could integrate plugins. So, I'm much less likely to want to use my own Python libraries in vmdebootstrap than I am with bootstrap-vz.

You get that one customization point. If it's at the wrong place for your needs, I guess you'll have to deal with it somewhere else. It turns out that I needed to change the image after update-grub had been run for the last time. I want to make sure that each of my installed machines ends up with a different BTRFS filesystem uuid for the root. If I ever want to take a snapshot of one machine and manipulate it on another, that becomes important.
So, I have written an initramfs hook that changes the uuid on the root (you can't do that to a mounted btrfs) and fixes up /etc/fstab. I trigger this by a root command line of the form root=REUUID= instead of root=UUID=. So, I want to substitute UUID= with REUUID= in the grub configuration for the first boot. I definitely do not want that change to survive a run of update-grub (which I'll arrange for on first boot).

So, I ended up writing Python code to mount the image, bypassing vmdebootstrap's customization entirely. I considered borrowing code from filesystem.py. However, that code depends on the global settings dictionary that is maintained by the command line script. Those bits weren't really reusable. (And also vmdebootstrap seems to leak loop devices; I had to write wrappers to clean that up.)

It's telling that live-wrapper, even though it is also Python, doesn't use vmdebootstrap as a library. It calls the command line script. There's no other way you could do it: vmdebootstrap's extensibility profile isn't good enough.

If you want LVM, too bad. If you want GPT, too bad. If you want your root filesystem labeled, better do that yourself.

Squashfs generation is supported, although you are responsible for setting up live-boot and live-config. You may have a bit of fun integrating that with vmdebootstrap's bootloader logic. (You might think that live-wrapper would at least do that for you. Not yet. I'm actually totally fine with vmdebootstrap's decision to push configuring live-* up a layer, although I think it needs better customization hooks to make that work well.)

vmdebootstrap doesn't have support for adding keys to Apt's trusted key store. It doesn't have support for adding additional archives to your sources. Debootstrap requires that all the packages come from one mirror. vmdebootstrap doesn't provide a way to install packages from another mirror later. Of course you can work around all that, even in the customization hook you're provided.
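
A rough sketch of that workaround, under stated assumptions: the customization script is taken to receive the image's root directory as its first argument, and the archive name, URL, and key path are placeholders, not anything from my setup. It only drops a sources entry and a key into the root; it does not install packages.

```python
#!/usr/bin/env python3
"""Customization hook sketch: add an extra archive and its key to an image root."""
import shutil
from pathlib import Path


def add_archive(rootdir, name, sources_line, key_file):
    """Write a sources.list.d entry and copy a keyring into the image root."""
    root = Path(rootdir)
    keyring_dir = root / "etc" / "apt" / "trusted.gpg.d"
    sources_dir = root / "etc" / "apt" / "sources.list.d"
    keyring_dir.mkdir(parents=True, exist_ok=True)
    sources_dir.mkdir(parents=True, exist_ok=True)

    # Keys placed in trusted.gpg.d are trusted by apt inside the image.
    shutil.copy(str(key_file), str(keyring_dir / (name + ".gpg")))

    # One archive per file keeps later removal simple.
    sources_path = sources_dir / (name + ".list")
    sources_path.write_text(sources_line.rstrip("\n") + "\n")
    return sources_path


# As a hook, the script would end with a call along these lines
# (hypothetical URL and key path):
#   import sys
#   add_archive(sys.argv[1], "example",
#               "deb http://apt.example.com/debian stretch main",
#               "/path/to/example-archive-key.gpg")
```

Installing anything from the new archive would still take an apt-get update and install inside the chroot, which this sketch leaves out.
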
However, that's all critical functionality you expect from a custom image tool. Some layer needs to provide that for me if it's really going to be a complete solution for managing custom images.

SOURCE matters. If I'm building an image that I plan to distribute to people, GPL and other licenses require that I distribute source. “But Sam, you have the Debian mirror you used.” And so I do, until it changes. I tend to distribute images longer than mirrors stay stable. Yes, I could use aptly to snapshot my entire mirror. Yes, I could coordinate something to figure out what packages make their way into the image and snapshot only those package sources. Live Build gives me this, and at least for custom images, that too is a requirement.

Ultimately, I did produce a reliable tool that calls vmdebootstrap under the covers and then does a bunch of customizations. Am I glad I used vmdebootstrap rather than fighting with Live Build? Eh. I learned something, and I'm not really unhappy with my choices. But no, vmdebootstrap is not a clear win today. Tomorrow? Now, that is the question.

A Suggestion

As a user, who would love to see Debian be great, but who has no political skin in any previous battles, I'd like to offer a suggestion for something to explore. I'd like to see some folks explore bootstrap-vz layered on top of vmdebootstrap.

Debootstrap would be responsible for taking Debian packages and unpacking and doing initial configuration, just as it always does. vmdebootstrap would be responsible for partitioning, filesystem creation, bootloader installation and running debootstrap. Bootstrap-vz would be responsible for tweaking the image, for providing a framework for those tweaks, and for the schema/manifest language to describe what it is that we want in an image.

It would require compromises and changes all around. Do EC2 images really need to be built in an EBS volume?
Would it be good enough to copy them there afterward, building in a loop file like everything else until that moment? vmdebootstrap would need to gain an API and an object model. We'd probably have to give up some of the tweaks we have, and add support either for plugins or for some of the more basic tweaks directly into vmdebootstrap. As an example, vmdebootstrap would almost certainly need to support raw images without a partition table. However, I think a clear interface between vmdebootstrap and bootstrap-vz would make it easier to audit what was being done to an image and might improve both products.

I think we should do something. If there's one thing I hope you take away from this message, it's that none of the options I explored is ready, probably not even for Stretch. Will we have a regression in our custom image functionality in Stretch over Jessie? Will we have a solution in time for Stretch? I don't know the answers to either of those, but I hope that we can come together in a spirit of cooperation and work on those questions rather than focusing on parallel, independent paths.