Debconf and moving beyond single-package configuration
Recently there have been several discussions about Debconf. Joey Hess
started a thread on the abuses of Debconf, where we discussed ways of
preserving administrator changes and ways to make configuration files
easier for machines to parse. A more recent thread focuses on the
interactions between Debconf and init.d scripts. However, most of
these discussions have focused on how we can use Debconf to do what
we've always done, avoiding common problems with a new technology.
Debconf is not the only configuration-related discussion
brewing within Debian.
In December we took a look at what was then the latest Mandrake
release. As that discussion brought out, Mandrake does a better job
handling task-based
configuration. Instead of choosing to install a set of packages, the
user says that they want to accomplish some task, and the appropriate
software to perform that task is installed and configured. For
example, it's fairly easy under Mandrake to set up a box to get a DHCP
address and then to NAT for other interfaces. Setting up these types
of configurations--configurations that cross multiple packages--is
harder under Debian in many cases. It turns out that in the case of
NAT, the ipmasq package provides a simple solution. In the general
case, though, knowing the task lets you make assumptions that
simplify the configuration of the individual packages.
The discussion also explored how systems like webmin and linuxconf
might help solve these configuration issues. While I'm unconvinced
that either of these products really helps the general problem, they
do help some specific cases.
Before we actually consider Debconf, let's look at an example that
already works--ipmasq. Ipmasq does some fairly complex configuration
of ipchains rules. It works mostly because netbase does not actually
contain any logic for configuration of ipchains, so there is no
conflict. For the most part, the package represents a particular set
of configuration choices; by installing the package you say you want
to set up IP masquerading. As such the package can make some
configuration assumptions that a normal netbase package cannot make.
The idea of using packages to make assertions about how a machine
will be used is not new. Task packages do this; by
installing a task package, we can assume that a machine needs enough
software to perform some task. Outside of Debian proper, this is also
fairly common. Most of the companies I've worked at have had one or
more task packages for their workstations. These packages tended to
also make assertions about the configuration of the workstation, often
bringing in site-specific configuration files.
I'm working for a company developing an instant infrastructure
prototype based on Debian (http://www.boxedpenguin.com). I learned
much from this attempt. First, allowing configuration to cross
package boundaries yields significant benefits from a user experience
and configuration simplicity standpoint. In my case, even with Debian
packages that had reasonable Debconf support it took around 30 minutes
to set up all the details and interactions to get the infrastructure
prototype working by hand. This did involve making a few mistakes,
and I probably could have gotten it down to 10 minutes were that all I
ever did. By contrast, installation and configuration take roughly two
minutes given simplifying assumptions and fully automated configuration.
So, once we decide that the problem is worth solving, we need to
actually decide how we'll solve it. I noticed that Debconf was doing
a good job of actually encapsulating what configuration information
each package needed. I decided to see if there was a way I could
encode the assumptions I was making about the configuration of the
system into manipulation of this information.
I made a package called boxedp-assumptions that my task packages
pre-depend on. This package includes copies of all the Debconf
templates for the packages I want to configure. The config script for
boxedp-assumptions sets a bunch of Debconf values and sets their seen
flag to true, so that Debconf does not ask the user about them.
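
The effect of such a config script can be illustrated with a small
model. The following is a minimal sketch in Python, not the actual
boxedp-assumptions script and not the real Debconf database: the
Question class, both functions, and the question names are all
hypothetical. It assumes that marking a question as seen is what keeps
Debconf from asking it again.

```python
from dataclasses import dataclass


@dataclass
class Question:
    """Simplified stand-in for one Debconf question (not the real database)."""
    value: str = ""
    seen: bool = False


def apply_assumptions(db, assumptions):
    """Preseed each question with an assumed value and mark it as seen."""
    for name, value in assumptions.items():
        question = db.setdefault(name, Question())
        question.value = value
        question.seen = True  # assumption: a seen question is not asked again


def questions_to_ask(db):
    """Return the names of the questions the user would still be asked."""
    return sorted(name for name, q in db.items() if not q.seen)


# Hypothetical question names, invented for this illustration.
db = {
    "dhcp-server/interface": Question(),
    "dhcp-server/range": Question(),
    "mailer/relay-host": Question(),
}
apply_assumptions(db, {
    "dhcp-server/interface": "eth1",
    "dhcp-server/range": "192.168.1.10 192.168.1.100",
})
print(questions_to_ask(db))
```

Only the question that no assumption covers is left for the user.
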
This actually works fairly well in practice, even though it is a
disgusting hack. The user gets asked few questions, and the software
is configured. It isn't clear how to handle packages the user had
already installed whose configuration information boxedp-assumptions
wants to manipulate. I suspect if I decided at a design
level how to handle this situation, implementing it would be fairly
straightforward.
So, I'd like to look at how we can solve this problem correctly. I
guess we should start by asking whether my formulation of the problem
is correct. Is it reasonable to represent tasks the user wants to
configure as packages and to think of them in terms of the assumptions
you can make about configuration of the component packages?
One alternative would be to still view global configuration issues as
packages but, rather than viewing each as a set of assumptions, to set
up dependencies and shared templates between the packages involved. We
could require that the packages cooperate and build knowledge about
the configuration tasks we'd like to support into the packages. This
might be easier from an architectural standpoint. However, new
configurations of software in Debian will likely arise fairly often,
and it seems desirable to localize knowledge about a configuration in
one package representing that configuration task rather than spreading
it throughout the packages needed to accomplish that task. Also,
avoiding changes to the packages to support new global configuration
tasks makes it easier for sites or products to add their own global
configuration tasks that are not strictly part of Debian.
If this formulation seems reasonable, we should then look at how to
architect a system for expressing assumptions. I'm not really sure
how to do that, so I'll start by explaining why I think my solution
is not a correct long-term solution.
One problem is that there is very little the config script can depend
on at preconfiguration time. For example, you can't depend on any
sets of libraries that might be useful in your config script. This
issue also affects those who would like to write general libraries to
parse configuration files (see the abuse of Debconf thread). You
also can't depend on any particular version of Debconf or on features
that may have been introduced at some point.
Another problem was that I needed templates for the packages I was
configuring. This involves depending on particular versions of those
packages.
Another potential problem is manipulating another package's Debconf
database at all. The templates that a particular package uses are
internal, may change between versions, and are not normally a reliable
external interface. However, they work: in most cases I've considered,
Debconf exposes exactly the configuration information you'd want to
change in order to compose a global configuration out of changes to
individual package configurations. This is true even for most packages
that did not contemplate the global configuration being performed.
There are also fairly good practical reasons to minimize changes to a
package's Debconf usage. So, while it is not guaranteed to be stable,
Debconf provides opportunities and practical advantages that I don't
think we will find anywhere else.
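
Because another package's questions are not a stable interface, an
assumptions package probably wants to check before it overrides
anything. The sketch below shows one conservative policy, offered as
an assumption rather than as what the prototype does: skip any
question that is missing from the installed templates, and skip any
question the administrator has already answered. The function and the
question names are hypothetical, and the plain sets stand in for
information that would really have to be queried from Debconf.

```python
def plan_overrides(known, seen, assumptions):
    """Split assumptions into those to apply and those to skip, with reasons.

    known: names of questions present in the installed templates
    seen: names of questions the administrator has already answered
    assumptions: mapping of question name to the value we would like to set
    """
    to_apply = {}
    skipped = {}
    for name, value in assumptions.items():
        if name not in known:
            # The target package renamed or dropped the question.
            skipped[name] = "unknown question"
        elif name in seen:
            # Preserve the administrator's earlier choice.
            skipped[name] = "already answered"
        else:
            to_apply[name] = value
    return to_apply, skipped


# Hypothetical question names, invented for this illustration.
to_apply, skipped = plan_overrides(
    known={"dhcp-server/interface", "dhcp-server/range"},
    seen={"dhcp-server/range"},
    assumptions={
        "dhcp-server/interface": "eth1",
        "dhcp-server/range": "192.168.1.10 192.168.1.100",
        "dhcp-server/lease-time": "3600",
    },
)
print(to_apply)
print(skipped)
```

Reporting the skipped questions, rather than failing silently, gives
the administrator a chance to notice when a package's templates have
drifted away from what the assumptions package expects.
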
I have given a bit of thought to how a better solution might be
constructed. I've considered writing a customized Debconf frontend.
That would avoid depending so much on running at preconfiguration time
and would avoid using the templates. Getting the frontend called
might be difficult. I've also looked at some sort of structure for
calling apt-get to install/configure the packages you need rather
than depending on them. This again provides better control of the
environment at configuration time. However, none of the solutions
have been fleshed out enough to implement. Your thoughts are welcome.