Configuration management, version 5

To: debian-devel@lists.debian.org, debian-policy@lists.debian.org
Subject: Configuration management, version 5
From: Wichert Akkerman <wakkerma@wiggy.ml.org>
Date: Sun, 2 Aug 1998 01:03:52 +0200
Message-id: <19980802010352.A25071@wiggy.ml.org>

                           Configuration Management
                                 revision 5

While sailing (mostly motoring actually, darned wind let us down again) across
the channel I have been looking through my maillog and have revised the design
again. First, a couple of remarks about the changes:

* separation of variables and the script is indeed a good concept, so I added
  that to the design.
* all proposals for a file with the variable list share one thing: the
  priority of a variable is given in this list. I don't think that is
  reasonable though: we can't always know if a variable is important until we
  know the conditions. For example: if you configure sendmail as a nullclient,
  it is critical to know the smarthost. In all other situations knowing the
  smarthost is not important at all.
* This gives us two options: either add some algebra to the variable list
  to calculate the importance from other variables, or let the script do it.
  I think the latter choice is better.
* Getting data from multiple databases would be great. It occurred to me
  we already do this somewhere else: the service switch in glibc2 is a
  good example of how to do this. We can expand that approach and use it
  here.
* This text is based on the changed version Joey Hess posted.


                         Managing configuration data
                         ===========================

1. The configuration space
All configuration information is stored in what I call the configuration
space. This is a database with a special design which resembles the way we
look at configuration information. This is done by defining a hierarchy of
information. Each package receives its own space in the hierarchy.  Each
package is free to use a flat space, or divide its space further into
subhierarchies.  If multiple packages share a common purpose they may use a
shared toplevel hierarchy, preferably with the same name as a shared (virtual)
packagename (for example, both mutt and elm can use mail-reader, strn and nn
could use news-reader).  This shared tree can also be used as a default, i.e.
a variable news-reader/nntpserver can be used by strn if strn/nntpserver does
not exist.
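
A minimal sketch of the fallback rule just described, in Python. The flat
dictionary used as the configuration space, the SHARED_TREE mapping and the
lookup() helper are assumptions made for illustration only; the proposal does
not prescribe how a package learns which shared tree it belongs to.

```python
# Sketch only: SHARED_TREE and lookup() are hypothetical names, and a plain
# dict stands in for the configuration space.
SHARED_TREE = {
    "strn": "news-reader",
    "nn": "news-reader",
    "mutt": "mail-reader",
    "elm": "mail-reader",
}


def lookup(space, package, name):
    """Return package/name, falling back to the shared tree's variable."""
    own_key = f"{package}/{name}"
    if own_key in space:
        return space[own_key]
    shared = SHARED_TREE.get(package)
    if shared is not None:
        # The package has no value of its own: use the shared default.
        return space.get(f"{shared}/{name}")
    return None


# Example data (host names are made up).
space = {
    "news-reader/nntpserver": "news.example.org",
    "nn/nntpserver": "nntp.example.org",
}
```

Here lookup(space, "strn", "nntpserver") falls through to the news-reader
tree, while nn keeps the value stored in its own space.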

2. Types of variables
Multiple types of variables can be stored in the configuration space. A
preliminary list of types is: strings, numbers, lists, hostnames, IP
addresses. Each variable can have meta-data associated with it for special
purposes. The minimum meta-data associated with a variable is: long and short
description, type, default value and an `isdefault' flag.  The `isdefault' flag
states whether a variable still has its default value. This can be used when
upgrading a package to check whether the user has changed the default, or
whether it is safe to change it to a new default. This gives us the same
result as the md5sum-checking dpkg does for conffiles, but on a much
finer-grained level (per variable instead of per file).
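
To make the upgrade check concrete, here is a small Python sketch. The
Variable class, its field names and upgrade_default() are hypothetical; only
the list of meta-data and the meaning of the `isdefault' flag come from the
text above.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    # Minimum meta-data from the text; the field names are assumptions.
    name: str
    type: str
    default: str
    short_description: str
    long_description: str
    value: str = ""
    isdefault: bool = True

    def __post_init__(self):
        # A variable nobody has touched holds its default value.
        if self.isdefault:
            self.value = self.default

    def set(self, value):
        """Store a user-chosen value and clear the isdefault flag."""
        self.value = value
        self.isdefault = False


def upgrade_default(var, new_default):
    """Adopt a new package default unless the user changed the value.

    Returns True if the stored value was replaced.
    """
    var.default = new_default
    if var.isdefault:
        var.value = new_default
        return True
    return False
```

This mirrors the conffile logic: an untouched variable silently follows the
new default, a changed one is left alone.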

3. Accessing the configuration
The configuration may be spread across multiple databases. We use a
virtual database to represent these databases as one big database. We use
the same method as the service switch in glibc2: we have a file in which
we define the databases and their entry-points, and the search order we
use to access the databases. An example:

-----------------------------------------------------
# First define all databases we use
database "companycfg" {              # Company-wide database
    driver    "Oracle";              # Use an Oracle database
    instance  "Config";              # Oracle-specific parameters
    user      "config";
    password  "config";
    root      "/common/$(ARCH)/";    # Starting-point for database access
};

database "departmentcfg" {           # Department-wide database
    driver    "LDAP";                # Use an LDAP database
    root      "/config/$(ARCH)";     # Starting-point for database access
};

database "hostspecs" {               # Database for host-specific config
    driver    "DHCP";                # Use DHCP-minidriver
};

database "local" {                   # Database to store local configuration
    driver    "DebianDB";            # Use our own database
    file      "/var/state/config/database";
};

# Define the search-orders:
order {
    retrieve { local, hostspecs, departmentcfg, companycfg };
    # Only store data in the local database
    store { local };
};
-----------------------------------------------------

(This format is heavily based on the bind named.conf format: it's quite
easy to parse and very flexible.) We start by defining the databases
to use. Each database has a driver to use and a root from which we start
looking. Variables such as $(HOSTNAME) and $(ARCH) can be used here. Each
driver may also add other parameters, like instance, user and password for
the Oracle example given here. Some databases may have mini-drivers, like
the DHCP database: they can define only a couple of variables like IP-address,
hostname, etc. and return a not-found error on all other requests.
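
The search order can be sketched as follows in Python. The classes
DictDriver and VirtualDatabase and the NotFound exception are hypothetical
stand-ins (real drivers would talk to Oracle, LDAP, DHCP and so on), and the
variable names and host names are invented for the example.

```python
class NotFound(Exception):
    """Raised by a driver that does not know the requested variable."""


class DictDriver:
    """Stand-in driver backed by a dict; a mini-driver simply knows few keys."""

    def __init__(self, data=None):
        self.data = dict(data or {})

    def get(self, key):
        try:
            return self.data[key]
        except KeyError:
            raise NotFound(key) from None

    def put(self, key, value):
        self.data[key] = value


class VirtualDatabase:
    """Presents several databases as one, following the configured orders."""

    def __init__(self, databases, retrieve, store):
        self.databases = databases
        self.retrieve = retrieve    # search order for reads
        self.store = store          # databases that receive writes

    def get(self, key):
        for name in self.retrieve:
            try:
                return self.databases[name].get(key)
            except NotFound:
                continue            # try the next database in the order
        raise NotFound(key)

    def set(self, key, value):
        for name in self.store:
            self.databases[name].put(key, value)


dbs = {
    "local": DictDriver(),
    "hostspecs": DictDriver({"net/hostname": "wiggy"}),
    "departmentcfg": DictDriver({"mail/smarthost": "mail.dept.example"}),
    "companycfg": DictDriver({"mail/smarthost": "mail.corp.example"}),
}
vdb = VirtualDatabase(
    dbs,
    retrieve=["local", "hostspecs", "departmentcfg", "companycfg"],
    store=["local"],
)
```

With this data a read of mail/smarthost is answered by the department
database, because it comes before the company database in the retrieve order;
a later write lands only in the local database and shadows both.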


                          Install-time configuration
                          ==========================

We want to make a package which does not break older dpkg's, and we want to be
able to get the configuration information before the package is unpacked. To do
this we add a new file, config.tar.gz, to the package alongside the current
control.tar.gz and data.tar.gz. Since all installation software (apt, dselect,
dpkg) downloads the package before installing it, we can extract this file
before the package is unpacked.  Since older dpkg's will not process the extra
file, we can do two things: either create an extra assertion
"--assert-configmodule" in dpkg which is checked in the preinst, or up the
version number of the package.

1. The variable definition file
This file, named `variables', is contained inside the config.tar.gz. It
is a simple text file that defines each variable used by the package's
configuration module (see below). This information is merged into the
configuration space when a package is installed or upgraded, and may be
removed when the package is purged.
The format of this file parallels that of a Packages file and is as follows:

  Variable: <variable name>
  Type: <variable type>
  Default: <default value>
  Description: <short description>
   <long description of variable>

  Variable: ...
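
A sketch of a parser for this format, in Python. It assumes that fields
start in the first column, that stanzas are separated by blank lines and that
the long description consists of continuation lines beginning with
whitespace; the "Long-Description" key, the parse_variables() name and the
sample variables are inventions for this example.

```python
def parse_variables(text):
    """Parse 'Field: value' stanzas separated by blank lines."""
    variables = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            current = None              # a blank line ends the stanza
            continue
        if line[0] in " \t":
            # Continuation line: belongs to the long description.
            if current is not None:
                current["Long-Description"].append(line.strip())
            continue
        field, _, value = line.partition(":")
        if current is None:
            current = {"Long-Description": []}
            variables.append(current)
        current[field.strip()] = value.strip()
    for var in variables:
        var["Long-Description"] = " ".join(var["Long-Description"])
    return variables


# Hypothetical sample file contents.
SAMPLE = """\
Variable: sendmail/smarthost
Type: hostname
Default: localhost
Description: Host that relays outgoing mail
 All outgoing mail is handed to this host
 when sendmail runs as a nullclient.

Variable: sendmail/nullclient
Type: string
Default: no
Description: Run as a nullclient?
"""
```

parse_variables(SAMPLE) yields one dictionary per variable, ready to be
merged into the configuration space.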

2. The configmodule
The configuration module is an executable named `config' in the config.tar.gz
file.  The configmodule is the part of a package that will determine the
configuration before the package is installed. This means it is run _before_
the preinst, and before the package is unpacked! Unless pre-depends are
used, this will mean that the module can only assume the base-system is
installed.

3. How does the configmodule get its information?
The configmodule needs a way to retrieve information from the configuration
space, ask the user for information if necessary, etc. But we don't want
to implement a user interface for each package. To solve this we use a
separate frontend, which provides the configmodule with a method to
access the configuration space and interact with the user.

4. How do the configmodule and the frontend interact?
Of course the configmodule and the frontend must exchange data to do their
work. We do this in a very simple manner: dpkg starts both the configmodule
and the frontend, and connects the stdin/stdout from the module to the
stdout/stdin of the frontend. We can then use stdin/stdout to communicate,
while still having stderr available to report errors.
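
The wiring can be sketched in Python with two pipes. This only illustrates
the cross-connection; it is not how dpkg would implement it. The two child
programs are hypothetical one-line stand-ins, and the reply "0 ..." assumes a
status code of 0 for success, which the proposal does not define.

```python
import os
import subprocess
import sys

# Stand-ins for the real programs (both hypothetical): the "module" sends one
# command and reports the reply on stderr; the "frontend" answers one command.
MODULE = (
    "import sys\n"
    "print('GET demo/variable', flush=True)\n"
    "reply = sys.stdin.readline().strip()\n"
    "print('module received: ' + reply, file=sys.stderr)\n"
)
FRONTEND = (
    "import sys\n"
    "command = sys.stdin.readline().split()\n"
    "print('0 value-of-' + command[1], flush=True)\n"
)


def run_pair():
    """Cross-connect stdin/stdout of two children; return the module's stderr."""
    module_to_frontend_r, module_to_frontend_w = os.pipe()
    frontend_to_module_r, frontend_to_module_w = os.pipe()
    module = subprocess.Popen(
        [sys.executable, "-c", MODULE],
        stdin=frontend_to_module_r,
        stdout=module_to_frontend_w,
        stderr=subprocess.PIPE,      # stderr stays free for error reports
        text=True,
    )
    frontend = subprocess.Popen(
        [sys.executable, "-c", FRONTEND],
        stdin=module_to_frontend_r,
        stdout=frontend_to_module_w,
    )
    # The parent keeps no pipe ends open, so each child sees end-of-file
    # as soon as the other one exits.
    for fd in (module_to_frontend_r, module_to_frontend_w,
               frontend_to_module_r, frontend_to_module_w):
        os.close(fd)
    _, errors = module.communicate()
    frontend.wait()
    return errors.strip()
```

Both children flush after every line, which is what makes the one-command-
per-line exchange work over plain pipes.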

5. The frontend
There are two types of frontends possible: interactive and non-interactive.
Interactive frontends allow the user to answer questions and see messages.
Non-interactive frontends get all information from a database (SQL, LDAP,
db, textfiles, etc.). If a non-interactive frontend is used and the
configmodule refuses to accept the information the frontend retrieves, the
configmodule can exit with a non-zero exit code, indicating to dpkg that it is
not possible to install the package with the current configuration database.

6. Communication language
The communication between the frontend and the configmodule should be as
simple as possible. Since most I/O implementations default to line-buffered
I/O, we use a simple language where each command is exactly one line. A
preliminary list of commands is:

General commands:
  VERSION <number>
     The version-number of the communication-language the module will use.
  CAPB
     Asks the frontend for a list of capabilities. This includes
     interactiveness!  This is a two-way command: the configmodule lists its
     capabilities as parameters, and the frontend returns its own capabilities
     in response.
  STOP
     We are finished. Store the new variables now if transactions are used
     and flush the diskcache to make sure we don't lose anything.
Interface commands:
  RESET
     Clear the accumulated set of TEXT and INPUT commands
  TEXT <string>
     Show a string to the user. The string need not be shown until a GO
     command is given.
  INPUT <priority> <variable>
     Ask the user to enter <variable>. The description and type are retrieved
     from the variables list defined above. The frontend need only ask this
     question if the priority is high enough.  This question is not asked
     until a GO command is given. This allows us to ask multiple questions in
     a single screen.
  BEGINBLOCK ... ENDBLOCK
     Define the beginning and end of a block of interface commands
  GO
     Show the current set of accumulated questions to the user and allow the
     user to change the answers.
Configuration space access:
  UNSET <variable>
     Remove <variable> from the configuration space
  SET <variable> <value>
     Set a variable <variable> to <value> and unset the isdefault metaflag.
  RESET <variable>
     Restore <variable> to its default value.
  GET <variable>
     Return the value of variable <variable>

The frontend responds to each command by returning a status code and,
if needed, extra data after the status code.

The frontend has complete responsibility for the layout of the questions,
with the exception that the ordering of interface information within a block
may not be changed.
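
A sketch of how a frontend might accumulate INPUT commands until GO, in
Python. Numeric priorities (higher means more important), the threshold and
the names QuestionQueue and smarthost_priority are assumptions for this
example; the proposal leaves the form of <priority> open.

```python
class QuestionQueue:
    """Collects INPUT requests and asks them together on GO."""

    def __init__(self, threshold, ask):
        self.threshold = threshold   # assumed numeric: higher = more important
        self.ask = ask               # callable(variable) -> answer
        self.pending = []

    def input(self, priority, variable):
        # INPUT: nothing is asked yet, the question is only remembered.
        self.pending.append((priority, variable))

    def reset(self):
        # RESET without arguments: forget the accumulated questions.
        self.pending.clear()

    def go(self):
        # GO: ask everything important enough, in the order it was queued.
        answers = {}
        for priority, variable in self.pending:
            if priority >= self.threshold:
                answers[variable] = self.ask(variable)
        self.pending.clear()
        return answers


def smarthost_priority(is_nullclient):
    """The script, not the variable list, decides how important this is."""
    return 9 if is_nullclient else 1
```

The second function shows why the priority belongs in the script: for a
sendmail nullclient the smarthost question clears the threshold, in every
other setup it is skipped.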

The configuration language treats everything after a #-character on a
line as a comment.
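
A sketch of the frontend side of the configuration space commands, in Python,
including the comment rule. The status codes (0 for success, 1 for failure),
the reply texts and the handle() function are assumptions; the proposal only
says that a status code comes first, followed by extra data if needed.

```python
OK, ERROR = 0, 1   # hypothetical status codes


def handle(line, values, defaults):
    """Process one protocol line and return the reply line.

    Returns None for blank or comment-only lines. Values are re-joined
    with single spaces, so runs of whitespace inside a value are lost.
    """
    command = line.split("#", 1)[0].split()   # drop the comment, tokenise
    if not command:
        return None
    verb, args = command[0], command[1:]
    if verb == "GET" and len(args) == 1:
        if args[0] in values:
            return f"{OK} {values[args[0]]}"
        return f"{ERROR} no such variable"
    if verb == "SET" and len(args) >= 2:
        values[args[0]] = " ".join(args[1:])
        return str(OK)
    if verb == "UNSET" and len(args) == 1:
        values.pop(args[0], None)
        return str(OK)
    if verb == "RESET" and len(args) == 1:
        if args[0] in defaults:
            values[args[0]] = defaults[args[0]]
            return str(OK)
        return f"{ERROR} no default"
    return f"{ERROR} unsupported command"
```

A configmodule would write such lines to its stdout and read the replies from
its stdin, checking the leading status code before using any extra data.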

7. Extra note(s)
Some people wondered how to manage shared configuration-data such as
install-mime. Actually this is quite easy: simply call a special install-mime
which can interact with the frontend from your configmodule
and let that do the configuration.

In some cases it would be nice for the user to be able to move backwards to a
previous question the configmodule asked (of course, it's always possible
to step back to an entirely different configmodule as well). To accomplish
this, a configmodule can use the CAPB command to tell the frontend it supports
moving backward (i.e. "CAPB backup"). Any script that does this should check
the return value from the GO command to see if the frontend has returned a
status code asking it to back up a step. If so, it should back up (i.e. jump
back to where it asked the frontend to present the previous block of
questions). This obviously will make the configmodule more complicated, and
won't be needed in simple cases. The frontend's most likely response if the
configmodule indicates it has this capability is to add a "go back"
option/button to each prompt it displays.
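
The control flow of such a configmodule can be sketched in Python. The
status code values and the run_blocks() helper are hypothetical, and go
stands for "send the block's questions plus GO and return the status code the
frontend answered with".

```python
OK, BACKUP = 0, 30   # hypothetical status codes


def run_blocks(blocks, go):
    """Present blocks in order, stepping back one block on BACKUP.

    Returns the list of blocks in the order they were presented.
    """
    index = 0
    presented = []
    while index < len(blocks):
        presented.append(blocks[index])
        status = go(blocks[index])
        if status == BACKUP:
            # Nothing earlier exists in this module before the first
            # block, so in that case the first block is shown again.
            index = max(index - 1, 0)
        else:
            index += 1
    return presented
```

Keeping the questions in an indexed list of blocks is what makes stepping
back cheap; a script written as one straight run of commands would need
explicit jumps instead.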

                   Future possibilities / possible coolness
                   ========================================

* It would be nice if the configuration space could be managed via SNMP.
  This would mean keeping a central registry of assigned numbers for all
  packages. This would make `push configuration' possible: updating the
  configuration of a machine from a remote station.
* If we use push configuration, we need a way to act upon a change in the
  configuration. Creating triggers that are called when certain data
  is changed would help.
* The process of determining the desired configuration is really a dataflow
  oriented process, not a control-flow one. This makes using a standard
  scripting language or a language like C somewhat awkward. What would really
  be great is using a visual language like LabView. Using the visual language
  you could create a DFA with which we can walk through the configuration
  process.


-- 
==============================================================================
This combination of bytes forms a message written to you by Wichert Akkerman.
E-Mail: wakkerma@wi.LeidenUniv.nl
WWW: http://www.wi.leidenuniv.nl/~wichert/

