I just dug this one up from my mail folders. Please read it carefully,
since a lot of work and discussion has already gone into this.

Wichert.


                     Configuration Management
                            revision 5

While sailing (mostly motoring actually, darned wind let us down again)
across the channel I have been looking through my mail log and have
revised the design again. First, a couple of remarks about the changes:

* Separation of variables and the script is indeed a good concept, so I
  added that to the design.
* All proposals for a file with the variable list share one thing: the
  priority of a variable is given in this list. I don't think that is
  reasonable though: we can't always know if a variable is important
  until we know the conditions. For example: if you configure sendmail
  as a nullclient, it is critical to know the smarthost. In all other
  situations knowing the smarthost is not important at all.
* This gives us two options: either add some algebra to the variable
  list to calculate the importance from other variables, or let the
  script do it. I think the latter is better.
* Getting data from multiple databases would be great. It occurred to me
  we already do this somewhere else: the service switch in glibc2 is a
  good example of how to do this. We can expand that approach and use it
  here.
* This text is based on the changed version Joey Hess posted.


Managing configuration data
===========================

1. The configuration space

All configuration information is stored in what I call the configuration
space. This is a database with a special design which resembles the way
we look at configuration information. This is done by defining a
hierarchy of information. Each package receives its own space in the
hierarchy. Each package is free to use a flat space, or divide its space
further into subhierarchies.
If multiple packages share a common purpose they may use a shared
toplevel hierarchy, preferably with the same name as a shared (virtual)
package name (for example, both mutt and elm can use mail-reader, and
strn and nn could use news-reader). This shared tree can also be used as
a default, i.e. a variable news-reader/nntpserver can be used by strn if
strn/nntpserver does not exist.

2. Types of variables

Multiple types of variables can be stored in the configuration space. A
preliminary list of types is: strings, numbers, lists, hostnames, IP
addresses.

Each variable can have meta-data associated with it for special
purposes. The minimum meta-data associated with a variable is: long and
short description, type, default value and an `isdefault' flag.

The `isdefault' flag states whether a variable has been changed from its
default value. This can be used when upgrading a package to check
whether the user has changed the default, or whether it is safe to
change it to a new default. This gives us the same result as the
md5sum-checking dpkg does for conffiles, but on a much finer-grained
level (per variable instead of per file).

3. Accessing the configuration

The configuration may be spread across multiple databases. We use a
virtual database to represent these databases as one big database. We
use the same method as the service switch in glibc2: we have a file in
which we define the databases and their entry-points, and the search
order we use to access the databases.
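To make the lookup rules from sections 1 and 3 concrete, here is a
minimal Python sketch. It is only an illustration under assumptions, not
part of the design: the databases are plain in-memory dictionaries
standing in for real drivers, the function names and the example values
are made up, and the text above does not say whether the shared tree is
consulted per database or only after all databases have been searched
(the sketch does the latter).

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a database driver: variable name -> value.
Database = Dict[str, str]


def lookup(name: str, search_order: List[Database]) -> Optional[str]:
    """Return the value from the first database in the search order
    that defines the variable, or None if no database has it."""
    for database in search_order:
        if name in database:
            return database[name]
    return None


def lookup_with_fallback(package: str, shared: str, key: str,
                         search_order: List[Database]) -> Optional[str]:
    """Try <package>/<key> first; if it does not exist anywhere,
    fall back to the shared tree <shared>/<key> (assumed ordering)."""
    value = lookup(f"{package}/{key}", search_order)
    if value is None:
        value = lookup(f"{shared}/{key}", search_order)
    return value


# Made-up example data: strn has no nntpserver of its own.
local = {"strn/signature": "~/.signature"}
departmentcfg = {"news-reader/nntpserver": "news.example.org"}
companycfg = {"news-reader/nntpserver": "news.corp.example.org"}
search_order = [local, departmentcfg, companycfg]

print(lookup_with_fallback("strn", "news-reader", "nntpserver", search_order))
```

Because the department database comes before the company database in
the search order, its value for news-reader/nntpserver is the one strn
ends up with.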
An example of such a database definition file:

-----------------------------------------------------
# First define all databases we use
database "companycfg" {             # Company-wide database
    driver "Oracle";                # Use an Oracle database
    instance "Config";              # Oracle-specific parameters
    user "config";
    password "config";
    root "/common/$(ARCH)/";        # Starting-point for database access
};

database "departmentcfg" {          # Department-wide database
    driver "LDAP";                  # Use an LDAP database
    root "/config/$(ARCH)";         # Starting-point for database access
};

database "hostspecs" {              # Database for host-specific config
    driver "DHCP";                  # Use DHCP-minidriver
};

database "local" {                  # Database to store local configuration
    driver "DebianDB";              # Use our own database
    file "/var/state/config/database";
};

# Define the search-orders:
order {
    retrieve { local, hostspecs, departmentcfg, companycfg };
    # Only store data in the local database
    store { local };
};
-----------------------------------------------------

(This format is heavily based on the bind named.conf format: it's quite
easy to parse and very flexible.)

We start by defining the databases to use. Each database has a driver to
use and a root from which we start looking. Variables such as
$(HOSTNAME) and $(ARCH) can be used here. Each driver may also add other
variables, like instance, user and password for the Oracle example given
here.

Some databases may have mini-drivers, like the DHCP database: they can
define only a couple of variables like IP address, hostname, etc. and
return a not-found error on all other requests.


Install-time configuration
==========================

We want to make a package which does not break older dpkg's, and we want
to be able to get the configuration information before the package is
unpacked. To do this we add a new file, config.tar.gz, to the package
alongside the current control.tar.gz and data.tar.gz. Since all
installation software (apt, dselect, dpkg) downloads the package before
installing it, we can extract this file before the package is unpacked.
Since older dpkg's will not process the extra file, we can do two
things: either create an extra assertion "--assert-configmodule" in dpkg
which is checked in the preinst, or raise the version number of the
package.

1. The variable definition file

This file, named `variables', is contained inside the config.tar.gz. It
is a simple text file that defines each variable used by the package's
configuration module (see below). This information is merged into the
configuration space when a package is installed or upgraded, and may be
removed when the package is purged.

The format of this file parallels a Packages file and is as follows:

  Variable: <variable name>
  Type: <variable type>
  Default: <default value>
  Description: <short description>
   <long description of variable>

  Variable: ...

2. The configmodule

The configuration module is an executable named `config' in the
config.tar.gz file. The configmodule is the part of a package that will
determine the configuration before the package is installed. This means
it is run _before_ the preinst, and before the package is unpacked!
Unless pre-depends are used, this means that the module can only assume
the base system is installed.

3. How does the configmodule get its information?

The configmodule needs a way to retrieve information from the
configuration space, ask the user for information if necessary, etc. But
we don't want to implement a user interface for each package. To solve
this we use a separate frontend, which provides the configmodule with a
method to access the configuration space and interact with the user.

4. How do the configmodule and the frontend interact?

Of course the configmodule and the frontend must exchange data to do
their work. We do this in a very simple manner: dpkg starts both the
configmodule and the frontend, and connects the stdin/stdout of the
module to the stdout/stdin of the frontend. We can then use stdin/stdout
to communicate, while still having stderr available to report errors.

5. The frontend

There are two types of frontends possible: interactive and
non-interactive. Interactive frontends allow the user to answer
questions and see messages. Non-interactive frontends get all
information from a database (SQL, LDAP, db, text files, etc.).

If a non-interactive frontend is used and the configmodule refuses to
accept the information the frontend retrieves, the configmodule can exit
with a non-zero exit code, indicating to dpkg that it is not possible to
install the package with the current configuration database.

6. Communication language

The communication between the frontend and the configmodule should be as
simple as possible. Since most I/O implementations default to
line-buffered I/O, we use a simple language where each command is
exactly one line. A preliminary list of commands is:

General commands:

  VERSION <number>
    The version number of the communication language the module will
    use.

  CAPB
    Asks the frontend for a list of capabilities. This includes
    interactiveness! This is a two-way command: the configmodule lists
    its capabilities as parameters, and the frontend returns its own
    capabilities in response.

  STOP
    We are finished. Store the new variables now if transactions are
    used and flush the disk cache to make sure we don't lose anything.

Interface commands:

  RESET
    Clear the accumulated set of TEXT and INPUT commands.

  TEXT <string>
    Show a string to the user. The string need not be shown until a GO
    command is given.

  INPUT <priority> <variable>
    Ask the user to enter <variable>. The description and type are
    retrieved from the variables list defined above. The frontend need
    only ask this question if the priority is high enough. The question
    is not asked until a GO command is given. This allows us to ask
    multiple questions in a single screen.

  BEGINBLOCK ... ENDBLOCK
    Define the beginning and end of a block of interface commands.

  GO
    Show the current set of accumulated questions to the user and allow
    the user to change the answers.
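The way TEXT and INPUT are accumulated until GO can be sketched in a few
lines of Python. This is only a rough illustration under assumptions:
the priority names and their ordering are invented (the design leaves
them open), the Frontend class and its handle method are hypothetical,
status codes are not modelled, and clearing the accumulated set after GO
is my guess rather than something the list above states.

```python
from typing import List, Tuple

# Assumed priority names, lowest first; the design does not fix these.
PRIORITIES = ["low", "medium", "high", "critical"]


class Frontend:
    """Toy frontend that queues TEXT/INPUT commands until GO arrives."""

    def __init__(self, min_priority: str) -> None:
        self.min_priority = min_priority
        self.pending: List[Tuple[str, str]] = []

    def handle(self, line: str) -> List[Tuple[str, str]]:
        """Process one command line. Returns the items a GO would
        present to the user, and an empty list for everything else."""
        parts = line.strip().split(None, 1)
        if not parts:
            return []
        command = parts[0]
        arg = parts[1] if len(parts) > 1 else ""
        if command == "RESET" and not arg:
            # RESET without arguments clears the accumulated set.
            self.pending.clear()
        elif command == "TEXT":
            self.pending.append(("text", arg))
        elif command == "INPUT":
            priority, variable = arg.split(None, 1)
            # Only queue questions whose priority is high enough.
            if PRIORITIES.index(priority) >= PRIORITIES.index(self.min_priority):
                self.pending.append(("input", variable))
        elif command == "GO":
            shown, self.pending = self.pending, []
            return shown
        return []


frontend = Frontend(min_priority="high")
frontend.handle("TEXT Configuring sendmail")
frontend.handle("INPUT low sendmail/aliases")
frontend.handle("INPUT critical sendmail/smarthost")
print(frontend.handle("GO"))
```

With the minimum priority set to "high", the low-priority question is
dropped and only the text and the smarthost question reach the user when
GO is given.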

Configuration space access:

  UNSET <variable>
    Remove <variable> from the configuration space.

  SET <variable> <value>
    Set the variable <variable> to <value> and unset the isdefault
    metaflag.

  RESET <variable>
    Restore <variable> to its default value.

  GET <variable>
    Return the value of the variable <variable>.

The frontend responds to each command by returning a status code and, if
needed, extra data after the status code.

The frontend has complete responsibility for the layout of the
questions, with the exception that the ordering of interface information
within a block may not be changed.

The configuration language treats everything after a # character on a
line as a comment.

7. Extra note(s)

Some people wondered how to manage shared configuration data such as
install-mime. Actually this is quite easy: from your configmodule,
simply call a special install-mime which can interact with the frontend,
and let that do the configuration.

In some cases it would be nice for the user to be able to move backwards
to a previous question the configmodule asked (of course, it's always
possible to step back to an entirely different configmodule as well). To
accomplish this, a configmodule can use the CAPB command to tell the
frontend it supports moving backward (i.e. "CAPB backup"). Any script
that does this should check the return value from the GO command to see
if the frontend has returned a status code asking it to back up a step.
If so, it should back up (i.e. jump back to where it asked the frontend
to present the previous block of questions). This obviously will make
the configmodule more complicated, and won't be needed in simple cases.

The frontend's most likely response if the configmodule indicates it has
this capability is to add a "go back" option/button to each prompt it
displays.


Future possibilities / possible coolness
========================================

* It would be nice if the configuration space could be managed via SNMP.
  This would mean keeping a central registry of assigned numbers for all
  packages. This would make `push configuration' possible: updating the
  configuration of a machine from a remote station.
* If we use push configuration, we need a way to act upon a change in
  the configuration. Creating triggers that are called when certain data
  is changed would help.
* The process of determining the desired configuration is really a
  dataflow-oriented process, not a control-flow one. This makes using a
  standard script or a language like C somewhat awkward. What would
  really be great is using a visual language like LabView. Using the
  visual language you could create a DFA with which we can walk through
  the configuration process.

--
==============================================================================
This combination of bytes forms a message written to you by Wichert Akkerman.
E-Mail: wakkerma@cs.leidenuniv.nl
WWW: http://www.wi.leidenuniv.nl/~wichert/