
Re: clusters, infrastructures, and package tools

At 04:48 PM 09-02-00 +0000, Eray Ozkural wrote:
>"Bud P. Bruegger" wrote:
>> There are solutions out there such as Depot and Sup (see "Bootstrapping an
>> Infrastructure" by Steve Traugott and Joel Huddleston, found at
>> www.infrastructures.org), or there is SEPP by Tobi Oetiker, and there are
>> other approaches (most of them are listed in
>> http://www.sistema.it/twiki/bin/view/Main/infrastructures).  But they
>> usually have their own "package format" and to install something on an
>> infrastructure is the effort of creating a source package plus
>The page didn't load now, but I'll check. 

The server seems to be ok; the CGI that converts the sdf source to HTML is kind
of slow...  Let me know if this continues to be a problem and I'll send the
page in HTML by e-mail.  It contains all the annotated bookmarks of the
important material I found related to infrastructures..

>But the idea is very similar to deb
>source / binary packages I think.

The package formats that I looked at are very similar...  All slight
variations on the same theme.  That's why I believe that universal source
packages (www.sistema.it/univSrcPkg/) are feasible.  

>> As ideal solution that we should attempt in the medium run, a package
>> approach would be much easier: I envision a single source package from
>> which one can automatically produce binary packages for multple processor
>> architectures and base OS (Debian, Solaris, Aix, Irix, etc.).  A
>> specialized (or modified Debian) package tool would install that in the
>> right place (Infrastructure management is often based on a global
>> filesystem where every architecture has it's subtree).  The configuration
>> changes that are part of installation (in Debian done by install scripts)
>> have to be made compatible with a centralized configuration management
>> approach.

>But that's hard to implement, because debian assumes many things while doing
>the install. I mean, that's what makes it Debian. 

Agreed.  What I think is best is hiding such things behind a "neutral"
API and having the backend do them the Debian way.  This includes things such
as: how to make a cron entry, how to add something to /etc/inetd.conf, how to
register something with init, and where binaries, config files, man pages,
etc. go.  

>The thing is, if you really want
>to come up with such a feat, you'd need to *port* debian abstractions to
>other base OS's.  

I agree, depending on what you mean by "port".  If you say that all the APIs
mentioned above need to be implemented on every platform, I agree.  If you
say that everything has to be implemented the Debian way, I disagree:  If
you want to install a package on, say, Solaris, you have to live with the
decisions that have already been made for that platform (otherwise you could
install Debian/Linux in the first place).  But, for example, a backend that
unpacks files from a package can easily put binaries in /usr/bin on Debian
and in /opt/bin, /usr/pack, or whatever on some other platform.  Similarly,
how the init scripts are organized can be handled by different
implementations of the abstract API for init scripts...
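As a sketch of what such a neutral API might look like (all class and method names here are invented for illustration; this is not an existing tool), an abstract interface could hide both the cron mechanism and the binary location behind per-platform backends:

```python
# Hypothetical sketch of the "neutral" API idea: a package talks to an
# abstract interface, and each platform supplies a backend that does
# things its own way.
from abc import ABC, abstractmethod

class PlatformBackend(ABC):
    """Abstract interface a package uses instead of touching the OS directly."""

    @abstractmethod
    def add_cron_entry(self, schedule: str, command: str) -> None: ...

    @abstractmethod
    def binary_dir(self) -> str: ...

class DebianBackend(PlatformBackend):
    """Does things 'the Debian way'."""

    def add_cron_entry(self, schedule: str, command: str) -> None:
        # On Debian, a package might drop a file into /etc/cron.d/.
        print(f"would write '{schedule} root {command}' to /etc/cron.d/mypkg")

    def binary_dir(self) -> str:
        return "/usr/bin"

class SolarisBackend(PlatformBackend):
    """Respects decisions already made on the target platform."""

    def add_cron_entry(self, schedule: str, command: str) -> None:
        # A Solaris backend might append to root's crontab instead.
        print(f"would append '{schedule} {command}' to root's crontab")

    def binary_dir(self) -> str:
        return "/opt/bin"
```

The package itself only ever calls `add_cron_entry()` and `binary_dir()`; which backend is plugged in is decided once per platform.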

>I suppose that's difficult, otherwise say a FreeBSD port would
>be easy :) There is such a port in progress, 

I don't just have the port of dpkg/apt to other platforms in mind; I'm
proposing to modify the Debian source format to consistently use some
standard (yet to be defined, but close to current practice) APIs as mentioned
above.  I believe that this makes porting much easier, since you know
exactly what has to be done (i.e., implement the APIs) and you can rely on
getting source packages of a much more narrowly defined format than is
currently the case.  

As far as the current Debian domain is concerned, it should be possible to
use this stricter and more precisely defined source package format to build
ordinary binary Debian packages.  But it also makes it possible to
automatically generate binary BSD packages (installed by a ported
dpkg) that function in an environment with BSD's different file-hierarchy
decisions, init setup, and cron management... 

>and there is of course the Hurd,

I assume that changing just the kernel won't make much difference to
Debian/Linux, or does it?

>apart from the kernel, a "base OS" has its own set of assumptions, say about
>the libraries, fs layout, and such. 

Yes.  Libraries are probably mostly taken care of during compilation, and
doesn't autoconf help a great deal here to make things compile on almost any
platform?  FS layout is different, but I believe that a source package that
adheres to some standard (FHS) could be automatically converted to some
other fs layout:  Before running make (or similar), substitute all /etc/...
with /opt/etc/ in the Makefile, etc...
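The Makefile rewrite could be as simple as a textual path substitution before the build; the mapping table below is an invented example, not any standard layout:

```python
# Minimal sketch of the FS-layout rewrite described above: before running
# make, map FHS paths in the Makefile to the target platform's layout.
# The mapping is an illustrative assumption.
LAYOUT_MAP = {
    "/etc/": "/opt/etc/",
    "/usr/bin/": "/opt/bin/",
}

def rewrite_layout(makefile_text: str) -> str:
    """Return the Makefile text with FHS paths mapped to the target layout."""
    for fhs_path, target_path in LAYOUT_MAP.items():
        makefile_text = makefile_text.replace(fhs_path, target_path)
    return makefile_text
```

A real converter would have to be smarter (it shouldn't touch paths inside strings that happen to look like filenames), but the principle is a once-per-platform table, not per-package work.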

>So even if you follow the GNU directory conventions
>that they devised for Makefiles, you could perhaps handle cross compiling
>for different arch's. But you need to port all the "glue" as well, 

Yes, I call the glue "implementation of the API", and there are possibly
also some tools that have to be created (source-to-binary package
conversion, etc.).  

But this is done once per platform, not once per package or per new version
of a package!  This makes it much more economical from a social point of
view:  If you package only once per package (at the source level) instead of
many times as is done today (Linux (Debian, RPM, ...), other OSs
(Solaris, AIX, HP-UX, ...), infrastructure "packages" (SEPP, Depot,
DebianCluster??)), you have some additional effort per platform/OS but you
save enormously!  This is because there are so many more packages *
versions than platforms/OSs...  
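A back-of-envelope calculation makes the shape of the trade-off concrete; all the figures below are made-up assumptions, chosen only for illustration:

```python
# Illustrative economics with invented numbers: packaging effort per
# (package, version, format) today, vs. once per (package, version) plus
# a one-time per-platform porting cost with universal source packages.
packages = 5000          # assumed number of packages
versions_per_pkg = 3     # assumed versions packaged over time
formats = 6              # deb, rpm, Solaris, AIX, SEPP, Depot, ...
port_cost = 50           # assumed one-time cost of porting the glue, in
                         # units of "one packaging job"

effort_today = packages * versions_per_pkg * formats
effort_universal = packages * versions_per_pkg + formats * port_cost

print(effort_today)      # 90000 packaging jobs
print(effort_universal)  # 15300 packaging jobs
```

Even if the per-platform port cost were ten times higher, the total would barely move, because packages * versions dominates.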

This calculation probably doesn't click if you limit your domain to
a single distribution (Debian).  From that point of view, all you gain
is tons of additional packages from other distributions that you
didn't have the resources to prepare.  But I believe that source packaging
should be seen in the domain of the original author, in the same way as
autoconf.  For me, the distribution domain should start with building
binary packages from universal source packages...  

>to make it
>run on,
>for instance Windows NT! (Okay, that was a bit fancy, I admit)

If someone wants to use Unix-style source packages for easier porting to
NT, that's ok with me--but I wouldn't make any effort for this.  (That's
probably VERY hard... and hopefully NT will just disappear anyways).

>Besides, the centralized config system is an established idea. I have cfengine
>installed here, and I look forward to using it for our new cluster at

From what I've read so far about cfengine, it is a possible basis for
implementation but not a solution.  Or more precisely, it is a solution for
the areas where it has specialized modules (network config, NFS, etc.) but
only a framework for everything else.  For my taste, its editing-oriented
approach to managing configuration files (the ones not supported by
specialized modules) is too procedural.  I prefer a more declarative
approach.  I'm thinking of a "source" configuration file from which a
preprocessor creates the actual machine-specific files.  The preprocessor
can filter out sections of the file that do not apply to the host at hand
and can substitute some variables such as $hostname... 

>extending arch targets for Debian packages beyond the typical would require
>an immense amount of work. 

This depends on your point of view; as reasoned above, in the larger context
this saves an immense amount of effort. 

>And if you discarded the whole deb packaging thing, then
>what's the point of using Debian as infrastructure server ?

I by no means discard deb packaging.  After all, I believe it's the best out
there.  I would like to see the ideas behind Debian source packages evolve
to make them more universally usable.  I don't see any need to change the
tools for binary packages.  I do see a need within Debian to provide tools and
means that work well for clusters, not just single machines.  And that
requires change (we're not there yet!).  With respect to source packaging,
the change that I believe is necessary for supporting cluster solutions
well is basically the same as that for universal source packages.  So
taking the latter idea on board makes the whole operation much more

>Let's see. There are lots of admins who did their own custom hacks or adapted
>others, but there is not a software system that makes the whole process
>more sensible, and *automated*. However, Debian already has some support for
>heterogeneous networks.  That is,  Debian boxes of different arch's are
>pretty compatible. Remember the Debian CLOWN cluster that consisted of
>512 nodes? Was not it heterogenous?

Heterogeneous = multiple processor architectures + multiple base OS. 

>All right, that's why I'm cross posting to debian-beowulf mailing list.
>with the restricted problem, nevertheless is probably the right thing to do.
>room for expansion should be there :) Since it's pretty much scripting, perl,
>whatever, it shouldn't hold us back.  Making use of the experience is
>It would also provide different point of views. While I analyze the thing
>programmer/parallel prog. researcher point of view, other views of the
>provide much better insight to achieve a robust system.

I agree that Beowulf/Mosix people can bring in a lot of interesting
experience and tools (e.g., FAI).  A beowulf is by definition more homogeneous
(e.g., Linux only) and there is often more direct control over the
machines (e.g., it is less likely that some nodes are down during
installation or re-configuration).  Beowulf administration tools often use
push techniques (i.e., execute foo on every node) that are unmanageable in
infrastructures where only pull methods are sustainable.  (NB: cfengine is
a strong proponent of pull methods!)
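The pull model can be sketched abstractly, with the transport (rsync, HTTP, a caching filesystem) stubbed out behind a function; all names here are illustrative assumptions:

```python
# Pull-based configuration: each node checks the central server on its
# own schedule (e.g., from cron), so a node that was down during a change
# simply catches up on its next cycle -- no central "execute on every
# node" step that can fail for unreachable hosts.

def pull_update(local_version: int, fetch_remote):
    """Apply a new config only if the server has a newer version.

    fetch_remote() stands in for the actual transport and returns
    (version, config_text).
    """
    remote_version, config = fetch_remote()
    if remote_version > local_version:
        return remote_version, config   # node applies the new config itself
    return local_version, None          # nothing to do this cycle
```

With push, the server must track which nodes it failed to reach; with pull, the initiative sits with the node, which is what makes it sustainable for loosely connected infrastructures.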

Starting with a restricted problem is the right approach if one doesn't
block the way to future generalization/extension...

>I know about stow, and it does seem to follow FS standards and conventions.
>/usr/local hierarchy right? Not that familiar with the rest of installation
>that you mention.

Let me know if my link page is unreachable and I'll mail it out...

>> * CVSup, SUP, rsync  or alternatively (persistently) caching filesystems
>> such as Coda or Inter-Mezzo (my favorites), or not as cool also AFS, DFS
>> and if you really wanna suffer NFS.
>Coda noted.  But Linux hasn't much support for it; 2.3.42 supports the client

We've just installed Coda on a 2.2.14 kernel to try out how usable/stable
it is.  NFS will make life much more difficult...

>:( But I suppose you could get rsync to work pretty well with it. I mean, at
>it does delta...  and we should have an nfs-root package somewhere...

Most of the individual tools will be easy to get running/packaged on Debian,
and some already exist.  The big question is which approach to choose
(global file system vs. pulling files down over the network, version-controlling
all files or just config files, etc.)

>Could you tell me about these mysterious tools? I seem to recall CluClo, but I
>remember FAI. It's the kind of thing I'll be needing soon. The thing is, is FAI
>based, and does it let you specify options from a remote server...

..see my links page..

>We could at the moment decide which available tools are best fit for such

That's what I've been trying to do; I documented it in a very brief
format at /twiki/bin/view/Main/CodaDebInfra

Sorry, this got somewhat long.


| Bud P. Bruegger, Ph.D.  |  mailto:bud@sistema.it                       |
| Sistema                 |  http://www.sistema.it                       |
| Information Systems     |  voice general: +39-0564-418667              |
| Via U. Bassi, 54        |  voice direct:  +39-0564-418667 (internal 41)|
| 58100 Grosseto          |  fax:           +39-0564-426104              |
| Italy                   |  P.Iva:         01116600535                  |
