
Re: clusters, infrastructures, and package tools

"Bud P. Bruegger" wrote:

> There are solutions out there such as Depot and Sup (see "Bootstrapping an
> Infrastructure" by Steve Traugott and Joel Huddleston, found at
> www.infrastructures.org), or there is SEPP by Tobi Oetiker, and there are
> other approaches (most of them are listed in
> http://www.sistema.it/twiki/bin/view/Main/infrastructures).  But they
> usually have their own "package format" and to install something on an
> infrastructure is the effort of creating a source package plus compiling...

The page didn't load just now, but I'll check it later. But the idea is very
similar to deb source / binary packages, I think.

> As ideal solution that we should attempt in the medium run, a package
> approach would be much easier: I envision a single source package from
> which one can automatically produce binary packages for multiple processor
> architectures and base OS (Debian, Solaris, Aix, Irix, etc.).  A
> specialized (or modified Debian) package tool would install that in the
> right place (Infrastructure management is often based on a global
> filesystem where every architecture has its subtree).  The configuration
> changes that are part of installation (in Debian done by install scripts)
> have to be made compatible with a centralized configuration management
> approach.

But that's hard to implement, because Debian assumes many things while doing
the install. I mean, that's what makes it Debian. The thing is, if you really
want to pull off such a feat, you'd need to *port* the Debian abstractions to
those other base OS's.  I suppose that's difficult, otherwise, say, a FreeBSD
port would be easy :) There is such a port in progress, and there is of course
the Hurd. Apart from the kernel, a "base OS" has its own set of assumptions,
say about the libraries, fs layout, and such. So if you follow the GNU
directory conventions that they devised for Makefiles, you could perhaps handle
cross-compiled stuff for different archs. But you need to port all the "glue"
as well, to make it run on, for instance, Windows NT! (Okay, that was a bit
fancy, I admit)

Besides, the centralized config system is an established idea. I have cfengine
installed here, and I look forward to using it for our new cluster at Bilkent.
Extending arch targets for Debian packages beyond the typical would require
an immense amount of work. And if you discarded the whole deb packaging thing,
then what's the point of using Debian as the infrastructure server?
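
To make the "every architecture has its subtree" idea a bit more concrete, here
is a minimal sketch in Python of how one source package could map to several
binary install locations on a global filesystem. The root /infra, the
<os>-<arch>/<package>-<version> layout, and the function names are all my own
assumptions for illustration; this is not how dpkg, Depot, or SEPP actually
lay things out.

```python
from pathlib import PurePosixPath

# Hypothetical root of the shared (global) filesystem; an assumption.
GLOBAL_ROOT = PurePosixPath("/infra")


def install_prefix(package, version, base_os, arch):
    """Return the subtree where one binary build of a package would live."""
    platform = f"{base_os.lower()}-{arch.lower()}"
    return GLOBAL_ROOT / platform / f"{package}-{version}"


def prefixes_for(package, version, targets):
    """Map each (base_os, arch) target to its own install prefix."""
    return {t: install_prefix(package, version, *t) for t in targets}


if __name__ == "__main__":
    targets = [("debian", "i386"), ("debian", "alpha"), ("solaris", "sparc")]
    for target, prefix in prefixes_for("hello", "1.0", targets).items():
        print(target, "->", prefix)
```

The path is the easy part, of course; the install scripts and the rest of the
"glue" are where the porting work would be.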

> While I believe that some first workable solution can be found that
> requires only small developments and modifications and mostly pulls
> together existing tools, it seems that there is no readily worked out
> solution for the problem.  My impression from the discussion on
> infrastructures.org is that many people have handrolled solutions, that
> there is a lack of collaboration and discussion, and I'm not aware of any
> package based solution...  Since I'm using exclusively Debian in my company
> (and I like it), extending the Debian approach to work in a more
> general/heterogeneous context seems to be a good approach.

Let's see. There are lots of admins who did their own custom hacks or adapted
others', but there is no software system that makes the whole process
more sensible and *automated*. However, Debian already has some support for
heterogeneous networks.  That is, Debian boxes of different archs are usually
pretty compatible. Remember the Debian CLOWN cluster that consisted of
512 nodes? Wasn't it heterogeneous?

> > Err, a heterogeneous cluster is not that different from a lab network or a
> > company's intranet. So we could view it as a cluster software
> > setup/maintenance problem
> Agreed, there are many mini-infrastructures that could greatly benefit from
> an infrastructures approach.  My company has some kind of a
> mini-infrastructure problem (relatively few machines and all Debian).  But
> solving only the restricted problem would cut out a very large number of
> professional sys ops who have to deal with multiple platforms.  That would
> be a pity.  Also, I believe a possible Debian cluster project should take
> the extensive experience in the infrastructures management field into
> account (I'm therefore collecting Links on my site...).

All right, that's why I'm cross-posting to the debian-beowulf mailing list.
Starting with the restricted problem is nevertheless probably the right thing
to do. Room for expansion should be there :) Since it's pretty much scripting,
Perl, whatever, it shouldn't hold us back.  Making use of the experience is
essential.

It would also provide different points of view. While I analyze the thing from
a programmer / parallel programming researcher's point of view, other views of
the problem provide much better insight for achieving a robust system.

> >   Okay, but let's not try to mod dpkg, it's already pretty loaded :)
> > There were a few tools which did part of that. Though a common
> > configuration environment suitable for master/slave roles would be all
> > right, and which does away with problems that stem from different
> > architectures (by being conservative, and managing arch-specific stuff)
> Among the interesting sounding existing tools there are:
> * Depot, SEPP, pgklink, GNU stow and similar tools

I know about stow, and it does seem to follow FS standards and conventions.
The /usr/local hierarchy, right? I'm not that familiar with the rest of the
installation tools that you mention.
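
For those who haven't seen it, the stow approach is basically a symlink farm:
each package lives in its own directory (conventionally under /usr/local/stow)
and its files are symlinked into the shared hierarchy. Below is a rough Python
sketch of that idea, not stow's actual implementation. The function name
link_package is made up, and unlike real stow it links individual files only
and never folds whole directories into a single symlink.

```python
import os
import tempfile


def link_package(stow_dir, package, target_dir):
    """Symlink every file under stow_dir/package into target_dir,
    recreating the package's subdirectories as real directories."""
    pkg_root = os.path.join(stow_dir, package)
    created = []
    for dirpath, _dirnames, filenames in os.walk(pkg_root):
        rel = os.path.relpath(dirpath, pkg_root)
        dest_dir = os.path.normpath(os.path.join(target_dir, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            dest = os.path.join(dest_dir, name)
            if os.path.lexists(dest):
                # Refuse to clobber a file owned by another package.
                raise FileExistsError(dest)
            os.symlink(os.path.join(dirpath, name), dest)
            created.append(dest)
    return created


if __name__ == "__main__":
    # Demo in a throwaway directory standing in for /usr/local.
    with tempfile.TemporaryDirectory() as local:
        stow_dir = os.path.join(local, "stow")
        bin_dir = os.path.join(stow_dir, "hello-1.0", "bin")
        os.makedirs(bin_dir)
        with open(os.path.join(bin_dir, "hello"), "w") as f:
            f.write("#!/bin/sh\necho hello\n")
        print(link_package(stow_dir, "hello-1.0", local))
```

Removing a package is then just a matter of deleting its symlinks, which is
what makes this attractive next to a real package database.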

> * CVSup, SUP, rsync  or alternatively (persistently) caching filesystems
> such as Coda or Inter-Mezzo (my favorites), or not as cool also AFS, DFS
> and if you really wanna suffer NFS.

Coda noted. But Linux doesn't have much support for it; 2.3.42 supports only
the client side :( But I suppose you could get rsync to work pretty well with
it. I mean, at least it does delta...  and we should have an nfs-root package
somewhere...

> * Automatic installation tools such as FAI (Debian-based), or CluClo (comes
> from the Beowulf world), and some others that have not been published but
> would be available (Jon Stearley of UNM has something that he ran on Debian).

Could you tell me about these mysterious tools? I don't seem to recall CluClo,
but I
remember FAI. It's the kind of thing I'll be needing soon. The thing is, is FAI
based, and does it let you specify options from a remote server...

> * GNU cfengine
> and maybe I forgot some important things (see
> http://www.sistema.it/twiki/bin/view/Main/infrastructures for the URLs of
> all these things).
> What is missing is:
> * to figure out how package tools fit in and how cluster
> installation/upgrade can be made easy and quick (for those who want
> this--some insist on packaging and compiling by hand).
> * how to best centrally manage configuration and make the package tools
> interact with centralized config management.

We could at the moment decide which available tools are best fit for such

> > Ooops, the larger a beowulf, the more likely that you'll have different
> > archs. Even "expanding" a beowulf, say by "merging" two homogeneous
> > beowulf clusters, is problematic. (So, it might be cheesy to add support
> > for cluster of clusters)
> The infrastructure has more heterogeneity since it mixes in other OS
> (Solaris, AIX) on top of just multiple CPU architectures--but all under
> Linux and even Debian...
> Merging:  The infrastructures approach as I understand it stores all state
> of the cluster in a central place (usually called "gold server").  From
> there, configuration (installed binaries, config files, etc.) automatically
> propagate to all "client machines":  you add a new machine or replace one
> and a tool such as FAI boots over the network, partitions the disk, and
> installs everything necessary.  Config and changes may come down from the
> gold server via cfengine.  In this context, you can add machines to the
> cluster and they should get their changes automatically.  There is a
> minimum needed to participate in a cluster:
> * boot floppy or boot rom on a virgin machine or
> * cfengine or similar on a machine that is already installed to receive the
> config changes.
> > And probably Progeny Linux will be crying out *loud* for some of the
> > stuff you ask. :) [Or they might find flawless automation]
> I searched for Progeny on the distribution page of lwn.net, didn't find
> anything.  Do you have a URL?
> >   Err, so you want debian clusters to contain non-debian nodes? Why? :)
> > But I think it could be supported. For instance, you could make a server
> > for a lab with both linux and solaris machines.
> Most larger installations (infrastructures) have a wild mixture of machines
> for historical and/or political reasons or because some applications simply
> don't run on Linux.  And there is no choice--even if the managers of these
> infrastructures would love to have only Debian...  I believe the goal for
> Debian should be to become the most infrastructure-friendly Linux
> distribution and possibly to extend the scope of its approach into the
> non-linux domain to bring ease and homogeneity of cluster administration.
> > If deb had good support for such systems, it could gain some popularity
> > in the eyes of managers, considering the cost of such services supplied
> > by proprietary software. Business boffins are gonna love it! ;)
> I would and some boffin friends of mine would too :-).  Apart from that,
> this is not just a matter of cost--the proprietary solutions that may be
> out there may work fine in a homogeneous single vendor environment--but
> AFAIK there are few commercial solutions for heterogeneous infrastructures
> out there.
> This is a turf that is ideal for open source.  Most open source software
> out there is multi-platform (Unix-like systems), so why artificially
> restrict Debian tools to only Linux?  As a matter of fact people already
> port them and the first moves to embrace more of the computing world are
> happening...
> --bud
> /------------------------------------------------------------------------\
> | Bud P. Bruegger, Ph.D.  |  mailto:bud@sistema.it                       |
> | Sistema                 |  http://www.sistema.it                       |
> | Information Systems     |  voice general: +39-0564-418667              |
> | Via U. Bassi, 54        |  voice direct:  +39-0564-418667 (internal 41)|
> | 58100 Grosseto          |  fax:           +39-0564-426104              |
> | Italy                   |  P.Iva:         01116600535                  |
> \------------------------------------------------------------------------/
> --
> To UNSUBSCRIBE, email to debian-devel-request@lists.debian.org
> with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
