Re: Modifying Debian for Infrastructures--Step 1
>> "Bud P. Bruegger" <bud@sistema.it> writes:
> For installing packages on a cluster of machines, we chose to
> install to a globally visible filesystem. There are directory
> subtrees for different versions of the same package and for
> different architectures.
The different architectures I can understand but why the different
versions? In the best case, that's a really quick path leading to
trouble.
> The individual machines use sym-link farms (created with slink or
> stow) to run these packages. In a first step we would like to
> modify source packages such that the installation directories
> become parametrized and choosable at build time as a command line
> option.
Although somewhat desirable, it's not always easily achievable
without major effort. For example, I'm toying with the idea of
packaging Cactus, but it has a build system which is really
convenient for those who want it "working right now" (with whatever
configuration and file layout upstream chose), but it's a real
nightmare in the context of Debian Policy.
> o We believe that using SED or similar to substitute all
> /usr/bin, /etc, /usr/doc/, etc with some kind of a variable
> would be the right approach. This makes it irrelevant
> whether a certain path is coded in Makefile, a Makefile.in,
> a c source file, or whatever else.
Right away I can think of two cases where this approach fails: lam
(which I believe Camm already pointed out) and pvm (or at least that
was the case the last time I took a look at it). The reason is
exactly what you point out:
> o The only caveat we could think off was quoted slashes
> "\/usr\/bin" and path that are constructed from multiple
> variables. The former case could possibly be automated; we
> don't see a solution for the latter.
That's why we still have 'a job': bending some authors' ideas of
correct filesystem layout to the layout dictated by Debian Policy.
Even with fully autoconf/automake packages this is troublesome,
because GNU standards diverge from the FHS in some significant ways
(/etc and /var are perhaps the most notorious).
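
To make the caveat concrete, here is a minimal sketch of a blanket
path substitution and the two kinds of input it misses. It is written
in Python rather than sed only to keep it self-contained; the
placeholder names (@BINDIR@ and friends) and the sample strings are
assumptions made up for illustration, not taken from lam, pvm or any
other real package.

```python
import re

# Hypothetical mapping from hard-coded paths to build-time placeholders.
# The placeholder names are assumptions chosen for this illustration.
SUBSTITUTIONS = {
    "/usr/bin": "@BINDIR@",
    "/usr/doc": "@DOCDIR@",
    "/etc": "@SYSCONFDIR@",
}


def parametrize(text):
    """Replace literal path prefixes with placeholders, longest first."""
    # Longest-first ordering keeps a short prefix from shadowing a longer one.
    paths = sorted(SUBSTITUTIONS, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in paths))
    return pattern.sub(lambda m: SUBSTITUTIONS[m.group(0)], text)


# Case 1: a literal path, which the substitution handles.
plain = "install -m 755 foo /usr/bin/foo"

# Case 2: quoted slashes, as found inside a sed expression in a Makefile.
escaped = r"sed -e 's/@bindir@/\/usr\/bin/'"

# Case 3: a path assembled from several variables.
composed = "prefix = /usr\nbindir = $(prefix)/bin"

print(parametrize(plain))     # the literal path is replaced
print(parametrize(escaped))   # left unchanged: the slashes are escaped
print(parametrize(composed))  # left unchanged: /usr/bin never appears literally
```

The escaped form could be caught by adding a second set of patterns,
but the composed case cannot be found by text matching at all, which
is why those packages need changes to the build system itself.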
> o which are the path that have to be substituted? Where is the
> Debian doc that lists them all?
Look in the debian-policy package; locations are mostly what the FHS
dictates.
Perhaps you could explain exactly what you want to achieve
(distributed computing, diskless workstations, a heterogeneous (from
the hardware POV) cluster, a heterogeneous (from the OS POV) cluster,
???).
Depending on what that is, dpkg --root=/foo might do the trick, in
particular dpkg --root /usr/lib/pckg/<arch>. Keep in mind the
pre/post scripts are run chrooted here, so that means you (might)
need a more or less complete system there. In particular, there will
be a `second' admin dir for dpkg there, so dpkg will complain about
dependencies and such. This approach works if you want to work with
diskless nodes (we are talking clusters here, aren't we?). A second,
somewhat different strategy is to have enough extra ram on the nodes
and load a ramdisk image from a central server. A third approach
(which I like better) is to spend a little extra money on small hard disks
(something in the order of 2 GB being the smallest you find nowadays,
you get them for US$80 or less), and deal with the problem of keeping
the nodes in sync. Some black magic with TFTP, ramdisks, DHCP/BOOTP,
multicast and such is in order for a really effective solution here,
and I'd love to see such a tool available in Debian. Perhaps you'd
like to redirect your efforts in this direction?
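
Going back to the dpkg --root idea, the following is a rough sketch of
how the per-architecture invocation and its separate admin directory
fit together. It only builds the command line and does not run dpkg.
The helper name, the package file name and the "i386" architecture
are hypothetical; the base path is the one from the example above,
and the admin directory location follows dpkg's documented behaviour
of using <root>/var/lib/dpkg when --root is given.

```python
import posixpath


def dpkg_root_command(arch, deb, base="/usr/lib/pckg"):
    """Build (without running) a dpkg call that installs under base/arch.

    Returns the argument list and the admin directory dpkg would use.
    """
    root = posixpath.join(base, arch)
    # With --root, dpkg keeps its status database below the new root,
    # so dependencies are checked against that tree, not the host's.
    admindir = posixpath.join(root, "var/lib/dpkg")
    argv = ["dpkg", "--root=" + root, "--install", deb]
    return argv, admindir


# Hypothetical package file name, used only for illustration.
argv, admindir = dpkg_root_command("i386", "foo_1.0_i386.deb")
print(" ".join(argv))
print(admindir)
```

Because the maintainer scripts run chrooted into that directory, the
tree has to contain whatever those scripts need (a shell at the very
least) before the first package can be configured there.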
Cheers,
Marcelo