
Re: Rambling apt-get ideas




With the raging flame war going on about MUAs, I'm embarrassed to mail this
with Lotus Notes, but, hey, it's all I have at work.

Back on topic, I would have thought that package distribution was a one
time shot.  Caches are for people who would otherwise download the
slashdot.org header graphic fifty times a day.  Whereas each individual
debian machine should only have to download the latest perl .deb once in
its "life".  If I apt-get upgrade through my http cache, all I do is flood
the cache with megs of data I'll never download again.

I'm not sure that the overhead is minimal for fewer than a thousand
clients.  I have 18 or so debian workstations at work.  If it takes 5
minutes to transfer all the .debs to upgrade one machine, then I think it
would take a unicast system slightly less than 18*5 minutes (about 1 1/2
hours) to upgrade, vs 5 minutes for a multicast system to upgrade.  A
unicast upgrade could be a "start it and go to lunch" process whereas a
multicast upgrade would be a "get a cup of coffee" process.  If I had a
hundred machines to upgrade, the difference would be even greater.  Yeah,
wasting 17*5 minutes is not the end of the world, but why not try harder to
do better?
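
To make the arithmetic above concrete, here is a rough back-of-the-envelope
sketch.  It assumes the unicast transfers run one after another (or share one
link, which comes to the same total) and that a multicast transfer costs the
same as a single unicast one.  It ignores packet loss and protocol overhead,
so treat the unicast figure as an upper bound.  The function name is just for
illustration.

```python
def upgrade_minutes(machines: int, minutes_per_machine: float, multicast: bool) -> float:
    """Rough wall-clock estimate for upgrading every machine.

    Assumption: unicast transfers are effectively serialized, while one
    multicast transfer reaches all machines at once.
    """
    if multicast:
        return minutes_per_machine
    return machines * minutes_per_machine


for n in (18, 100):
    unicast = upgrade_minutes(n, 5, multicast=False)
    mcast = upgrade_minutes(n, 5, multicast=True)
    print(f"{n} machines: unicast {unicast:.0f} min, multicast {mcast:.0f} min")
```

For 18 machines that is 90 minutes against 5; for a hundred machines it is 500
minutes against 5.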

The concept of the system I'm discussing is that one "master" machine
downloads the .deb via http.  Then it multicasts the .deb to all the other
machines at once.  All of them are on the same subnet, so some variety of
layer 2 multicast / broadcast would work, although it would be nice to go
beyond the subnet if necessary.
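
A minimal sketch of the sending half of such a "master", assuming plain UDP
over IP multicast.  The group address, port, chunk size and the 8-byte
sequence header are all made up for illustration, and nothing here handles
lost datagrams; a real tool would need the receivers to detect gaps and ask
for retransmission.

```python
import socket
import struct

GROUP = "239.1.2.3"  # hypothetical group in the administratively scoped range
PORT = 5007          # hypothetical port
CHUNK = 1024         # payload bytes per datagram (assumption)


def make_packets(data: bytes, chunk: int = CHUNK) -> list:
    """Split data into datagrams, each prefixed with (sequence, total) as two
    32-bit big-endian integers so a receiver can reorder and spot gaps."""
    total = (len(data) + chunk - 1) // chunk
    return [
        struct.pack("!II", seq, total) + data[seq * chunk:(seq + 1) * chunk]
        for seq in range(total)
    ]


def multicast_file(path: str, group: str = GROUP, port: int = PORT, ttl: int = 1) -> int:
    """Send one file to the multicast group; returns the number of datagrams.

    ttl=1 keeps the traffic on the local subnet; a larger value would be
    needed (plus multicast routing) to go beyond it.
    """
    with open(path, "rb") as f:
        packets = make_packets(f.read())
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, struct.pack("b", ttl))
        for packet in packets:
            sock.sendto(packet, (group, port))
    finally:
        sock.close()
    return len(packets)
```

The master would call multicast_file() once per downloaded .deb, while each
workstation listens on the same group and reassembles the chunks by sequence
number.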

I agree that the discussion about new installs points out that sometimes
"pull" based systems have an advantage.  I'm pointing out that sometimes
"push" based systems have an advantage.  And I'm motivated because I
believe my situation at work is one of those situations where "push" is the
better answer.



                                                                                                                    
From:    Matt Zimmerman <mdz@debian.org>
Sent by: Matt Zimmerman <mdz@alcor.net>
Date:    01/04/2001 01:45 AM
To:      debian-devel@lists.debian.org
cc:      (bcc: Vince Mulhollon/Brookfield/Norlight)
Subject: Re: Rambling apt-get ideas




On Fri, Dec 29, 2000 at 11:11:01PM +0100, mechanix@digibel.org wrote:

> Why not look at this from a different perspective? I don't know if it may
> be useful or not for upgrading machines, but the multicast server would be
> a very nice thing for mass installations.

I still disagree.  Multicast is the wrong solution.  Multicast data is
basically equivalent to a cache with zero object TTL.  Packets (objects) are
stored (by a network device) until a client needs them (immediately), at
which point they are served (multicasted/broadcasted) and expired
(discarded).  Why not replace this with a _real_ cache, which can store
objects for a user-definable period of time, allowing for later operations
to benefit from the cache?

It _might_ be useful to use multicast for a system which would trigger a
bunch of daemons to all download the same data from the cache, but even this
is doubtful.  Unless you are dealing with many thousands of clients, the
overhead for sending individually-addressed packets to the clients is
minimal.

This is definitely a "pull" problem rather than a "push" problem.  Say a
system is being installed in a new location, which has network connectivity,
but no user consoles (yet).  Why should the admin have to find a live
terminal in order to tell the server to initiate a multicast installation?
Why not just have the bootstrap disk fetch the necessary data?

> Imagine large computer rooms at a lan with (usually) uniform hardware. If
> there was a package (say apt-getd) that could be installed on one, already
> running box, which lets you make a special boot disk. The machine that runs
> apt-getd has a way to get to a debian archive (be it local mirror, a set of
> cdroms - this would probably be a bit harder with cd swapping - or a mirror
> on the larger network that it is connected to).

Better to separate automatic system building/configuration (a very hard
problem with relatively little progress) from efficient hierarchical file
distribution (an easier problem, with many good, stable tools already
released).

> You boot with the floppy that configures the network, apt-getd starts
> spawning multicast packets, the workstations pick them up and install them.
> Voilà! You just installed an entire network!

Also consider:

You boot with the floppy that configures the network, start downloading
files over TCP, the workstations install them.
Voilà!  You just installed an entire network!

This approach will also work with heterogeneous hardware, which is an issue
in a majority of enterprises.

--
 - mdz
(See attached file: att5h9ug.dat)
