Re: New CD image creation tool
On Tue, 19 Dec 2000, J.A. Bezemer wrote:
> On Sun, 17 Dec 2000, Richard Atterer wrote:
> > On Sun, Dec 17, 2000 at 01:14:06AM +0100, J.A. Bezemer wrote:
> > > - Maybe 'diff' is the wrong name here, but I can't think of a better one at
> > > the moment.
> > I don't like it, either - hmm... maybe "image template" sounds better?
> > While we're at it, I'll call the human-readable file "location list"
> > from now on, until someone comes up with something better. ;)
> How about the concept of "cooking" a CD image? We have "special ingredients"
> (i.e. the literal binary data), a "recipe" (image template) and a "grocery
> list" of places (i.e. FTP/HTTP sites) where to get the "standard ingredients".
> Should be understandable even to people who didn't ever touch a computer
> before ;-)
I like the naming; it makes sense. It can even be extended to cover
downloading pre-cooked CD images.
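
To make the metaphor concrete, here is a minimal sketch of how the three
pieces could fit together. It is only an illustration, not a proposed
format: the class names, the cook() function, the fetch callback and the
path layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Literal:
    """A "special ingredient": binary data carried in the recipe itself."""
    data: bytes


@dataclass
class StandardFile:
    """A "standard ingredient": a file to be fetched from a mirror."""
    path: str  # path relative to a mirror root (hypothetical layout)


Part = Union[Literal, StandardFile]


def cook(recipe: List[Part], grocery_list: List[str],
         fetch: Callable[[str], bytes]) -> bytes:
    """Assemble an image by walking the recipe in order.

    `fetch` takes a full URL and returns the file contents, or raises
    OSError if that place cannot supply the file.
    """
    image = bytearray()
    for part in recipe:
        if isinstance(part, Literal):
            image += part.data
            continue
        for mirror in grocery_list:
            try:
                image += fetch(mirror.rstrip("/") + "/" + part.path)
                break
            except OSError:
                continue  # try the next place on the grocery list
        else:
            # No place on the grocery list had this ingredient.
            raise FileNotFoundError(part.path)
    return bytes(image)
```

Passing fetch in as a parameter keeps the sketch independent of FTP vs.
HTTP; a real tool would also want to stream to disk rather than build the
whole image in memory.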
> > [Advantages]
> > > > - By querying servers before it starts to download, the tool can
> > > > determine whether all files are actually available.
> > >
> > > You mean, downloading an ls-lR? Or querying for each individual
> > > file? (The latter wouldn't be advantageous since _if_ they exist
> > > we'll be downloading them later anyway.)
> > I was thinking of individual queries, although directory scans are
> > probably better. Why would an initial check not be an advantage? It
> > allows you to abort straight away, instead of downloading, say, a
> > hundred packages before encountering one which doesn't exist.
> I agree that checking would be an advantage, but it should not cost too much.
> ls-lR.gz is quite big (1.3M on http://ftp.us.debian.org/debian/) (and some
> mirrors have US and non-US combined in one ls-lR, others don't); scanning
> directories doesn't work with HTTP servers and with FTP (using package pools)
> it will transfer about the same size as the UNzipped ls-lR (~10M). Checking
> every single file is only easy with HTTP and will still cost ~500 bytes(?) *
> 2000 files/CD = ~1MB per CD.
Latency would be a big issue, especially in the future: bandwidth will
keep increasing, but latency will stay with us. For me, with a good
network connection, checking first and then downloading would probably
take noticeably longer than just downloading (my guess is 30% or so).
It will also put greater load on the mirrors: checking whether a file
exists is almost as hard as delivering the file. But then, perhaps our
mirror is kind of special in being CPU-bound, not bandwidth-bound. :)
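
A rough way to put numbers on this, using the figures quoted above (~500
bytes per check, 2000 files per CD). The 100 ms round-trip time and the
assumption that checks are issued one after another are illustrative
guesses, not measurements:

```python
def precheck_cost(n_files, bytes_per_check=500, rtt_seconds=0.1):
    """Estimate the extra traffic and waiting time of checking every file.

    Assumes one request per file, issued sequentially, so each check
    costs one full round trip. Returns (extra_bytes, extra_seconds).
    """
    extra_bytes = n_files * bytes_per_check
    extra_seconds = n_files * rtt_seconds
    return extra_bytes, extra_seconds


extra_bytes, extra_seconds = precheck_cost(2000)
print(f"{extra_bytes / 1e6:.1f} MB and {extra_seconds:.0f} s of round trips per CD")
```

The traffic comes out at the ~1MB per CD mentioned above; the waiting
time scales directly with whatever round-trip time you plug in, which is
why it does not shrink as bandwidth grows.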
> Furthermore, when using pools checking is not a really big issue since
> everything should (at least _can_) still be available. Don't concentrate on it
> now, you can very well add this "new feature!" later (and I guess making it
> optional and "off-by-default" would be best for most users).
That makes sense: the feature will be useful for some users but harmful
for others, so optional and off by default sounds right.
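
As a sketch of what "optional and off by default" could look like for the
HTTP case: the program name cook-image, the --precheck flag and the
function names below are made up for illustration, and the check simply
sends one HEAD request per file.

```python
import argparse
import urllib.request


def http_exists(url, timeout=10.0):
    """Return True if a HEAD request for `url` succeeds."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except OSError:
        # URLError, HTTPError (e.g. 404) and timeouts are all OSError
        # subclasses; treat any of them as "not available".
        return False


def missing_files(urls, exists=http_exists):
    """Return the URLs for which the existence check fails."""
    return [url for url in urls if not exists(url)]


def build_parser():
    parser = argparse.ArgumentParser(prog="cook-image")  # hypothetical name
    parser.add_argument(
        "--precheck", action="store_true",
        help="verify that every file is available before downloading "
             "(off by default: costs one extra round trip per file)")
    return parser
```

A caller would run missing_files() only when args.precheck is set and
abort if the returned list is non-empty; without the flag nothing extra
is sent to the mirrors.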
For the rest, the recipe approach makes sense. Now all that is left is
finishing a good design and implementation. ;)
/Mattias Wadenstein (admin of ftp.se.debian.org)