
Re: apt PARALLELISM



Scripsit Henrique de Moraes Holschuh <hmh@debian.org>

> 1. We care about a large lot of people a lot more than we care for an
>    individual's downloading speed

> 2. Thus we try to keep the mirror load down, and downloading hundreds of
>    megabytes using multiple connections to multiple sources of the same file
>    is heavily frowned upon.

Of course, trying to download the _same_ file from several different
servers simultaneously would be very wasteful. However, that does not
seem to be what the proposal in this thread is about.

As far as I read the proposal, it is about downloading _different_
files from different mirrors - if you have 25 packages to get for your
'apt-get upgrade' operation, download 5 packages from each of 5
different servers, with one connection to each server active at a
time.

While I cannot see any very common situation where such parallelism
would be an advantage, it is not clear that it would increase the load
on any or all servers.

At least, I cannot see that there would be any ill effects of a
hypothetical pseudo-parallel implementation that downloads 5 packages
from each of the 5 servers, but sequentially, such that only a single
connection to a single server is active at any time. And the only
difference between _that_ and an actual parallel implementation is
that the connections each server experiences are shifted a bit in
time - the number of KB served by each server stays constant.
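
To make that accounting concrete, here is a minimal sketch in Python.
It is not apt code: the function names, the round-robin assignment and
the package sizes are all assumptions made up for illustration. It
only shows that the KB served per mirror depends on which packages are
assigned to it, not on whether the mirrors are contacted one after
another or at the same time.

```python
from collections import defaultdict

def assign_round_robin(packages, mirrors):
    """Hypothetical scheduler: hand out packages to mirrors in turn."""
    plan = defaultdict(list)
    for i, pkg in enumerate(packages):
        plan[mirrors[i % len(mirrors)]].append(pkg)
    return dict(plan)

def kb_served(plan):
    """Total KB each mirror serves; independent of download ordering."""
    return {mirror: sum(size for _name, size in pkgs)
            for mirror, pkgs in plan.items()}

def peak_connections(plan, parallel):
    """Peak simultaneous connections as (opened by client, seen per mirror).

    In both modes each mirror sees at most one connection from this
    client; only the client-side count differs.
    """
    return (len(plan) if parallel else 1, 1)

# Illustrative data: 25 packages with made-up sizes in KB, 5 mirrors.
packages = [(f"pkg{i}", 100 + i) for i in range(25)]
mirrors = [f"mirror{j}" for j in range(5)]

plan = assign_round_robin(packages, mirrors)
load = kb_served(plan)
for mirror in mirrors:
    print(mirror, len(plan[mirror]), "packages,", load[mirror], "KB")
```

Note that kb_served() takes no "parallel" argument at all, which is
the point of the argument above: per-mirror volume is fixed by the
assignment, and parallelism only changes when the bytes are requested.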

Is your point that a server prefers to push bytes through the
connection at a constant rate, and starts wasting resources if the
available bandwidth fluctuates because the last-mile ADSL has to be
shared with a shifting number of parallel downloads from other
servers? But when the bottleneck is closest to the client, enabling
parallel downloads would not make much sense anyway.

(Of course, Goswin has a valid point that some people have their
sources.list deliberately written with a remote, undesirable, server
at the end as a _fallback_ option. Therefore parallelism should at
best be an _option_, not something that apt starts doing unbidden).

> I won't claim this is what happens in Internet-paradise countries, but here
> there are two things that affect download speed the most after you get a
> last-mile link that is not too slow (say, 384kbit/s or more):

I have 768 kb/s at home, and my apt updates through that pipe operate
close to its peak capacity. But they are at least one order of
magnitude slower than from my desk at work (which is just two or three
100+ Mb/s hops away from the national research backbone). Same mirror
in both cases.

From that experience, a last-mile link in the 1 Mb/s range would still
seem to be the limiting factor - and therefore people at the end of
such links would have little use for parallelism in the first place.
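
As a rough back-of-the-envelope model of that reasoning (a deliberate
simplification, with hypothetical function names and made-up server
rates), the client's throughput can be treated as the smaller of the
last-mile capacity and the combined rate the servers can deliver. All
figures are in kbit/s.

```python
def client_throughput(last_mile_kbps, per_server_kbps, n_servers):
    """Simplified model: the slower of the access link and the servers combined."""
    return min(last_mile_kbps, per_server_kbps * n_servers)

def speedup_from_parallelism(last_mile_kbps, per_server_kbps, n_servers):
    """Ratio of parallel to single-server throughput under the model above."""
    single = client_throughput(last_mile_kbps, per_server_kbps, 1)
    parallel = client_throughput(last_mile_kbps, per_server_kbps, n_servers)
    return parallel / single

# A 768 kbit/s line already saturated by one mirror: no gain from 5 mirrors.
print(speedup_from_parallelism(768, 10_000, 5))

# Same line, but each mirror only manages 200 kbit/s: parallelism helps
# until the last mile becomes the limit.
print(speedup_from_parallelism(768, 200, 5))
```

Under this model parallel downloads only pay off when a single server
cannot fill the client's own link, which is not the situation
described above.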

-- 
Henning Makholm                          "What has it got in its pocketses?"


