
Re: apt PARALLELISM



On Mon, 12 Dec 2005, Martijn van Oosterhout wrote:
> 2005/12/12, Henrique de Moraes Holschuh <hmh@debian.org>:
> > We don't want them to open multiple connections even to MULTIPLE
> > servers...
> 
> That's odd though, because apt *does* open connections to multiple servers
> all the time. To fetch packages lists, or if a package is only available on
> one of the servers further down.

Yah.  It is supposed to be the lesser of two evils, I think.  With the new
differential package lists, it will actually be a proper balance between
mirror load and user experience (see apt in experimental).

> Secondly, the amount of data to be downloaded is independent of the time it
> takes, thus, in aggregate, whether apt parallelizes or not won't make any
> difference to the total bandwidth used, although it may shift more load to
> the ftp2 servers since they never get used in normal usage.

It will make a difference for people trying to download at the same time. I
have made this point a number of times already.

Let me make it clear:

1. We care about the large number of users as a whole a lot more than we
   care about any individual's download speed.

2. Thus we try to keep the mirror load down, and downloading hundreds of
   megabytes using multiple connections to multiple sources of the same file
   is heavily frowned upon.

3. If one manages to prove that the best way to achieve (2) is through n
   parallel connections to the same mirror, or through n parallel
   connections to different mirrors, be my guest.

> Finally, how much of these slowdowns reported by people are caused by the
> bandwidth delay product. In that case, two servers will definitely be able to

I won't claim this is what happens in Internet-paradise countries, but here
there are two things that affect download speed the most after you get a
last-mile link that is not too slow (say, 384kbit/s or more):

  TCP/IP (roundtrip, packet loss, windows)
  ISP "backbone" link congestion

Which one is the worst bottleneck changes throughout the day.  When the
cable/ADSL/radio links are unstable, packet loss can cause staggering
slowdowns.  During the busier hours, the backbone limitation becomes
apparent.  During the wee hours of the night on long holidays, TCP/IP is the
one limiting the speed of a single connection.
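
As a rough illustration of the TCP/IP limits above (roundtrip, windows,
packet loss), here is a minimal Python sketch of the usual back-of-envelope
estimates.  It is not a measurement of any mirror.  The 200 ms roundtrip,
the 64 KiB window, the 1460-byte MSS and the 1% loss rate are assumed
values picked for the example; only the 384kbit/s last-mile figure comes
from this message.  The loss estimate uses the well-known Mathis
approximation, which is a simplification of real TCP behaviour.

```python
import math


def window_limited_bps(window_bytes: int, rtt_s: float) -> float:
    """Ceiling for one TCP connection: at most one window per roundtrip."""
    return window_bytes * 8 / rtt_s


def bdp_bytes(link_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill a link."""
    return link_bps * rtt_s / 8


def loss_limited_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Mathis approximation: rate ~ (MSS / RTT) * sqrt(3/2) / sqrt(p)."""
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)


if __name__ == "__main__":
    rtt = 0.200  # assumed roundtrip time in seconds (hypothetical)

    # Ceiling with an unscaled 64 KiB window (assumed window size).
    print("window-limited bit/s:", window_limited_bps(65535, rtt))

    # Window needed to keep a 384kbit/s last-mile link full.
    print("BDP in bytes:", bdp_bytes(384_000, rtt))

    # Ceiling with 1% packet loss (assumed) and a 1460-byte MSS (assumed).
    print("loss-limited bit/s:", loss_limited_bps(1460, rtt, 0.01))
```

Under these assumptions a single connection is capped by whichever of the
two ceilings is lower, and that cap does not depend on how fast the mirror
or the last-mile link is.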

> use more than a single server by itself... I didn't think it common practice
> for large mirrors to configure multi-megabyte windows...

TCP/IP windows?  Or user bw shaping?

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


