
Re: cdn.debian.net as a project service?



On Thu, Mar 10, 2011 at 05:12:08PM +0000, Lars Wirzenius wrote:
> I like the idea of having apt choose the mirror, rather than the admins
> of cdn.debian.net. It lessens the centralization. Also, it would require
> fewer changes from mirror admins to participate.

What do you propose apt should do?

* runtime tests: costly and bogus (see below)
* geolocation: you'd need an external service to tell you your routable IP,
  and after that additional step you're no better off than what cdn can do
* asking the user: sure, but it'd be nice to not have to do that

> > > Package: netselect-apt
> > > Description: speed tester for choosing a fast Debian mirror

Does it actually work for anyone?  I just tried it from machines at four
different locations on 3½ ISPs; two with IPv6+IPv4, two IPv4 only; two with
no NAT; all with full UDP/ICMP connectivity -- getting an empty result every
single time (#238888, #582976).

netselect-apt requires root for some reason, and I have no root access on
any machine outside of northern Poland, so I can't test whether the failure
is caused by a specific mirror.

> > apt-spy is another one. It downloads Packages files from all the
> > mirrors in a region or country and reports the fastest.

Its download test has way too much randomness to be of much use.  I get
results from all over half of Europe; ftp.pl.debian.org won none of three
tests, even though it sits right next to a router early on the path to all
the others.

With the bottleneck being the last mile, that's not that surprising.

Another strange thing about apt-spy is that it picks a subset of mirrors
without an apparent rule.  It claims it uses
http://http.us.debian.org/debian/README.mirrors.txt yet it ignores most (but
not all!) of the non-country-primary ones.

> Packages files are pretty big, and having to download several of them is
> quite a stumbling block. If I'm on a metered connection, I don't want to
> have to download several unnecessary megabytes just to see which mirror
> is fastest.

As apt-spy shows, you'd have to download far more than just a 6MB file to
beat the variance.  At the very least, it approximates network distance
worse than geolocation does.
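
To put the variance point in rough terms, here is a minimal sketch of
checking whether two mirrors can be told apart from a handful of download
tests.  The throughput samples are made up for illustration, not
measurements, and the function name and the "two standard errors" threshold
are my own assumptions rather than anything apt-spy does.

```python
import math
import statistics

def distinguishable(samples_a, samples_b, z=2.0):
    """Return True if the mean throughputs differ by more than z combined
    standard errors.  A crude check, not a proper significance test."""
    mean_a = statistics.mean(samples_a)
    mean_b = statistics.mean(samples_b)
    # Standard error of each mean: sample stdev / sqrt(n).
    se_a = statistics.stdev(samples_a) / math.sqrt(len(samples_a))
    se_b = statistics.stdev(samples_b) / math.sqrt(len(samples_b))
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) > z * combined

# Hypothetical throughput samples in KB/s, three runs per mirror.
nearby = [900.0, 400.0, 1100.0]
distant = [1000.0, 700.0, 500.0]
print(distinguishable(nearby, distant))
```

With scatter like this, three runs per mirror say nothing about which one
is faster; you would need many more (or much longer) downloads before the
difference in means stands out from the noise.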
 
> Ping times, of course, are not a particularly good way to pick a mirror,
> either.

Yeah, they seem to be more reliable than a short download test though, even
if for the wrong reason.

> On the other hand, we don't necessarily have to be able to pick the
> optimal mirror. It's probably good enough to pick a good one.

Right, with bottlenecks usually being at the last mile, all good mirrors
will have nearly the same speed.  Picking any from the general area should
be fine.
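
A rough sketch of what "any from the general area" could mean, assuming you
already have approximate coordinates for the client and for each mirror.
The mirror names, coordinates, and radius below are hypothetical, and this
is not a description of how cdn.debian.net actually selects mirrors.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def mirrors_in_area(client, mirrors, radius_km):
    """Return names of mirrors within radius_km of client, nearest first."""
    lat, lon = client
    dist = [(haversine_km(lat, lon, mlat, mlon), name)
            for name, (mlat, mlon) in mirrors.items()]
    return [name for d, name in sorted(dist) if d <= radius_km]

# Hypothetical mirrors with assumed (latitude, longitude) pairs.
MIRRORS = {
    "mirror-pl.example": (52.2, 21.0),
    "mirror-de.example": (51.0, 13.7),
    "mirror-us.example": (39.0, -77.5),
}
client = (54.4, 18.6)  # assumed client location in northern Poland
print(mirrors_in_area(client, MIRRORS, 1000))
```

Any mirror the function returns would count as "good enough" by the
argument above; the ordering among them matters much less than excluding
the ones on another continent.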


> Also, picking the mirror should be just one function (or class or module
> or whatever) in apt, so everything else could be implemented independent
> of the choice of heuristic to choose a mirror.
> 
> Would someone be willing to mentor a GSoC project for this? Or do it
> themselves?

The current state is:
* manual selection, or
* geolocation (cdn.debian.net)

None of the proposed alternatives is any better than these two, and apt-spy
is significantly worse.  Thus, I'd say there are many better uses for GSoC
folks.

Blessing cdn.debian.net as the default choice seems to be the best solution
to me.  Geographical proximity is not the same as network proximity but
approximates it well enough.

-- 
1KB		// Microsoft corollary to Hanlon's razor:
		//	Never attribute to stupidity what can be
		//	adequately explained by malice.

