Re: cdn.debian.net as a project service?
- To: debian-devel@lists.debian.org
- Subject: Re: cdn.debian.net as a project service?
- From: Raphael Geissert <geissert@debian.org>
- Date: Sun, 26 Feb 2012 15:37:37 -0600
- Message-id: <jie8n6$2l8$1@dough.gmane.org>
- References: <1299747535.3524.12.camel@havelock.lan> <20110310092229.GA7584@anguilla.noreply.org> <20110310103637.GH18778@ftbfs.de> <im4ant$8d2$1@dough.gmane.org> <20110321143856.GA8521@localhost>
[Yay for replying nearly a year later. Sorry about that, I completely missed
your email.]
Michael Vogt wrote:
> On Sun, Mar 20, 2011 at 01:35:23AM -0600, Raphael Geissert wrote:
>> P.S. apt also provides a "mirror" method (just like http, ftp, etc) but I
>> consider it to be suboptimal and a poor way to tackle the problem.
>
> Why exactly do you feel this way? What problems do you see with it?
First of all, I dislike the idea of introducing another method just to
provide some features that can already be implemented on top of http.
Introducing another method requires people to migrate to it. There's
no backward compatibility, and there's no way for people to benefit from it
without changing their settings.
Additionally, there are some issues that would need to be taken into
consideration:
* Mirrors go offline
* They become out of date
* They get re-routed
* Not all of them are synced in a safe way
* Not all of them are synced at the same time
* etc
They are dynamic.
All those issues are either temporary or their effects are. By using a
static list that is only updated at apt-get update time, users may be
affected by one or more of the above issues and an apt-get update may not
even solve it.
I have recently spent some more time working on my proposed solution (and on
general issues of the mirrors network), to the point of setting up a test
instance at http://http.debian.net/
Given that it is fairly easy to reuse the code of the redirector to instead
print a list of mirrors, I've added support for the mirror:// method, as
implemented in experimental[1].
[1] it is required so that it knows what architectures the user needs.
Although it is possible for me to drop that requirement, I haven't spent
time on it.
You may give it a try by using:
mirror://http.debian.net/debian.list
> I see some benefits in the mirror method like that the mirror list is
> cached locally (on apt-get update) so the mirror service is hit less
> often. Also if the central service is down apt can still use the
> cached local info. Plus it supports automatic retry if a mirror fails.
Those features are certainly interesting. However, I'd like to see them in
the http method so that the solutions are backwards compatible.
For instance, the redirector supports listing alternative locations of a
requested file by sending a Link header as specified in RFC6249 (minus the
hash sum part). It is currently disabled on the live instance because: a)
nobody uses it, and b) it breaks apt, because Apache concatenates the Link
headers instead of sending them individually, which hits apt's maximum
header length limit.
You may however see them when sending a HEAD request.
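To illustrate what a client would have to do with such a reply, here is a
minimal Python sketch (not apt code) that splits a concatenated Link header
into its individual entries and keeps the ones marked rel=duplicate, as
RFC6249 describes. The function names and the mirror host names in the usage
note are hypothetical, and the parsing is deliberately simplistic (it assumes
no commas or semicolons inside parameter values).

```python
import re

# One link-value: <URI> followed by zero or more ";name=value" parameters.
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')


def parse_link_header(value):
    """Split a (possibly concatenated) Link header into (url, params) pairs."""
    links = []
    for url, raw_params in _LINK_VALUE.findall(value):
        params = {}
        for part in raw_params.split(';'):
            part = part.strip()
            if not part:
                continue
            name, _, val = part.partition('=')
            params[name.strip().lower()] = val.strip().strip('"')
        links.append((url, params))
    return links


def duplicates(value):
    """Return the URLs marked rel=duplicate, lowest "pri" value first."""
    found = [(int(params.get('pri', 999999)), url)
             for url, params in parse_link_header(value)
             if params.get('rel') == 'duplicate']
    return [url for _, url in sorted(found)]
```

For example, duplicates('<http://ftp.de.debian.org/debian/dists/sid/InRelease>;
rel=duplicate; pri=2, <http://ftp.fr.debian.org/debian/dists/sid/InRelease>;
rel=duplicate; pri=1') would list the "fr" URL before the "de" one.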
As for avoiding the extra round-trip time to the redirector for every
request, I have the following proposal. I must say I personally would prefer
not having to do any of this and I consider it rather unnecessary.
* APT sends the first request to a $host with the following headers:
X-APT-Arch: $architectures (space separated, for example)
X-APT-Supports: base
Requesting, for instance, /dists/sid/InRelease
* If $host (which might be the redirector) recognises the headers, it may
reply with:
X-APT-Base-Repository: http://ftp.xx.debian.org/debian /dists/sid/InRelease
Meaning that APT may issue requests directly to ftp.xx.debian.org, as long
as the architectures it listed in the first request are the only ones
needed. The URL needs to be split because the http method is agnostic as to
how the URLs are built, and to rewrite the following requests it would
need to strip everything that matches up to the first white space.
* If $host doesn't recognise them, or simply omits any special header, APT
will no longer send X-APT headers for that host, as long as the http process
remains alive.
That way, the round trip to the redirector is avoided, but only for the
lifetime of the http process. There would still be a window of time during
which the mirror may fail, causing the locally-rewritten requests to fail,
but it would be smaller than when using mirror://.
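The exchange described above could be sketched on the client side roughly as
follows. This is Python rather than apt's C++, and only one possible reading
of the proposal: the class AptHostState and its methods are hypothetical, and
I am assuming the part after the white space is the suffix of the requested
URL, so that whatever precedes it is the prefix to replace.

```python
class AptHostState:
    """Per-host state kept for the lifetime of one http method process.

    Hypothetical helper, not part of apt.
    """

    def __init__(self, architectures):
        self.architectures = architectures
        self.send_x_apt = True   # cleared once the host ignores our headers
        self.old_base = None     # URL prefix that was requested originally
        self.new_base = None     # prefix announced by X-APT-Base-Repository

    def request_headers(self):
        """Extra headers to add to a request for this host."""
        if not self.send_x_apt:
            return {}
        return {'X-APT-Arch': ' '.join(self.architectures),
                'X-APT-Supports': 'base'}

    def handle_response(self, requested_url, response_headers):
        """Record the announced base, or stop sending X-APT headers."""
        headers = {k.lower(): v for k, v in response_headers.items()}
        value = headers.get('x-apt-base-repository', '')
        new_base, _, path = value.partition(' ')
        if not new_base or not path or not requested_url.endswith(path):
            # Header missing or unusable: do not ask this host again.
            self.send_x_apt = False
            return
        self.old_base = requested_url[:-len(path)]
        self.new_base = new_base

    def rewrite(self, url):
        """Point a later request directly at the announced mirror."""
        if self.old_base and url.startswith(self.old_base + '/'):
            return self.new_base + url[len(self.old_base):]
        return url
```

With the reply from the example above, a request for
http://http.debian.net/debian/pool/... would then be sent to
http://ftp.xx.debian.org/debian/pool/... instead, until the process exits.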
The redirector could send more than one X-APT-Base-Repository header for APT
to gather a list of alternative mirrors to choose from. Then further
requests could be randomly sent to different mirrors.
Etc.
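As a rough illustration of the multiple-header idea, the following Python
sketch collects every X-APT-Base-Repository value from a reply and picks one
base at random per request. The function names are hypothetical, and uniform
random choice is just the simplest policy; nothing here is implemented in apt
or in the redirector.

```python
import random


def collect_bases(headers):
    """Gather announced bases from a list of (name, value) header pairs."""
    bases = []
    for name, value in headers:
        if name.lower() != 'x-apt-base-repository':
            continue
        base, _, path = value.partition(' ')
        if base and path:  # ignore values that lack the "base path" split
            bases.append(base)
    return bases


def pick_base(bases, rng=random):
    """Choose one mirror base at random, or None if none were announced."""
    return rng.choice(bases) if bases else None
```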
Before going into that kind of extension, I'd prefer if people actually
tried the redirector. I personally haven't seen any significant delay.
Cheers,
--
Raphael Geissert - Debian Developer
www.debian.org - get.debian.net