
Re: cdn.debian.net as a project service?



[Yay for replying nearly a year later. Sorry about that, I completely missed 
your email.]

Michael Vogt wrote:
> On Sun, Mar 20, 2011 at 01:35:23AM -0600, Raphael Geissert wrote:
>> P.S. apt also provides a "mirror" method (just like http, ftp, etc) but I
>> consider it to be suboptimal and a poor way to tackle the problem.
> 
> Why exactly do you feel this way? What problems do you see with it?

First of all, I dislike the idea of introducing another method just to 
provide some features that can already be implemented on top of http.
Introducing another method requires people to migrate to it. There's 
no backward compatibility, and there's no way for people to benefit from it 
without changing their settings.

Additionally, there are some issues that would need to be taken into 
consideration:
* Mirrors go offline
* They become out of date
* They get re-routed
* Not all of them are synced in a safe way
* Not all of them are synced at the same time
* etc

They are dynamic.

All those issues are either temporary or their effects are. With a 
static list that is only refreshed at apt-get update time, users may be 
affected by one or more of the above issues, and another apt-get update may 
not even solve them.

I have recently spent some more time working on my proposed solution (and on 
general issues of the mirrors network), to the point of setting up a test 
instance at http://http.debian.net/

Given that it is fairly easy to reuse the code of the redirector to instead 
print a list of mirrors, I've added support for the mirror:// method, as 
implemented in experimental[1].
[1] The version in experimental is required so that the service knows which 
architectures the user needs. Although it is possible for me to drop that 
requirement, I haven't spent time on it.

You may give it a try by using:
mirror://http.debian.net/debian.list

> I see some benefits in the mirror method like that the mirror list is
> cached locally (on apt-get update) so the mirror service is hit less
> often. Also if the central service is down apt can still use the
> cached local info. Plus it supports automatic retry if a mirror fails.

Those features are certainly interesting. However, I'd like to see them in 
the http method so that the solutions are backwards compatible.

For instance, the redirector supports listing alternative locations of a 
requested file by sending a Link header as specified in RFC6249 (minus the 
hash sum part). It is currently disabled on the live instance because: a) 
nobody uses it, and b) it breaks apt: apache concatenates the Link headers 
into a single one instead of sending them individually, and the result 
exceeds apt's maximum header length.
You may however see them when sending a HEAD request.
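
To illustrate what a client would have to do with such a concatenated 
header, here is a minimal sketch in Python. It is not apt's code; the 
function name and the mirror host names are made up for the example, and the 
parser assumes simple parameters without quoted commas. It keeps only the 
rel=duplicate entries and orders them by "pri", which I understand RFC6249 
to define as lower value = more preferred.

```python
import re

# One "<url>; param; param" entry; entries are separated by commas when
# the server folds several Link headers into one.
_LINK_RE = re.compile(r"<([^>]*)>([^<]*)")


def duplicate_links(header_value):
    """Return the rel=duplicate URLs from a Link header value, best first."""
    mirrors = []
    for url, params in _LINK_RE.findall(header_value):
        attrs = {}
        for part in params.split(";"):
            part = part.strip().rstrip(",").strip()
            if "=" in part:
                key, value = part.split("=", 1)
                attrs[key.strip().lower()] = value.strip().strip('"')
        if attrs.get("rel") == "duplicate":
            # Entries without "pri" sort last (assumed default).
            mirrors.append((int(attrs.get("pri", "999999")), url))
    return [url for _, url in sorted(mirrors)]


# Illustrative value only, as a server might send it after concatenation.
value = (
    "<http://ftp.de.debian.org/debian/dists/sid/InRelease>; rel=duplicate; pri=2, "
    "<http://ftp.fr.debian.org/debian/dists/sid/InRelease>; rel=duplicate; pri=1"
)
print(duplicate_links(value))
```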

As for avoiding the extra round-trip time to the redirector for every 
request, I have the following proposal. I must say I personally would prefer 
not having to do any of this, and I consider it rather unnecessary.

* APT sends the first request to a $host with the following headers:
X-APT-Arch: $architectures (space separated, for example)
X-APT-Supports: base

Requesting, for instance, /dists/sid/InRelease
* If $host (which might be the redirector) recognises the headers, it may 
reply with:
X-APT-Base-Repository: http://ftp.xx.debian.org/debian /dists/sid/InRelease

Meaning that APT may issue requests directly to ftp.xx.debian.org, as long 
as the architectures it listed in the first request are the only ones 
needed. The value needs to be split in two because the http method is 
agnostic as to how the URLs are built; to rewrite the following requests it 
would need to strip everything that matches up to the first white space.

* If $host doesn't recognise them, or simply omits any special header, APT 
will no longer send X-APT headers for that host, as long as the http process 
remains alive.
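
The exchange above could be handled on the client side roughly as follows. 
This is only a sketch of one reading of the proposal, not apt code: the 
function names are hypothetical, and so are the /debian prefix on 
http.debian.net and the package path used in the example. The idea is that 
the part after the white space tells the client which tail of the requested 
URL to strip in order to find the prefix it may replace.

```python
from typing import Optional, Tuple


def parse_base_repository(value: str) -> Tuple[str, str]:
    """Split an X-APT-Base-Repository value at the first white space."""
    parts = value.split(None, 1)
    if len(parts) != 2:
        raise ValueError("expected '<base URL> <path>'")
    return parts[0], parts[1].strip()


def derive_rewrite(requested_url: str, header_value: str) -> Optional[Tuple[str, str]]:
    """Return (old_prefix, new_prefix), or None if the header does not fit."""
    new_base, suffix = parse_base_repository(header_value)
    if not requested_url.endswith(suffix):
        return None  # header does not describe this request; ignore it
    return requested_url[: -len(suffix)], new_base


def rewrite(url: str, rule: Optional[Tuple[str, str]]) -> str:
    """Send later requests straight to the mirror when the prefix matches."""
    if rule is None:
        return url
    old_prefix, new_prefix = rule
    if url.startswith(old_prefix + "/"):
        return new_prefix + url[len(old_prefix):]
    return url


rule = derive_rewrite(
    "http://http.debian.net/debian/dists/sid/InRelease",
    "http://ftp.xx.debian.org/debian /dists/sid/InRelease",
)
print(rewrite("http://http.debian.net/debian/pool/main/a/apt/apt.deb", rule))
```

The rule would only be kept in memory, so it disappears together with the 
http process, as described above.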


That way, the round trip to the redirector is avoided, but only for the 
lifetime of the http process. There would still be a window of time 
during which the mirror may fail, causing the locally-rewritten requests to 
fail, but it would be smaller than when using mirror://.

The redirector could send more than one X-APT-Base-Repository header for APT 
to gather a list of alternative mirrors to choose from. Then further 
requests could be randomly sent to different mirrors.
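
A sketch of that selection, assuming the client keeps the advertised bases 
in a list and remembers which ones failed during the current run (the 
function name and the second and third host names are made up for the 
example):

```python
import random


def pick_mirror(bases, failed=frozenset(), rng=random):
    """Pick a random base that has not failed yet, or None if none is left."""
    candidates = [base for base in bases if base not in failed]
    if not candidates:
        return None  # caller would fall back to the original $host
    return rng.choice(candidates)


bases = [
    "http://ftp.xx.debian.org/debian",
    "http://ftp.yy.debian.org/debian",
    "http://ftp.zz.debian.org/debian",
]
print(pick_mirror(bases, failed={"http://ftp.yy.debian.org/debian"}))
```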

Etc.

Before going into such extensions, I'd prefer if people actually 
tried the redirector. I personally haven't seen any significant delay.

Cheers,
-- 
Raphael Geissert - Debian Developer
www.debian.org - get.debian.net

