
Re: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical "wish"



On Mon, Dec 17, 2007 at 12:07:04AM +0100, Josip Rodin wrote:
> FWIW, the last reading is:
> * villa 5.33 MB/s
> * lobos 4.92 MB/s
> * steffani 14.58 MB/s

The original was:
> * villa 4.29 MB/s
> * lobos 3.91 MB/s
> * steffani 14.86 MB/s

Interesting that it got somewhat more balanced.

For reference, the formulas (with v, l and s being the villa, lobos and
steffani readings) are:

	A     = 6l - 3v
	B     = s - 2l + v
	C + F = 3v - 3l
	D = E = 0

Which gives:

	A     = 13.53 MB/s    (up from 10.59 MB/s)
	B     = 10.07 MB/s    (down from 11.33 MB/s)
	C + F =  1.23 MB/s    (up from 1.14 MB/s)
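
As a quick check of the arithmetic, here is a minimal Python sketch that
plugs both sets of readings into the formulas above. The function and
variable names are made up for illustration; only the formulas and the
measurements come from this thread.

```python
def class_bandwidth(v, l, s):
    """Apply the formulas above; v, l, s are the villa, lobos and
    steffani readings in MB/s."""
    a = 6 * l - 3 * v          # A
    b = s - 2 * l + v          # B
    c_plus_f = 3 * v - 3 * l   # C + F
    return a, b, c_plus_f

# Latest and original readings quoted earlier in this mail.
latest = class_bandwidth(v=5.33, l=4.92, s=14.58)
original = class_bandwidth(v=4.29, l=3.91, s=14.86)

for label, values in (("latest", latest), ("original", original)):
    print(label, ["%.2f" % x for x in values])
```

Note that A + B + (C + F) simplifies to v + l + s, so the three classes
always add up to the total measured bandwidth.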

A is "strict round-robin", so we expect it to increase at the expense
of the other classes the more people use glibc 2.7-4. The increase in
C+F is either indicative of natural variations in source addresses using
security.d.o, or some other mechanism we haven't considered.

> Anyway, in light of all this, please comment again on those old conclusions:

I think I'll leave that 'til the current glibc makes it into testing, and
we see if the change in usage is along the lines of what we'd expect. 

It'd be really helpful if we could get some logs from the above hosts on
what IPs are accessing each host. Just the first byte of the IP address,
and a number of connections (or bandwidth usage) would be enough.
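
Something along these lines would be enough to produce that summary. It
is only a sketch, and it assumes a log format where the client IPv4
address is the first whitespace-separated field on each line (as in the
usual Apache access logs); the actual logs on those hosts may well look
different. The sample addresses are documentation ranges, not real data.

```python
from collections import Counter

def first_octet_counts(lines):
    """Count connections per first byte of the client IPv4 address.

    Assumes the address is the first field of each log line.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        octet = fields[0].split(".")[0]
        if octet.isdigit() and 0 <= int(octet) <= 255:
            counts[int(octet)] += 1
    return counts

# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.1 - - "GET /dists/etch/updates/Release HTTP/1.1" 200',
    '192.0.2.9 - - "GET /dists/etch/updates/Release HTTP/1.1" 200',
    '198.51.100.7 - - "GET /dists/etch/updates/Release HTTP/1.1" 304',
]
counts = first_octet_counts(sample)
for octet in sorted(counts):
    print(octet, counts[octet])
```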

If it were possible, (temporarily) adding a security.d.o mirror in the
0.0.0.0 - 127.255.255.255 range would be helpful -- that'd give the breakdown
as:

A: 000.000.000.000-127.255.255.255: new-host
B: 128.000.000.000-191.255.255.255: steffani
C: 192.000.000.000-255.255.255.255: villa and/or lobos
D: [hosts not using rule9]:         RR: new-host, steffani, villa, lobos

which would provide a really good cross-check that this is behaving
as we'd expect. In that case, based on the numbers above, the relative
loads would be expected to be:

	new-host:     13.53 MB/s    (up from 0.0 MB/s)
	steffani:     10.07 MB/s    (down from 14.58 MB/s)
	villa+lobos:   1.23 MB/s    (down from 10.25 MB/s combined)

(The above allocates bandwidth due to machines not using rule9 to new-host,
instead of distributing it amongst all four hosts.)
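
To spell out the A/B/C breakdown above, a small Python sketch that maps
a client address to the host it would be expected to hit. It only
encodes the three address ranges from the table and says nothing about
clients not using rule9 (row D); the function name is invented for this
example.

```python
import ipaddress

def expected_host(ip):
    """Return the expected mirror for a rule9 client, per the table above."""
    first = ipaddress.IPv4Address(ip).packed[0]  # first byte of the address
    if first <= 127:
        return "new-host"            # A: 0.0.0.0 - 127.255.255.255
    if first <= 191:
        return "steffani"            # B: 128.0.0.0 - 191.255.255.255
    return "villa and/or lobos"      # C: 192.0.0.0 - 255.255.255.255

for addr in ("10.0.0.1", "128.0.0.1", "212.1.1.1"):
    print(addr, "->", expected_host(addr))
```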

Obviously finding a host that can deal with 13.53 MB/s of sustained
traffic with a useful IP address to temporarily test this behaviour
might be difficult. :)

> >   - the understanding of the issue we've got so far implies that this
> >     would only cause fairly minor load balancing problems for the current
> >     Debian hosts
> This disparity doesn't classify as a minor load balancing problem when we
> see one "third" of a rotation doing more than twice as much as other two
> "thirds".

Well, that's expected in some cases; the real refutation is that (assuming
we're not missing some major influence in how security.d.o hosts are
chosen) the distribution of active IPs is decidedly non-uniform --
C+D+E+F above accounts for 25% of the address space, but only 4.95%
of the bandwidth. Compare that to B which also accounts for 25% of the
address space, but accounts for 40.56% of the bandwidth.
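
Those percentages can be reproduced from the numbers earlier in this
mail with a few lines of Python. This is just the arithmetic, with the
address-space shares taken from the 0-127 / 128-191 / 192-255 split;
the dictionary names are illustrative.

```python
# Bandwidth per class in MB/s, from the latest reading above.
bandwidth = {"A": 13.53, "B": 10.07, "C+F": 1.23}
# Share of the IPv4 address space per class, in percent.
address_space = {"A": 50.0, "B": 25.0, "C+F": 25.0}

total = sum(bandwidth.values())
shares = {k: round(100 * v / total, 2) for k, v in bandwidth.items()}

for k in bandwidth:
    print("%-3s  %5.1f%% of addresses  %6.2f%% of bandwidth"
          % (k, address_space[k], shares[k]))
```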

As it happens, A, B and C-F correspond to the old class A, B and C
addressing regimes (ie /8, /16 and /24 allocations). Well, F includes the
old multicast and reserved address classes too. Does it make sense that
in today's internet, 55% of the traffic comes from old class A addresses,
40% from old class B addresses and just 5% from old class C addresses?

Comparing that to the xkcd internet map [0], A is the left half, B is the
bottom right, and C is the top right. steffani is in the 128 block marked
"various registrars", just to the bottom right of the center of the map,
and lobos and villa are in the 212 "Europe" block just above and to the
right of the center of the map.

  [0] http://xkcd.com/195/

> They are functioning now, but the higher the probability that we'll burden
> some sites with excess traffic, [...]

Shouldn't we be doing the same thing OFTC's done anyway, and dynamically
change our DNS responses based on the current capacity of the various
hosts in the rotation? Then we can automatically deal with temporarily
dead hosts, or give hosts a controlled amount of traffic (x% of requests,
rather than just an equal share, eg). I know Ryan (and maybe Jason?) had
been wanting to do this for ages, but it required coding that hadn't
been done.  If OFTC has deployed code that we could just duplicate,
that sounds like the thing to do -- for ftp.d.o as well as ftp.us and
security.d.o, for that matter.
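
To make the "x% of requests" idea concrete, here is a rough Python
sketch of weighted selection. This is not OFTC's code (I haven't seen
it), and it isn't a DNS server; it only shows the selection step, with
made-up weights, where a weight of 0 takes a host out of the rotation.

```python
import random

def pick_hosts(weights, k=1, rng=random):
    """Pick k hosts at random, in proportion to their weights.

    weights maps host name -> non-negative weight; 0 means "skip this host".
    """
    live = {host: w for host, w in weights.items() if w > 0}
    if not live:
        raise ValueError("no host with a positive weight")
    return rng.choices(list(live), weights=list(live.values()), k=k)

# Hypothetical weights: steffani gets about 60% of the answers, villa
# about 40%, and lobos is treated as temporarily dead.
weights = {"steffani": 60, "villa": 40, "lobos": 0}
answers = pick_hosts(weights, k=1000, rng=random.Random(0))
print({host: answers.count(host) for host in weights})
```

The weights would presumably be fed from whatever monitoring reports
each host's current capacity, which is the part that needs actual
coding.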

Cheers,
aj
