
Re: Processed: destruction of round-robin functionality is fucking up our mirrors and making Debian suck for many people, hence fixing this is a release-critical "wish"



(Please Cc: any responses.)

On Mon, Dec 17, 2007 at 03:10:24PM +1000, Anthony Towns wrote:
> Interesting that it got somewhat more balanced.

It looks like an effect of the weekend ending - more machines in the
respective netblocks waking up? I checked again a few moments ago,
and the last day's statistics show that steffani is getting some 55% of
the traffic.

> It'd be really helpful if we could get some logs from the above hosts on
> what IPs are accessing each host. Just the first byte of the IP address,
> and a number of connections (or bandwidth usage) would be enough.

I've asked DSA for server-status already, and mentioned the logs too,
we'll see (they haven't replied yet).
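For whoever ends up pulling those numbers, a minimal sketch of the tally
Anthony describes (first byte of the client IP, plus a connection count).
It assumes an access log where the client address is the first
whitespace-separated field, as in the common Apache layout; the function
name and the sample lines (documentation-range addresses) are made up for
illustration, not real traffic.

```python
from collections import Counter

def tally_first_octet(lines):
    """Count connections per first octet of the client IPv4 address.

    Assumption: the client address is the first field on each line.
    Lines without a dotted IPv4 address in that position are skipped.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        octet = fields[0].split(".")[0]
        if octet.isdigit() and 0 <= int(octet) <= 255:
            counts[int(octet)] += 1
    return counts

# Hypothetical sample input, not taken from any real log.
sample = [
    '192.0.2.10 - - [placeholder] "GET /dists/ HTTP/1.1" 200 512',
    '192.0.2.77 - - [placeholder] "GET /dists/ HTTP/1.1" 200 512',
    '198.51.100.4 - - [placeholder] "GET /pool/ HTTP/1.1" 200 2048',
    '',
]

for octet, hits in sorted(tally_first_octet(sample).items()):
    print(f"{octet:3d}.x.x.x  {hits} connections")
```

Only the first octet leaves the machine, which keeps the output small and
avoids publishing full client addresses.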

> If it were possible, (temporarily) adding a security.d.o mirror in the
> 0.0.0.0 - 127.255.255.255 range would be helpful [...]
> Obviously finding a host that can deal with 13.53 MB/s of sustained
> traffic with a useful IP address to temporarily test this behaviour
> might be difficult. :)

Quite.

> > >   - the understanding of the issue we've got so far implies that this
> > >     would only cause fairly minor load balancing problems for the current
> > >     Debian hosts
> > This disparity doesn't classify as a minor load balancing problem when we
> > see one "third" of a rotation doing more than twice as much as other two
> > "thirds".
> 
> Well, that's expected in some cases; the real refutation is that (assuming
> we're not missing some major influence in how security.d.o hosts are
> chosen) the distribution of active IPs is decidedly non-uniform --
> C+D+E+F above accounts for 25% of the address space, but only 4.95%
> of the bandwidth. Compare that to B which also accounts for 25% of the
> address space, but accounts for 40.56% of the bandwidth.

I don't think anyone was ever seriously expecting these distributions to
match - applying Hanlon's razor, I suspect that nobody ever stopped to
consider the issue fully.

So real-world deployment served as the ground for regression testing... <sigh>
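To put the quoted figures side by side, a small sketch of the arithmetic.
The only inputs are the percentages from Anthony's message; the
`load_factor` helper and the idea of dividing bandwidth share by
address-space share are just one illustrative way to express the skew.

```python
# Shares quoted above, in percent: (address space, bandwidth).
shares = {
    "B": (25.0, 40.56),
    "C+D+E+F": (25.0, 4.95),
}

def load_factor(address_pct, bandwidth_pct):
    """Bandwidth share divided by address-space share (1.0 = proportional)."""
    return bandwidth_pct / address_pct

factors = {name: load_factor(*pair) for name, pair in shares.items()}
skew = factors["B"] / factors["C+D+E+F"]

for name, factor in factors.items():
    print(f"{name}: {factor:.2f}x its address-space share")
print(f"B carries about {skew:.1f} times the bandwidth of C+D+E+F")
```

That is roughly 1.6 against 0.2, i.e. about an eightfold difference
between two blocks of equal size.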

> > They are functioning now, but the higher the probability that we'll burden
> > some sites with excess traffic, [...]
> 
> Shouldn't we [...] dynamically change our DNS responses based on the
> current capacity of the various hosts in the rotation?

Yes, but having our hand forced by a wacky glibc change doesn't help much
in the way of motivating people to do better things. Which brings up the
question - what's the guarantee that someone won't change the resolver
to undermine whichever new scheme we come up with? :)

> If OFTC has deployed code that we could just duplicate, that sounds like
> the thing to do -- for ftp.d.o as well as ftp.us and security.d.o, for
> that matter.

We actually have a weighted round-robin in place for ftp.jp; OFTC's
load-balancing is based on WHOIS country data, IIRC; we could benefit from
both features. But that's off-topic here...
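For readers unfamiliar with the term, a minimal sketch of what a weighted
rotation does. This is not the ftp.jp setup or OFTC's code; it is a
generic "smooth" weighted round-robin, and the host names and weights are
hypothetical stand-ins for per-host capacity.

```python
def weighted_rotation(weights, n):
    """Return n picks using smooth weighted round-robin.

    weights maps a host name to a positive integer weight. Over any
    window of sum(weights) picks, each host appears exactly as often
    as its weight, and heavier hosts are spread out, not bunched.
    """
    current = {host: 0 for host in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every host earns credit proportional to its weight...
        for host, weight in weights.items():
            current[host] += weight
        # ...and the host with the most credit answers this query.
        best = max(current, key=current.get)
        current[best] -= total
        picks.append(best)
    return picks

# Hypothetical capacities: mirror-a can take twice the load of the others.
capacity = {"mirror-a": 2, "mirror-b": 1, "mirror-c": 1}
print(weighted_rotation(capacity, 8))
```

Adjusting the weights from current host capacity would give the dynamic
behaviour suggested above, at least as long as clients honour the order
or selection the DNS server hands out.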

-- 
     2. That which causes joy or happiness.

