Re: help wanted, standing up mirroring sync proxies on public cloud



Hi Julien

On Thu, Mar 17, 2022 at 10:31:33PM +0100, Julien Cristau wrote:
> > You are just talking about the authenticated rsync and push stuff right
> > now?  Because mirror-isc.d.o for example does more.
> I figured we'd start there, yes.  Moving static mirrors around seems a lot
> easier.

You are right.  Also it's something unique to Debian.

> > Do you intend to make the syncproxy setup a bit more failover friendly?
> > So you can kill one and make another take up the work.
> I'm not sure.  Some of that is a bit constrained by things like
> downstream firewalls.  I'd be interested though if you have suggestions
> of things we could do.

I think an elegant solution would be something like:
- split ssh callout from data storage
- use load balancing over the nearest backend (this however is pretty
  special stuff)
- allow clients to connect to every data storage with the allowed
  archive type (main, security, debug, ports)

We would have the following components:
- one ssh callout instance, or maybe a pair.  All clients would get ssh
  connections from these IPs, and so would all data storage systems.
- a bunch of data storage instances, spaced out as needed.  They can
  each contain a different set of archives, or just one archive each
  with more instances
- some proximity load balancing front (AWS and Azure can do that via
  DNS, AWS and GCE via global anycast load balancers)
- freshness checks on the data storages.  this will allow storages to
  remove themselves from the load balancers after missing updates for
  too long
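
To make the freshness check a bit more concrete, here is a minimal
sketch of what a storage could expose to the load balancer's health
probe.  Everything in it is an assumption for illustration: the
12 hour threshold, the function names, and the idea that each storage
records the time of its last successful sync somewhere it can read
back.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold; the real value depends on the archive's update cadence.
MAX_AGE = timedelta(hours=12)


def is_fresh(last_sync, now, max_age=MAX_AGE):
    """Return True if the last successful sync is recent enough."""
    return now - last_sync <= max_age


def health_status(last_sync, now, max_age=MAX_AGE):
    """Map freshness to an HTTP-style status code for a health probe.

    200 keeps the storage in rotation; 503 makes the load balancer
    drop it until it has caught up again.
    """
    return 200 if is_fresh(last_sync, now, max_age) else 503
```

A storage that answers 503 here effectively removes itself from the
load balancer without anyone having to intervene.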

Advantages:
- ssh connections will always come from the same set of IPs, regardless
  of where the client connects for rsync
- the load balancing will allow us to shift traffic between backends
- we can still do manual traffic steering

Disadvantages:
- clients can no longer firewall outgoing connections to the storages,
  depending on the actual implementation
- the ssh callout instances need to orchestrate and wait for all data
  storage nodes to be in sync before calling clients
- only works within one cloud
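
The orchestration point could look roughly like the sketch below: the
callout instance polls every storage node until all of them report at
least the serial of the new push, and only then triggers the clients.
The names (wait_for_sync, get_serial), the notion of a monotonically
increasing per-archive serial, and the timeout values are all
hypothetical, not something that exists today.

```python
import time


def wait_for_sync(nodes, target_serial, get_serial,
                  timeout=600.0, interval=5.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Block until every node has reached target_serial, or time out.

    get_serial(node) is a caller-supplied function returning the serial
    a storage node currently serves.  Returns the set of nodes still
    lagging; an empty set means it is safe to call out to clients.
    """
    deadline = clock() + timeout
    pending = set(nodes)
    while True:
        # Only re-check nodes that had not caught up on the last round.
        pending = {n for n in pending if get_serial(n) < target_serial}
        if not pending:
            return set()
        if clock() >= deadline:
            return pending
        sleep(interval)
```

If the result is non-empty, the callout instance has to decide whether
to go ahead anyway and rely on the freshness checks to take the
lagging storages out of rotation, or to hold back the callout.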

Regards,
Bastian

-- 
Is truth not truth for all?
		-- Natira, "For the World is Hollow and I have Touched
		   the Sky", stardate 5476.4.