
Re: Planning for a mirror using Google Cloud CDN



Hi folks

Another follow up on this matter.

I set up a test environment in GCE, which can be used to do reliability
and throughput tests.  It consists of two n1-standard-2 instances in the
europe-west1 region for the main archive and two for the security
archive.  Each instance should be able to sustain 180 MB/s reading from
disk and 4 Gbit/s network egress to the load balancer.

It can be reached via
http://gce-mirror.const-cast.de/debian
http://gce-mirror.const-cast.de/debian-security

At the same time I did a similar test setup on AWS.  This is meant as a
possible replacement for klecker.d.o as a backend for CloudFront.  It
uses m4.xlarge instances in the eu-central-1 region.  Each should be
able to provide 1 Gbit/s network egress to the load balancer.

It can be reached via
http://aws-mirror.const-cast.de/debian
http://aws-mirror.const-cast.de/debian-security

Neither environment receives push updates; both are synced on a schedule
four times a day.

I would be glad if you could give them a spin and report back any
unexpected problems.
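
If you want to script such a check, a rough sketch like this might help
(Python 3, stdlib only; the Release file is just a standard small test
object, add something larger from pool/ for a meaningful throughput
number).  This is only an illustration, not part of the setup itself:

    #!/usr/bin/env python3
    # Rough probe: fetch a file from each test mirror and print a crude
    # throughput number.

    import time
    import urllib.request

    MIRRORS = [
        "http://gce-mirror.const-cast.de/debian",
        "http://aws-mirror.const-cast.de/debian",
    ]
    PATHS = ["/dists/stable/Release"]   # add a bigger file for real numbers

    for mirror in MIRRORS:
        for path in PATHS:
            start = time.monotonic()
            with urllib.request.urlopen(mirror + path, timeout=30) as resp:
                data = resp.read()
            elapsed = time.monotonic() - start
            print("%s: %d bytes in %.2fs (%.2f MB/s)"
                  % (mirror + path, len(data), elapsed,
                     len(data) / elapsed / 1e6))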

On Thu, Feb 16, 2017 at 09:57:34AM +0100, Bastian Blank wrote:
> On Sun, Nov 20, 2016 at 03:47:33PM +0100, Bastian Blank wrote:
> > We got the ok from Google to use their Cloud CDN as a public mirror.
> > There is one technical limitation in the implementation left, which
> > needs to be fixed first, but I'm confident they will be able to do that.
> Sadly there is no news on this problem.  I'll keep in touch with Google
> about it.

The limitation was relaxed a bit, and Google Cloud CDN can now cache
files up to 10 MB in size.  So all of the metadata fits, but not, for
example, kernel packages.
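
As a rough way to see what falls on which side of that limit, something
like the sketch below compares Content-Length against the threshold via
HEAD requests.  The paths are just examples, and I'm assuming binary
megabytes for the limit:

    #!/usr/bin/env python3
    # Rough check: compare the Content-Length of a few objects against
    # the 10 MB cache limit.  The exact accounting Cloud CDN uses is not
    # known to me; binary megabytes are assumed here.

    import urllib.request

    MIRROR = "http://gce-mirror.const-cast.de/debian"
    CACHE_LIMIT = 10 * 1024 * 1024

    for path in ["/dists/stable/Release",
                 "/dists/stable/main/binary-amd64/Packages.xz"]:
        req = urllib.request.Request(MIRROR + path, method="HEAD")
        with urllib.request.urlopen(req, timeout=30) as resp:
            size = int(resp.headers.get("Content-Length", 0))
        verdict = "cacheable" if size <= CACHE_LIMIT else "too large to cache"
        print("%s: %d bytes (%s)" % (path, size, verdict))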

> I played with our sync infrastructure a bit and found out that we can
> make the synchronization easier by one small change to the archive:
> create by-hash hierarchies for older distributions.  We already got the
> go ahead from the relevant teams and just wait for ftp-master to
> implement this change.  For now I'm down to about 1 minute of
> synchronization lag for the main archive between locations.  I'll
> consider that acceptable.

This change was implemented in dak.  I now need to do some modifications
to ftpsync to make better use of it.
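
For context, by-hash means every index listed in the Release file is
also available under a content-addressed name in the same directory, so
mirrors and clients never race against a half-finished update.  A tiny
sketch of the path mapping (the digest in the example is made up):

    #!/usr/bin/env python3
    # Sketch of the by-hash path mapping: each index listed in a Release
    # file is also available under a hash-addressed name in the same
    # directory, which is what makes mirroring race-free.

    import posixpath

    def by_hash_path(index_path, sha256):
        # 'main/binary-amd64/Packages.xz'
        #   -> 'main/binary-amd64/by-hash/SHA256/<hash>'
        directory, _name = posixpath.split(index_path)
        return posixpath.join(directory, "by-hash", "SHA256", sha256)

    # The digest below is made up, purely for illustration.
    print(by_hash_path("main/binary-amd64/Packages.xz", "0123abcd" * 8))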

> The setup is pretty automated right now.  GCE can't create instances
> with extra disks or from snapshots (for groups), so the system setup
> needs an additional step of adding the data disk, which I do with
> Ansible.

I decided to model this environment with Terraform.[1][2]  Terraform is
far from good software; it feels like ancient Puppet: it neither provides
control structures nor can it properly infer the real state.  Also, given
its status as hipster software written in Go with hundreds of
dependencies, it is not in Debian.  However, it supports almost
everything we need for this setup.

I would have liked to use Puppet for that, but the modules for such
cloud stuff are almost all outdated and unmaintained.

> I intend to integrate the mirrors into the debian mirror team managed
> set of mirrors as the next step.

Ansible and dynamic inventories work really well, and they can drive
these cloud mirrors without much trouble.
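
For those not familiar with dynamic inventories: they are just
executables that print JSON when Ansible calls them.  A stripped-down
sketch (the host names are placeholders; a real inventory would query
the cloud APIs instead):

    #!/usr/bin/env python3
    # Stripped-down dynamic inventory: Ansible runs the script with
    # --list and expects JSON.  A real inventory would query the GCE/EC2
    # APIs; a static host list with placeholder names stands in here.

    import json
    import sys

    INVENTORY = {
        "cloud_mirrors": {
            "hosts": ["gce-mirror-backend-1", "aws-mirror-backend-1"],
        },
        "_meta": {"hostvars": {}},
    }

    if __name__ == "__main__":
        if len(sys.argv) > 1 and sys.argv[1] == "--host":
            # Host variables are already given via _meta, so answer {}.
            print(json.dumps({}))
        else:
            print(json.dumps(INVENTORY))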

Regards,
Bastian

[1]: https://www.terraform.io/
[2]: https://gitlab.com/waldi/debian-mirror-cloud-terraform
-- 
	"Beauty is transitory."
	"Beauty survives."
		-- Spock and Kirk, "That Which Survives", stardate unknown

