Re: Debian archive distribution via CloudFront CDN

On 11/05/2013 4:52 PM, Anders Ingemann wrote:
This is awesome! Great work!
Does pulling data from CF via an EC2 instance incur traffic costs?

Great question - and I promise this is not a set-up! Yes, but Amazon's got it covered.

However, the charging pattern for this would be:

1) for the traffic from the Edge locations to the end customer, based upon where that Edge is in the world
  - volume charge
  - charge per request against CloudFront: a few cents per 10,000 HTTP requests. Details are here: http://aws.amazon.com/cloudfront/pricing/

2) for the EC2 instance
  - volume charge for the egress traffic from the instance to the Edge locations (but that's then cached for a configurable period, and it's only for specific files within /debian/dists/*, which is relatively small)
  - per-hour charge for the EC2 host; this can be optimized with Reserved Instance pricing
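
As a rough illustration of how the charges above add up, here is a minimal sketch in Python. The function name, its parameters and every rate in the usage lines are hypothetical placeholders, not actual AWS prices; the real figures are on the pricing page linked above.

```python
def estimate_monthly_cost(edge_gb, requests, origin_gb, instance_hours, *,
                          edge_rate_per_gb, rate_per_10k_requests,
                          origin_rate_per_gb, instance_rate_per_hour):
    """Add up the four charges listed above. All rates are caller-supplied."""
    costs = {
        # 1) Edge location -> end customer: volume charge
        "edge_volume": edge_gb * edge_rate_per_gb,
        # 1) per-request charge against CloudFront, billed per 10,000 requests
        "edge_requests": requests / 10_000 * rate_per_10k_requests,
        # 2) EC2 instance -> Edge locations: egress volume charge
        "origin_egress": origin_gb * origin_rate_per_gb,
        # 2) per-hour charge for the EC2 host
        "instance_hours": instance_hours * instance_rate_per_hour,
    }
    costs["total"] = sum(costs.values())
    return costs


# Placeholder numbers only, chosen to show the shape of the calculation.
example = estimate_monthly_cost(
    edge_gb=100, requests=1_000_000, origin_gb=5, instance_hours=720,
    edge_rate_per_gb=0.10, rate_per_10k_requests=0.01,
    origin_rate_per_gb=0.10, instance_rate_per_hour=0.05,
)
for item, amount in example.items():
    print(f"{item}: {amount:.2f}")
```

The Edge volume rate varies by region, so a closer estimate would call this once per region and sum the results.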

If you find any stale files through CloudFront, please let me know. The default cache time for everything in /debian/dists/ is currently 5 minutes; I may increase this slowly to 15. Overriding this are the following LocationMatch lines:

60 seconds for: ^/debian/dists/.*\.diff/(Index)?
60 seconds for: ^/debian/dists/.*/i18n/(Index|Translation-.*)?
60 seconds for: ^/debian/dists/.*/(binary-.*|source)/(Packages(\..*)?|Sources(\..*)?|Release)?

So in the case of the files under http://cloudfront.debian.net/debian/dists/testing/main/binary-amd64/Packages.diff/:

* The directory index should match and be refreshed after 60 seconds
* The "Index" file should match and be refreshed after 60 seconds
* The individual time-stamped files fall back to the default 5 minutes for everything in /debian/dists/ (but these could be indefinitely cached in theory)

The default cache expiry time for paths in /debian/project/ is 60 seconds. And lastly, for the main pools: 24 hours. I'll continue to tune this. There are 40 Edge locations currently live for CloudFront globally.
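
For anyone who wants to check which cache lifetime a given path should get, the rules above can be sketched as a small Python lookup. This is only an approximation under stated assumptions: the three regexes are copied verbatim from the LocationMatch lines and applied as an unanchored search, the function name cache_ttl is made up for illustration, and /debian/pool/ is assumed to be the path meant by "the main pools". How the live server resolves overlapping sections is not modeled, so results for paths not discussed in this message may differ from what CloudFront actually sees.

```python
import re

# The three override patterns from the message; each maps to 60 seconds.
OVERRIDES_60S = [
    re.compile(r"^/debian/dists/.*\.diff/(Index)?"),
    re.compile(r"^/debian/dists/.*/i18n/(Index|Translation-.*)?"),
    re.compile(r"^/debian/dists/.*/(binary-.*|source)/"
               r"(Packages(\..*)?|Sources(\..*)?|Release)?"),
]


def cache_ttl(path):
    """Return the expected cache lifetime in seconds, or None if unknown."""
    if path.startswith("/debian/dists/"):
        # Assumption: any matching override wins over the dists default.
        if any(p.search(path) for p in OVERRIDES_60S):
            return 60
        return 5 * 60            # current default for /debian/dists/
    if path.startswith("/debian/project/"):
        return 60
    if path.startswith("/debian/pool/"):   # assumed location of the pools
        return 24 * 60 * 60
    return None                  # not covered by the rules in this message


for p in (
    "/debian/dists/testing/main/binary-amd64/Packages.diff/Index",
    "/debian/dists/testing/InRelease",
    "/debian/pool/main/h/hello/hello_2.8-4_amd64.deb",
):
    print(p, cache_ttl(p))
```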


