
Re: Debian snapshot.debian.org replica on the Amazon cloud?

On 19/11/2013 6:23 PM, Peter Palfrader wrote:
> Seems part of that request was forgotten - while we had installed s3cmd since mid October, we didn't export all the Debian accounts to sibelius yet. Fixed that now. Cheers,

Thanks Peter,

All looking good. I have configured my user account on there with credentials to talk to S3, and I have three threads currently transferring from the 00/ folder. S3 is not browsable (no auto-index), but since the Postgres database knows the file names, you can already start seeing files in S3, e.g.:
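As a sketch of what that direct access looks like (bucket name and object key below are assumptions, not the real values): with no auto-index, a client needs the file name from the database and then builds a plain path-style S3 URL.

```shell
# Hypothetical bucket and key - the real names come from the Postgres DB.
BUCKET="snapshot.debian.org"
KEY="00/ab/cdef0123456789/somefile.deb"

# Path-style S3 URL; no listing needed, just a GET on the known key.
URL="https://s3.amazonaws.com/${BUCKET}/${KEY}"
echo "$URL"
```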


If we get a DNS entry under debian.org that is a CNAME to S3, then we can use:
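One wrinkle worth noting: for CNAME-based access, S3 requires the bucket to be named exactly the same as the hostname. A sketch, with a made-up hostname (the real name is whatever DSA would delegate):

```shell
# Assumed hostname; the bucket must carry the identical name for the
# CNAME trick to work. The DNS record would look like:
#   snapshot-s3.debian.org.  IN  CNAME  snapshot-s3.debian.org.s3.amazonaws.com.
HOST="snapshot-s3.debian.org"
KEY="00/ab/cdef0123456789/somefile.deb"   # hypothetical key from the DB

URL="http://${HOST}/${KEY}"
echo "$URL"
```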


I've set a 4-day 'archive to Glacier' policy on this bucket. All objects we transfer in will move from standard S3 (live) to archive storage four days after being ingested. This is tunable, but we want older files (less likely to be recalled) in the cheaper storage tier. On demand we can initiate a pullback from archive of those files (3-5 hours to complete), which brings a copy of the file back into "live S3" (using the Reduced Redundancy storage tier for this duplicate live copy) for a number of days.
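For reference, the 4-day transition rule and the on-demand recall can be expressed roughly as follows with the aws CLI. This is a sketch under assumptions - bucket name, rule ID, and key are invented, and the policy was actually set through the console/tooling of the day, not necessarily this JSON:

```shell
# Lifecycle rule: transition every object to Glacier 4 days after ingest.
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "archive-after-4-days",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        { "Days": 4, "StorageClass": "GLACIER" }
      ]
    }
  ]
}
EOF

# Applying it needs credentials, so it is commented out in this sketch:
# aws s3api put-bucket-lifecycle-configuration \
#     --bucket snapshot.debian.org \
#     --lifecycle-configuration file://lifecycle.json

# Recalling an archived object (3-5 hours; live copy kept for 7 days here):
# aws s3api restore-object --bucket snapshot.debian.org \
#     --key 00/ab/cdef0123456789/somefile.deb \
#     --restore-request '{"Days": 7}'
```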

So here are my thoughts on a few more steps in our experimentation here:
* The initial sync will take some days. I don't want to flatten anyone's (Welcome's) uplink with this, so I'll keep it to a low number of threads.
* While this first sync is happening, we can look at the Postgres database. AWS announced last week that its managed database environment, RDS (Relational Database Service), now supports Postgres as its 4th engine - with PG 9.3.1. I have brought up one of these instances in US-East ready for this database (snapshot-prod.cjaijq7ayn5u.us-east-1.rds.amazonaws.com:5432). The security group is currently closed to the outside world - which host is holding the front-end web app that uses this database? I'm happy to give the master credentials to someone if they want to squirt the PG database into here. I've configured it for multi-AZ (replication to an alternate datacentre) and a daily snapshot of the data with a 21-day moving window. This could go live as the database for this app now, with the farm of files still where they are now.
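Loading the existing database into that RDS instance could look something like the sketch below. The endpoint is the one quoted above; the database name, user, and the single-stream dump/restore approach are assumptions, and the commands are commented out since they need credentials and an opened security group:

```shell
# RDS endpoint from the mail; dbname/user are assumptions.
RDS_HOST="snapshot-prod.cjaijq7ayn5u.us-east-1.rds.amazonaws.com"
RDS_PORT=5432

# From the current database host, once its IP is allowed in the RDS
# security group, dump and restore in one stream:
# pg_dump --no-owner snapshot | \
#     psql "host=$RDS_HOST port=$RDS_PORT dbname=snapshot user=master"
echo "target: ${RDS_HOST}:${RDS_PORT}"
```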

Thoughts? Is this OK for people? I don't want to be treading on anyone's toes here. Happy to give anyone credentials to push into this S3 bucket - just ping me.


--
/Mobile:/ +61 422 166 708, /Email:/ james_AT_rcpt.to
