
Re: Debian snapshot.debian.org replica on the Amazon cloud?



On Fri, 2013-09-27 at 14:30 -0400, James Bromberger wrote:

> Couple of questions (while I am waiting to come home) - the underlying 
> data set - is it a massive symlink farm for the files?

The underlying data is a hash-addressed (SHA-1) filesystem overlaid on
a conventional filesystem (ext3 etc.). The first four hex characters of
the hash, two per level, are used to create two sublevels of directories
to avoid filesystem directory limitations. So an empty file would be
stored here:

/srv/snapshot.debian.org/data/da/39/da39a3ee5e6b4b0d3255bfef95601890afd80709

The filenames and other metadata are stored in the PostgreSQL database.
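
As an illustration only (this is not the actual snapshot code), the
on-disk layout described above could be sketched in Python roughly like
this. The names DATA_ROOT and storage_path are made up for the example,
and it assumes the hash is taken over the raw file contents:

```python
import hashlib
import posixpath

# Assumption: the data root matches the example path in this mail.
DATA_ROOT = "/srv/snapshot.debian.org/data"


def storage_path(content: bytes, root: str = DATA_ROOT) -> str:
    """Return the hash-addressed path for a blob (hypothetical helper)."""
    digest = hashlib.sha1(content).hexdigest()
    # Two directory levels, each named after two hex characters of the
    # digest, followed by the full digest as the file name.
    return posixpath.join(root, digest[0:2], digest[2:4], digest)


if __name__ == "__main__":
    # An empty file maps to the path quoted above.
    print(storage_path(b""))
```

For empty input this gives the da/39/da39a3ee... path shown above.
Mapping that path back to a filename would still need a lookup in the
database.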

> What's the current growth rate?

Some information about that is here (dsa-guest, no password):

https://munin.debian.org/debian.org/sibelius.debian.org/df_abs.html

> Are there any individual files greater than 5 TB?

I think the largest file ever in the Debian archive has probably been
only 1-2 GB; I doubt we will ever get files anywhere near 5 TB.

> Where is the data currently?

On stabile (hardware issues) and sibelius:

https://db.debian.org/machines.cgi?host=sibelius
https://db.debian.org/machines.cgi?host=stabile

> Can we get it loaded onto a set of HDDs for shipping into AWS, or
> would you want to sync that all online over a period?

I am not part of the Debian sysadmin team but I guess that could be a
possibility. I think this was done with the initial two-machine setup.

> Just want to work out what would be required. I see the costs would be 
> around US$1600/month for this (hosted in the US) - around US$20k/year, 
> so I just need to convince the company of this and get approval.

Wow, I didn't expect it would cost that much.

> Would we be able to get an AWS logo in the footer, and acknowledgment on 
> the front page, etc? I need to build a justification for something this big.

The sponsors of the current system are acknowledged on the front page of
the site and in the machines list (linked above). I expect if Amazon
were to sponsor a replica we would add them there too. I'm not sure
about a logo in the footer though; that would be up to the folks who
run it.

http://snapshot.debian.org/

In terms of justification, snapshot.d.o is an essential tool for
developers (bug bisection etc) and users (testing etc). In addition it
is essential for two Debian projects: reproducible builds and the Debian
derivatives patch generation.

https://wiki.debian.org/ReproducibleBuilds
http://dex.alioth.debian.org/census/patches/

-- 
bye,
pabs

http://wiki.debian.org/PaulWise
