On Fri, 2013-09-27 at 14:30 -0400, James Bromberger wrote:

> Couple of questions (while I am waiting to come home) - the underlying
> data set - is it a massive symlink farm for the files?

The underlying data is a hash-addressed (SHA-1) filesystem overlaid on
a conventional filesystem (ext3 etc.). The first two bytes of the hash
(the first four hex characters) are used to create two sublevels of
directories to avoid filesystem directory limitations. So an empty file
would be stored here:

/srv/snapshot.debian.org/data/da/39/da39a3ee5e6b4b0d3255bfef95601890afd80709

The filenames and other metadata are stored in the PostgreSQL database.

> Whats the current growth rate?

Some information about that is here (dsa-guest, no password):

https://munin.debian.org/debian.org/sibelius.debian.org/df_abs.html

> Are there any individual files greater than 5 TB?

I think the largest file ever in the Debian archive has probably been
only 1-2 GB; I doubt we will ever get files anywhere near 5 TB.

> Where is the data currently?

On stabile (hardware issues) and sibelius:

https://db.debian.org/machines.cgi?host=sibelius
https://db.debian.org/machines.cgi?host=stabile

> Can we get it loaded onto a set of HDDs for shipping into AWS, or
> would you want to sync that all online over a period?

I am not part of the Debian sysadmin team but I guess that could be a
possibility. I think this was done with the initial two-machine setup.

> Just want to work out what would be required. I see the costs would be
> around US$1600/month for this (hosted in the US) - around US$20k/year,
> so I just need to convince the company of this and get approval.

Wow, I didn't expect it would cost that much.

> Would we be able to get an AWS logo in the footer, and acknowledgment on
> the front page, etc? I need to build a justification for something this big.

The sponsors of the current system are acknowledged on the front page
of the site and in the machines list (linked above). I expect that if
Amazon were to sponsor a replica we would add them there too. I'm not
sure about a logo in the footer though; that would be up to the folks
who run it.

http://snapshot.debian.org/

In terms of justification, snapshot.d.o is an essential tool for
developers (bug bisection etc.) and users (testing etc.). In addition,
it is essential for two Debian projects: reproducible builds and the
Debian derivatives patch generation.

https://wiki.debian.org/ReproducibleBuilds
http://dex.alioth.debian.org/census/patches/

-- 
bye,
pabs

http://wiki.debian.org/PaulWise
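
For illustration, the hash-addressed layout described in the message can be sketched roughly as follows. This is only a minimal sketch and not the actual snapshot.debian.org code: the names `DATA_ROOT`, `blob_path` and `store_blob` are hypothetical, the root directory is simply taken from the example path above, and the PostgreSQL side (which maps filenames and other metadata to hashes) is not modelled at all.

```python
import hashlib
from pathlib import Path, PurePath, PurePosixPath

# Assumption: root directory copied from the example path in the message.
DATA_ROOT = PurePosixPath("/srv/snapshot.debian.org/data")


def blob_path(content: bytes, root: PurePath = DATA_ROOT) -> PurePath:
    """Return where a blob with this content would live under root.

    The SHA-1 hex digest is the file name; its first two bytes (hex
    characters 0-1 and 2-3) name the two directory sublevels.
    """
    digest = hashlib.sha1(content).hexdigest()
    return root / digest[0:2] / digest[2:4] / digest


def store_blob(content: bytes, root: Path) -> Path:
    """Hypothetical helper: write content under root, once per unique hash."""
    target = Path(blob_path(content, root))
    if not target.exists():
        # Identical content always maps to the same path, so a second
        # copy of the same file costs no extra space.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
    return target


# The empty file from the example in the message:
print(blob_path(b""))
# /srv/snapshot.debian.org/data/da/39/da39a3ee5e6b4b0d3255bfef95601890afd80709
```

Because the path depends only on the content, a file that appears unchanged in many archive snapshots would be stored once under this scheme, with the per-snapshot filenames kept separately as metadata.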