Bug#1020217: snapshot.debian.org: write a generic file driver supporting multiple backend (such as object-storage)
Hi there!
I'm hereby cc-ing our DPL, to get him involved in a possible storage 
cluster purchase for the project.
For years, I have been suggesting such an object storage driver, so 
that we could use OpenStack Swift for snapshot.d.o. I am happy that it 
is finally gaining traction, and that Lucas is implementing this. 
Thanks Lucas.
However, it is disappointing to see it moving toward an S3 
implementation, which is a protocol from a closed-source service. I 
have already written multiple times that my company (Infomaniak) is 
willing to sponsor storage space on Swift for it.
FYI, we currently manage more than 110 PB of storage across 7500+ HDDs, 
and growing, so I am not scared at all about storage space. Some 
clusters we manage are around 40 PB, with billions of files.
That said, I do not envision *any* sponsor providing the storage space, 
but rather Debian maintaining its own storage cluster. To give you a 
rough idea of what this would represent, let me describe the type of 
hardware involved, and its pricing.
I would currently recommend this type of 2U server:
https://www.aicipc.com/en/productdetail/51224
Each one holds 24 HDDs, plus 2x SSD for the system. Equipped with a 
decent amount of RAM (128 GB) and a CPU, the cost is around 4000 EUR per 
server without the HDDs. Currently, 22 TB Seagate HDDs are at around 350 
EUR apiece, so such a server fully equipped with HDDs would be at 
around 12000 EUR. If we want 6 of them (which is IMO the bare minimum 
for redundancy, as each file is stored 3 times), we're talking about 
around 75000 EUR. Add 3 smaller servers to act as auth servers (i.e. 
Keystone) at, let's say, 4000 EUR each (which is the average price for a 
decent server with 128 GB of RAM, 2x SSD for the system, and a 32-core 
CPU), and we would end up spending around 90 kEUR for such a storage 
cluster. This would provide 1 PB of redundant (i.e. copied 3 times) 
storage space.
This would need 15U of rack space, plus possibly a switch.
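For anyone who wants to check or tweak these numbers, here is the same arithmetic as a small Python sketch. It only uses the rough prices quoted above. The 1U height for the auth servers is my assumption (it is what makes the total come to 15U), and the function and key names are made up for illustration.

```python
# Back-of-the-envelope sizing of the storage cluster described above.
# All prices are the rough estimates quoted in this mail, not vendor quotes.

CHASSIS_EUR = 4000     # 2U storage server, without HDDs
HDD_EUR = 350          # one 22 TB HDD
HDD_TB = 22
AUTH_NODES = 3         # smaller servers acting as auth (Keystone)
AUTH_NODE_EUR = 4000
REPLICAS = 3           # each file is stored 3 times
STORAGE_NODE_U = 2
AUTH_NODE_U = 1        # assumption: 1U each, giving 15U in total


def cluster_estimate(storage_nodes=6, hdd_per_node=24):
    """Return cost, capacity and rack space for a given cluster size."""
    node_eur = CHASSIS_EUR + hdd_per_node * HDD_EUR
    total_eur = storage_nodes * node_eur + AUTH_NODES * AUTH_NODE_EUR
    raw_tb = storage_nodes * hdd_per_node * HDD_TB
    return {
        "node_eur": node_eur,
        "total_eur": total_eur,
        "raw_tb": raw_tb,
        "usable_tb": raw_tb / REPLICAS,
        "rack_u": storage_nodes * STORAGE_NODE_U + AUTH_NODES * AUTH_NODE_U,
    }


if __name__ == "__main__":
    for key, value in cluster_estimate().items():
        print(f"{key}: {value}")
```

With the default arguments this gives 12400 EUR per storage node, 86400 EUR in total, and 1056 TB usable, which is where the rounded 12000 EUR, 90 kEUR and 1 PB figures come from. Calling `cluster_estimate(hdd_per_node=12)` shows what a half-populated cluster would cost.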
Though if we want to be safe, we could purchase at least one spare node 
and a few HDDs.
So all together, we're looking at spending around 100 kEUR. Note that 
this type of Swift cluster could also be used for artifact storage for 
Salsa (GitLab has a Swift backend storage driver).
Also note that we're currently (at Infomaniak) using these AIC chassis 
with amd64, but we're looking at replacing the boards with some Gigabyte 
motherboard using Ampere CPU (ie: ARM64 based, with 80 cores).
If we need to save on costs at first, we could reduce the number of 
HDDs (let's say by half), and add more HDDs later on. But you get my 
point: it's not *that* expensive, and for sure something we could 
afford (we do have the budget).
I am hereby volunteering to set up such an OpenStack Swift cluster for 
snapshot.d.o, or other Debian use. It'd be easy to find other people 
interested in helping me maintain this (I know some people who have 
already volunteered to help when I'm away, on holiday or otherwise).
Your thoughts? Would the DPL agree to such spending? Do we have 
somewhere to host this? At UBC? What would DSA's opinion be on this? 
Would they get involved? (IMO, we can do without DSA if they don't 
want to get involved, but I'd prefer it if they did...)
Cheers,
Thomas Goirand (zigo)
P.S: Please CC me.