Bug#1020217: snapshot.debian.org: write a generic file driver supporting multiple backend (such as object-storage)
On 21/09/23 at 18:26 +0200, Lucas Nussbaum wrote:
> Hi Baptiste,
>
> On 18/09/22 at 11:29 +0200, Baptiste Beauplat wrote:
> > Having files stored in an object-like storage could improve cost and
> > performance over the conventional storage that snapshot is currently
> > using. (Currently using over 130 TB!)
> >
> > Multiple components need to access the snapshot farm: the importer, the
> > web app (if not redirected), and several other scripts.
> >
> > We should write a generic file driver to allow all those components to
> > access/update/delete files from a config-defined backend.
> >
> > This driver would be usable in at least two languages: Ruby and Python.
> > I'm not sure what the best course of action is here: some kind of
> > bindings, or maintaining two separate drivers.
> >
> > Note that there is also a C program that is part of snapshot (the fsck
> > program).
> >
> > I was thinking of writing at least two backends for the driver:
> >
> > - a standard flat filesystem storage (what we have currently)
> > - an object-like storage. S3 would be a good candidate since a couple of
> > open-source storage solutions provide an S3-compatible API.
> >
> > That would allow a two-step transition: start using the driver, then
> > switch the backend.
>
> I was wondering if you made some progress on this?
>
> Your plan looks very good. I agree that an S3 backend would make a lot of
> sense (usable both with self-hosted solutions like MinIO, or with
> managed services).
>
> Let me know if I can help somehow.
Looking at the code:
The Ruby part (in charge of importing data) already has an abstraction
layer:
https://salsa.debian.org/snapshot-team/snapshot/-/blob/master/snapshot#L59
The Python part (in charge of the web app) doesn't:
https://salsa.debian.org/snapshot-team/snapshot/-/blob/master/web/app/snapshot/controllers/archive.py#L192
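
A minimal sketch of what such a layer could look like on the Python side,
to make the proposal concrete. Everything here is an assumption for
illustration: the class and function names (FileBackend, FilesystemBackend,
S3Backend, open_backend), the config keys, and the <root>/<aa>/<bb>/<digest>
directory layout are hypothetical and not taken from the snapshot code base.
The S3 variant relies on the third-party boto3 library and is only imported
when that backend is selected.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class FileBackend(ABC):
    """Hypothetical driver interface: files are addressed by their digest."""

    @abstractmethod
    def get(self, digest: str) -> bytes: ...

    @abstractmethod
    def put(self, digest: str, data: bytes) -> None: ...

    @abstractmethod
    def delete(self, digest: str) -> None: ...


class FilesystemBackend(FileBackend):
    """Flat filesystem storage under a single root directory."""

    def __init__(self, root):
        self.root = root

    def _path(self, digest):
        # Assumed layout: <root>/<first 2 chars>/<next 2 chars>/<digest>
        return os.path.join(self.root, digest[:2], digest[2:4], digest)

    def get(self, digest):
        with open(self._path(digest), "rb") as f:
            return f.read()

    def put(self, digest, data):
        path = self._path(digest)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        # Write to a temporary file in the same directory, then rename,
        # so readers never observe a partially written file.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)

    def delete(self, digest):
        os.remove(self._path(digest))


class S3Backend(FileBackend):
    """S3-compatible object storage; the digest is used as the object key."""

    def __init__(self, bucket, endpoint_url=None):
        import boto3  # third-party; imported lazily so it stays optional
        self.bucket = bucket
        # endpoint_url lets this point at a self-hosted S3-compatible service
        self.client = boto3.client("s3", endpoint_url=endpoint_url)

    def get(self, digest):
        obj = self.client.get_object(Bucket=self.bucket, Key=digest)
        return obj["Body"].read()

    def put(self, digest, data):
        self.client.put_object(Bucket=self.bucket, Key=digest, Body=data)

    def delete(self, digest):
        self.client.delete_object(Bucket=self.bucket, Key=digest)


def open_backend(config):
    """Pick a backend from a config mapping (hypothetical keys)."""
    kind = config["backend"]
    if kind == "filesystem":
        return FilesystemBackend(config["root"])
    if kind == "s3":
        return S3Backend(config["bucket"], config.get("endpoint_url"))
    raise ValueError(f"unknown backend: {kind!r}")
```

With something like this, callers would only ever go through
open_backend(config), so the two-step transition described above amounts to
first porting callers to the interface with "backend": "filesystem", then
changing the config to "backend": "s3". One difference a real driver would
have to settle: deleting a missing file raises FileNotFoundError on the
filesystem, while S3 delete requests succeed silently.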
Lucas