Bug#1020217: snapshot.debian.org: write a generic file driver supporting multiple backend (such as object-storage)
- To: 1020217@bugs.debian.org
- Subject: Bug#1020217: snapshot.debian.org: write a generic file driver supporting multiple backend (such as object-storage)
- From: Lucas Nussbaum <lucas@debian.org>
- Date: Wed, 14 Aug 2024 14:39:48 +0200
- Message-id: <ZryllBhIIJRU55ad@grub.nussbaum.fr>
- Reply-to: Lucas Nussbaum <lucas@debian.org>, 1020217@bugs.debian.org
- In-reply-to: <Zmbj9gWzKgRteV0Y@grub.nussbaum.fr>
- References: <20220918092943.5nhmlubq7mnif3z2@libra.beauplat.fr> <b79e8f99-ce73-4edb-a653-76948db2ac99@debian.org> <20220918092943.5nhmlubq7mnif3z2@libra.beauplat.fr> <Zmbj9gWzKgRteV0Y@grub.nussbaum.fr> <20220918092943.5nhmlubq7mnif3z2@libra.beauplat.fr>
On 10/06/24 at 13:31 +0200, Lucas Nussbaum wrote:
> It looks like you see the work on object-storage backend as a
> procurement/infrastructure question. I don't think that this is the main
> issue. Based on what I've done so far (and I still plan to continue
> working on this, but I have limited time for Debian nowadays), the code
> also needs deep changes because, if you want an object-storage-based
> backend to perform adequately, you need more parallelism for
> backend-related operations.
>
> This is true whether the storage service is AWS S3, OpenStack Swift,
> Azure Blob Storage, or Ceph Object Storage, or whatever. If you increase
> the latency between the importer/indexer and the storage service,
> you need parallelism to hide it and stay with a bandwidth-bound problem.
>
> To work on this, you need an object storage backend, but I suspect that
> once it works with one of them, porting it to another one will be
> trivial, as the S3-specific bits are really minimal. (and Swift is
> S3-compatible anyway)
>
> Help is welcome -- my code is at
> https://salsa.debian.org/lucas/snapshot/-/commits/s3snap/?ref_type=heads
>
> Typically a good way to test this is to try to import a small archive
> (e.g. debian-security with one architecture only) and see if you can get
> an import time on object storage that is similar to the one on
> file-based storage.
FYI, I stopped working on that, since the FS-backed service is back in a
good state. My work is pushed to the above git repo and I cleaned up the
infrastructure bits I set up on AWS.
Lucas
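
For readers who want to see the latency argument from the quoted message in concrete terms, here is a minimal sketch. It is not taken from the s3snap branch: `FakeObjectStore`, `import_sequential`, `import_parallel`, `timed`, and the 20 ms round-trip figure are all hypothetical names and values chosen for illustration. The store is an in-memory stand-in that sleeps on every `put` to mimic a network round trip, so the only thing being compared is issuing requests one by one versus issuing them through a thread pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY_S = 0.02  # assumed per-request round-trip time, purely illustrative


class FakeObjectStore:
    """Hypothetical in-memory stand-in for an S3-like service."""

    def __init__(self, latency_s=LATENCY_S):
        self.latency_s = latency_s
        self.objects = {}

    def put(self, key, data):
        time.sleep(self.latency_s)  # simulate the network round trip
        self.objects[key] = data
        return key


def import_sequential(store, files):
    # One request at a time: total time grows as len(files) * latency.
    return [store.put(key, data) for key, data in files]


def import_parallel(store, files, workers=10):
    # Many requests in flight at once: the round trips overlap, so the
    # per-request latency is largely hidden behind the other requests.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda item: store.put(*item), files))


def timed(fn, *args, **kwargs):
    # Return the function's result together with its wall-clock duration.
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    files = [(f"pool/{i:02d}.deb", b"x" * 10) for i in range(20)]
    _, t_seq = timed(import_sequential, FakeObjectStore(), files)
    _, t_par = timed(import_parallel, FakeObjectStore(), files)
    print(f"sequential: {t_seq:.2f}s, parallel: {t_par:.2f}s")
```

Against a real service the same comparison would be run as suggested in the quoted message: import a small archive and compare the wall-clock time with the file-based backend. The thread pool here is only one way to get requests in flight concurrently; the worker count is an arbitrary choice for the sketch.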