
Re: Request to get Permission for Data extraction



Simon Josefsson <simon@josefsson.org> wrote on
Wed, 12 Mar 2025 10:07:08 +0100:

>> As pointed out in another response to your request, it might make sense
>> for you to ask for (a copy of) the metadata kept in the database.
>
> Could the snapshot team make those public?
>
> It is harder than it should be to mirror snapshot locally.  You have to
> screenscrape the web interface to get full data.  This creates
> unnecessary load, so it would be nice if at least the list of filenames
> (essentially SHA1 hashes) could be published.  Right now this
> information is hidden.  As far as I understood earlier discussions on
> this, that hiding is intentional (for reasons I couldn't understand).

Hi Simon,

Do you want to operate a full Snapshot mirror, contributing to the
operations of the Snapshot service? Snapshot has a method for mirroring
the farm, described in [mirror/README][]. In addition to that, you would
set up PostgreSQL replication to keep your db up to date with the
primary.

If not, have you tried accessing the Snapshot database using the
'snapshot-guest' user? The PostgreSQL client would have to make its
connection from a Debian machine allowed to connect to the db (on the
primary or any of the replicas). I don't know how to compile the list of
these machines, but DSA surely do.

I don't remember the discussion about hiding information on which files
exist in the farm. What arguments were put forward for doing that?

[mirror/README]: https://salsa.debian.org/snapshot-team/snapshot/-/blob/master/mirror/README

