Re: Snapshot behind Fastly; roles and responsibilities



Hi,

Philipp Kern <pkern@debian.org> wrote
Mon, 18 Nov 2024 15:29:22 +0100:

> Hi,
>
> On 11/18/24 11:38 AM, MOESSBAUER, Felix wrote:
>> On Mon, 2024-11-18 at 10:35 +0100, Linus Nordberg wrote:
>>> Hi all,
>>>
>>> Snapshot is behind Fastly since Sunday Nov 17 2024. I think that's
>>> bad and would like to change that. It's bad in the short term since
>>> we expose user data to a third party. It's bad in the long term
>>> since the short term bad won't go away until we learn how to deal
>>> with web traffic.
>> 
>> That's a trade off between the advantages of a CDN and privacy.
>> For me as snapshot user that needs it to build reproducible things in
>> CI systems, the most important aspect is reliability and performance.
>
> That's also how I see it. We need a way to ban entire ASes from Debian
> infrastructure, as long as they keep sending abusive requests from a
> very large amount of IP addresses.
>
> While I think we should make sure that we can keep up with a high amount
> of requests (which probably requires pgbouncer and some other fixes),
> serving the ridiculous amount of scraping sent by Tencent without
> coordination or backoff is not helpful.
>
> I hacked together something to collect data from BGP and I could go and
> put stuff into an ipset to block on - but Fastly made that ridiculously
> easy. And Tencent shot at snapshot-master (sallinen) the day before,
> which was easy to shield off.
>
> Note that a lot of traffic to snapshot is HTTP - and you are traversing
> the world to get to the target host - and thus the privacy bits are
> already very low. We are also not serving user data here, only known
> public bits.
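
For reference, a minimal sketch of the ipset approach mentioned above,
assuming the prefixes announced by an AS have already been extracted
from BGP data. The function name, the set name and the prefixes are
hypothetical (the prefixes are reserved documentation ranges); only the
`ipset create ... hash:net` and `ipset add` commands themselves are
real. It prints the commands rather than running them.

```python
import ipaddress


def ipset_commands(set_name, prefixes):
    """Return ipset commands that create one set per address family
    and add the given prefixes to it.

    Prefixes are collapsed first, so overlapping or adjacent
    announcements become the smallest equivalent list of networks.
    """
    nets = [ipaddress.ip_network(p, strict=False) for p in prefixes]
    commands = []
    for version, family in ((4, "inet"), (6, "inet6")):
        same_family = [n for n in nets if n.version == version]
        if not same_family:
            continue
        name = f"{set_name}-v{version}"
        commands.append(f"ipset create {name} hash:net family {family}")
        for net in ipaddress.collapse_addresses(same_family):
            commands.append(f"ipset add {name} {net}")
    return commands


# Documentation prefixes (RFC 5737 / RFC 3849) standing in for the
# real per-AS prefix list; these are assumptions, not actual data.
EXAMPLE_PREFIXES = [
    "192.0.2.0/25",
    "192.0.2.128/25",
    "198.51.100.0/24",
    "2001:db8::/32",
]

if __name__ == "__main__":
    print("\n".join(ipset_commands("blocked-as", EXAMPLE_PREFIXES)))
```

The two adjacent /25 prefixes collapse into a single /24, which keeps
the set small when an AS announces many contiguous ranges.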

I think that users connecting to https://snapshot.debian.org/ should be
able to expect that information about which packages they are interested
in stays between them and Debian. Can we have Fastly redirect port 443
to our servers? That would allow us to apply our somewhat crummier rate
limiting to HTTPS only.

Orthogonal to that, why not make opting out of privacy more explicit by
using a DNS name that indicates the destination of the traffic, e.g.
fastly-cdn.snapshot.debian.org?


[lots of good stuff removed, I'll have to return to this separately]

