Hi,

as the author of metasnap, debrebuild and debbisect (all using snapshot.d.o), I thought it might make sense to post to these bugs what I learned about how to reliably pull data from snapshot.d.o.

For the most reliable and stable solution (it has been able to download stuff for over three months without unrecoverable errors), I recommend looking at the function download() in the metasnap source code:

https://salsa.debian.org/josch/metasnap/-/blob/master/run.py#L129

To run stably, the code:

 - throttles the number of requests such that each request takes a minimum of 1.5 seconds, which allows about 2200 connections per hour
 - is able to recover from infinite timeouts
 - is able to handle HTTP 500 errors
 - recovers after E_COULDNT_CONNECT errors
 - backs off for four seconds (less does not work) raised to the power of the number of retries

The second-best solution is implemented by debbisect. Since the same packages are required multiple times, it builds a local cache of the packages, and apt then only sees that cache. The caching code borrows some of the things I learned from writing metasnap and seems to run reasonably well. See the _download_new() function in the Proxy class:

https://salsa.debian.org/debian/devscripts/-/blob/master/scripts/debbisect#L159

Thirdly, debrebuild only needs each package once, so a cache doesn't make sense. Instead, we try to do the best we can using just apt options. The following options seem to work okay. If it still fails, retrying works most of the time.

    Acquire::Check-Valid-Until "false";
    Acquire::http::Dl-Limit "1000";
    Acquire::https::Dl-Limit "1000";
    Acquire::Retries "5";

https://salsa.debian.org/debian/devscripts/-/blob/master/scripts/debrebuild

Maybe my next project will be an overlay for snapshot.d.o which provides sane rate limiting and throttles the bandwidth instead of cutting connections, refusing new connections or throwing HTTP 500 errors...

Thanks!

cheers, josch
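
P.S. For readers who only want the gist of the throttling and backoff behaviour listed above, here is a minimal sketch in Python. It is not the metasnap code. The names download(), fetch_with_timeout(), MIN_INTERVAL, BACKOFF_BASE and MAX_RETRIES are made up for illustration, the retry limit of 5 and the 60 second timeout are assumptions, and the sketch uses urllib from the standard library, which may differ from what run.py does. Only the 1.5 second minimum per request and the four-seconds-to-the-power-of-retries backoff come from the text above.

```python
import time
import urllib.request

MIN_INTERVAL = 1.5  # minimum seconds per request (from the mail above)
BACKOFF_BASE = 4    # backoff base in seconds (from the mail above)
MAX_RETRIES = 5     # assumption: the mail does not state a retry limit


def fetch_with_timeout(url, timeout=60):
    """Fetch url once; the 60 second timeout is an assumed value.

    A finite timeout turns a connection that hangs forever into an
    exception that the caller can retry.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download(url, fetch=fetch_with_timeout, max_retries=MAX_RETRIES,
             sleep=time.sleep, clock=time.monotonic):
    """Call fetch(url), throttled and with exponential backoff.

    Timeouts, refused connections and HTTP errors raised by urllib are
    all subclasses of OSError, so one except clause covers them. A real
    implementation would probably not retry on errors such as HTTP 404.
    """
    for attempt in range(max_retries + 1):
        start = clock()
        try:
            data = fetch(url)
        except OSError:
            if attempt == max_retries:
                raise  # out of retries, let the caller see the error
            # Wait 4, 16, 64, ... seconds before the next attempt.
            sleep(BACKOFF_BASE ** (attempt + 1))
            continue
        # Pad fast requests so each one takes at least MIN_INTERVAL.
        elapsed = clock() - start
        if elapsed < MIN_INTERVAL:
            sleep(MIN_INTERVAL - elapsed)
        return data
```

The fetch, sleep and clock parameters exist only so that the retry logic can be exercised without touching the network or actually waiting; calling download(url) with the defaults performs a real request.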