
Re: Bug#1032587: UDD's upstream_metadata table may contain stale data?



On Tue, Mar 14, 2023 at 10:05:33PM +0100, Andreas Tille wrote:
> On Tue, Mar 14, 2023 at 10:42:30PM +0200, Faidon Liambotis wrote:
> > Thanks Andreas! Is the code and/or logs for this cronjob somewhere I can
> > access myself? Perhaps I could have a look myself and help you out?
> 
> It's
> 
>    https://salsa.debian.org/blends-team/website/-/blob/master/misc/machine_readable/fetch-machine-readable_salsa.py
> 
> but I don't think this short-term issue is worth your time looking
> into.  It should run on blends.debian.net, since most of the watched
> projects are cached there.  For some reason the job seems to fail for
> 
>    https://salsa.debian.org/python-team/packages/kazam
> 
> but I need to sort out whether this suspicion is true.

That suspicion was wrong.  I'm now convinced that we are being caught by
some mechanism Salsa uses to prevent DoS attacks.  That's why I increased
the delay between requests to Salsa (we have time; the run just needs to
finish in less than one day) and added more exception handlers[1] to
hopefully catch things like these:

>     raise RemoteDisconnected("Remote end closed connection without"
> http.client.RemoteDisconnected: Remote end closed connection without response
> ...
>     raise RemoteDisconnected("Remote end closed connection without"
> urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
> ...
>     raise ConnectionError(err, request=request)
> requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
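
To illustrate the idea of waiting longer and catching these errors, here
is a minimal sketch.  It is not the code from commit [1]: it uses only the
Python standard library instead of requests, and the names
fetch_with_retry, fetch_url and TRANSIENT_ERRORS as well as the retry
count and delays are assumptions made up for this example.

```python
import http.client
import time
import urllib.error
import urllib.request

# Errors treated as transient. This list is an assumption for the sketch,
# not the set of exceptions handled in the real script.
TRANSIENT_ERRORS = (
    http.client.RemoteDisconnected,
    ConnectionError,
    urllib.error.URLError,
)


def fetch_url(url, timeout=30):
    """Download a single file and return its content as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_retry(url, fetch=fetch_url, retries=4, first_delay=5.0,
                     sleep=time.sleep):
    """Call fetch(url); on a transient error wait, then retry.

    The delay doubles after every failed attempt.  After the last
    attempt the exception is re-raised so the failure stays visible.
    """
    delay = first_delay
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except TRANSIENT_ERRORS:
            if attempt == retries:
                raise
            sleep(delay)
            delay *= 2
```

The fetch and sleep parameters exist only so the function can be tried
without touching the network; a real job would simply call
fetch_with_retry(url) for each file.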

It seems I should also make sure cron sends me mail when this job
fails, since it had not been working for quite some time.
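
One possible way to get that mail, as a rough sketch: cron mails whatever
a job prints to the crontab owner (or to the MAILTO address), so it is
enough if the script stays silent on success and writes the traceback to
stderr on failure.  The helper name run_and_report is hypothetical and
not part of the existing script.

```python
import sys
import traceback


def run_and_report(job, stream=sys.stderr):
    """Run job(); return 0 on success, 1 after reporting a failure.

    On any exception the traceback is written to stream, so that cron
    has output to mail.  Nothing is written when the job succeeds.
    """
    try:
        job()
    except Exception:
        stream.write("fetch job failed:\n")
        traceback.print_exc(file=stream)
        return 1
    return 0
```

At the end of the script this could be used as
sys.exit(run_and_report(main)), assuming the job's entry point is a
function called main.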

BTW, I might need help convincing the Salsa admins that it makes sense
to parse these machine-readable files right on Salsa.  In the Alioth
days I started with a job that read the repositories directly, which
used far less network traffic.  Since the move to Salsa this is no
longer possible.  I tried really hard to cache the results and keep the
network traffic as low as possible (downloading single files only).
However, as we see, this is no longer reliable (which it was for a
couple of years).

Kind regards
   Andreas.

[1] https://salsa.debian.org/blends-team/website/-/commit/a51cf1cfeadc4693aaacfb5e74113805a286ebe1

-- 
http://fam-tille.de

