
Bug#973861: apt: http acquire method still failing with "Undetermined error" or "Data left in the buffer"



On Thu, Oct 06, 2022 at 02:11:49PM +0200, Alexander Thomas wrote:
> We are also regularly bumping into this issue.
> Part of our build process gathers changed Debian packages w.r.t. a
> reference environment, such that the packages can be updated in an
> offline environment. The packages are gathered by means of a
> `dist-upgrade --download-only` invocation of apt-get. This can easily
> result in hundreds of packages being downloaded in one operation.
> 
> Our local APT mirror server is served through nginx/1.18.0.

Have you tried switching it to Apache?

> The Undetermined Error is obviously caused by the `keepalive_requests`
> setting of Nginx. It defaulted to 100, hence we saw download failures
> on Gets with indices that were multiples of 100 (not consistently,
> only occasionally). Then we bumped keepalive_requests to 1000, and
> the randomly failing builds went away, until we had to handle a
> distribution upgrade in our builds, which resulted in more than 1000
> packages being downloaded in a single operation:
> 
> Get:999 http://our.local.mirrror/jammy-merged-20220816 jammy/main
> amd64 liblapack3 amd64 3.10.0-2ubuntu1 [2504 kB]
> Get:1000 http://our.local.mirrror/jammy-merged-20220816 jammy/main
> amd64 liblbfgsb0 amd64 3.0+dfsg.3-10 [29.9 kB]
> Err:1000 http://our.local.mirrror/jammy-merged-20220816 jammy/main
> amd64 liblbfgsb0 amd64 3.0+dfsg.3-10
>   Undetermined Error [IP: 192.168.123.45 80]
> 
> The APT config has `Acquire::Retries "10";` but this does not seem to
> help with this particular failure.
> So, now we have further bumped keepalive_requests to 2000, but this
> feels like a game of whack-a-mole.
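
For reference, the nginx side of this is a single directive; a rough
sketch, assuming it sits in the http context of your configuration:

    http {
        # nginx closes a keep-alive connection after this many requests;
        # pipelined apt requests past that point then fail (default: 100)
        keepalive_requests 2000;
    }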

I would happily accept patches to make this work in development
releases; obviously this is not something we can fix or analyse on
our side, as the bugs depend on your specific network timing.

Instead of playing whack-a-mole, you can presumably also set
    Acquire::Max-Pipeline-Depth to 0
or
    Acquire::http::Pipeline-Depth to 0
to disable pipelining.
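
For example, in a config snippet (the file name here is made up) or
as a one-shot option on the command line:

    # /etc/apt/apt.conf.d/99-no-pipelining (hypothetical path)
    Acquire::http::Pipeline-Depth "0";

    apt-get -o Acquire::http::Pipeline-Depth=0 --download-only dist-upgrade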

I could also just disable pipelining specifically for nginx servers.

But the problem is that apt is far too slow without pipelining
enabled, so it needs to be on by default.

Actually, browsers use up to 6 parallel connections today. I'd
prefer doing this too. My suggestion is to use min(ceil(N/6), 6)
queues for N items, so <= 6 items -> 1 queue, 7-12 -> 2 queues,
13-18 -> 3 queues, etc.

This is configurable. If you set a maximum of 3 connections
per server, you need at least 7 items to use all 3 queues
(1-3, 4-6, 7-9). For 9 connections it takes 9*(9-1)+1 = 73 items
to use all queues; in general, a cap of M connections needs
M*(M-1)+1 items.
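
As a sketch in code (the name and signature are made up; this is not
an apt patch):

    #include <algorithm>

    // Number of parallel queues to open for a given number of items,
    // capped by the per-server connection limit (default 6).
    unsigned int QueueCount(unsigned int Items, unsigned int MaxConns = 6)
    {
       if (Items == 0)
          return 1;
       // ceil(Items / MaxConns), but never more than MaxConns queues
       return std::min((Items + MaxConns - 1) / MaxConns, MaxConns);
    }

    // All MaxConns queues are busy once Items >= MaxConns*(MaxConns-1)+1:
    // 7 items for a cap of 3, 73 items for a cap of 9.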

But I don't know how browsers scale this, tbh.

This will also enable us to drop all our custom HTTP code
and switch to libcurl with curl_multi, gaining HTTP/2 and HTTP/3
support for deb.debian.org and HTTPS mirrors.
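
On the libcurl side that would look roughly like the sketch below;
this is an illustration of curl_multi driving parallel transfers,
not actual apt method code, and the URLs are just examples:

    // Build with: g++ fetch.cc -lcurl
    #include <curl/curl.h>
    #include <vector>

    int main()
    {
       curl_global_init(CURL_GLOBAL_DEFAULT);
       CURLM *multi = curl_multi_init();
       // Ask libcurl to multiplex transfers over HTTP/2 where supported
       curl_multi_setopt(multi, CURLMOPT_PIPELINING, CURLPIPE_MULTIPLEX);

       const char *urls[] = {"http://deb.debian.org/debian/dists/sid/InRelease",
                             "http://deb.debian.org/debian/dists/sid/Release"};
       std::vector<CURL *> handles;
       for (const char *url : urls)
       {
          CURL *easy = curl_easy_init();
          curl_easy_setopt(easy, CURLOPT_URL, url);
          curl_multi_add_handle(multi, easy);
          handles.push_back(easy);
       }

       // Drive all transfers until none are left running
       int running = 0;
       do
       {
          curl_multi_perform(multi, &running);
          if (running > 0)
             curl_multi_poll(multi, nullptr, 0, 1000, nullptr);
       } while (running > 0);

       for (CURL *easy : handles)
       {
          curl_multi_remove_handle(multi, easy);
          curl_easy_cleanup(easy);
       }
       curl_multi_cleanup(multi);
       curl_global_cleanup();
    }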

-- 
debian developer - deb.li/jak | jak-linux.org - free software dev
ubuntu core developer                              i speak de, en

