
Bug#884914: apt: does not deal well with high request latency web servers



Hi,

On 12/21/2017 11:02 AM, Philipp Kern wrote:
> At work our packages and package lists are served by a web service that
> has relatively high time-to-first-byte latency. Once it starts
> transmitting, the bytes are pushed quickly, but it takes a "long" time
> for the first byte to arrive. While we are somewhat fine with the
> latency on "apt update", it imposes a very high penalty on systems that
> need to fetch a lot of small packages (e.g. build dependencies).
> 
> The only solution apt offers today is pipelining. While it allows a
> fast start for congestion control purposes, pipelining always requires
> the answers to be sent in order down the pipe. Unless you set a depth
> that equals the number of packages to fetch, it will only replenish the
> queue one by one as packages are completed, requiring a full RTT to
> insert a new item. Furthermore, it impacts the server negatively if you
> consider the first hop to be a load balancer that fans out to other
> backends: it needs to cache all answers to return them in order.
> 
> An easy workaround (few lines changed) would be to just spawn multiple
> transports for a given host target, to make use of multiple connections.
> In this case load balancing the requests onto them speeds up the
> transaction essentially to line speed. There is still the drawback that
> naive load balancing (essentially adding n queues for a host) happens at
> the beginning of the transaction rather than throughout. This is not a
> concern in our particular case, though, as the main issue is to enqueue
> enough requests on the server side.
> 
> It has been raised that this would violate assumptions made by mirror
> operators, though, in case they approximate per-client limits using
> per-connection rate limiting (because bucketing is hard). I'd argue that
> an optional configuration setting that is not enabled by default should
> still be fair to offer.
> 
> Another solution would be to implement HTTP/2 support, which allows
> the requests to be answered out of order. In this case
> a single connection would very likely be enough, as the server can just
> answer what's available and the pipeline will be replenished
> asynchronously. In our case the load balancer would also offer HTTP/2
> server-side[1]. However I'd argue that such an implementation should
> then not be hand-rolled like the HTTP(S) transport and would require
> depending on another library like nghttp2. So it would likely need to
> live in its own apt transport.
> 
> Happy to hear your thoughts on how to solve this. And please keep up the
> great work. :)
> 
> Kind regards and thanks
> Philipp Kern
> 
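> [1] Note that HTTP/2 effectively makes encryption mandatory: the
> specification allows cleartext (h2c), but browsers and many servers
> only support it over TLS.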
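
To make the trade-offs in the quoted message easier to compare, here is a
rough back-of-the-envelope model in Python. It is only a sketch and not apt
code: the function names and the numbers (200 ms time to first byte, 20 ms
RTT, 100 packages) are made up for illustration. It assumes transfer time is
negligible for small packages, that the server works on all requests it has
received concurrently, and that connections do not compete for bandwidth.

```python
def pipelined_ms(ttfb_ms, rtt_ms, depth):
    """Total time for one connection using in-order pipelining.

    ttfb_ms: per-request server latency in milliseconds.
    depth:   maximum number of requests in flight (must be >= 1 here;
             this toy model has no "pipelining disabled" mode).
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    done = []  # client-side completion time of each response
    for i, ttfb in enumerate(ttfb_ms):
        # A slot frees up only when the response `depth` places earlier
        # has completed, so the queue is replenished one item at a time.
        sent = 0 if i < depth else done[i - depth]
        ready = sent + rtt_ms + ttfb
        # Responses must come back in order, so a response cannot
        # complete before its predecessor.
        prev = done[-1] if done else 0
        done.append(max(ready, prev))
    return done[-1] if done else 0


def multi_connection_ms(ttfb_ms, rtt_ms, depth, connections):
    """Naive up-front round-robin split over several pipelined connections."""
    queues = [ttfb_ms[c::connections] for c in range(connections)]
    return max(pipelined_ms(q, rtt_ms, depth) for q in queues)


def multiplexed_ms(ttfb_ms, rtt_ms):
    """All requests sent at once; responses may arrive in any order."""
    return max((rtt_ms + t for t in ttfb_ms), default=0)


if __name__ == "__main__":
    requests = [200] * 100  # 100 small packages, 200 ms TTFB each
    rtt = 20
    print("pipelining, depth 10: ", pipelined_ms(requests, rtt, 10))
    print("pipelining, depth 100:", pipelined_ms(requests, rtt, 100))
    print("4 connections, depth 10:", multi_connection_ms(requests, rtt, 10, 4))
    print("multiplexed:", multiplexed_ms(requests, rtt))
```

Under these assumptions a single pipeline of depth 10 needs ten
latency-bound rounds, while a depth equal to the number of packages, enough
parallel connections, or out-of-order answers all collapse that to roughly
one round. A real server would of course not scale this perfectly.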

Gentle post-vacation ping. I'm happy to look at whatever is needed
here; however, I'd prefer to work towards a consensus first. ;-)

Kind regards and thanks
Philipp Kern
