
Bug#884914: apt: does not deal well with high request latency web servers



On Sun, Jan 14, 2018 at 06:06:28PM +0100, Philipp Kern wrote:
> That said, you just went the other way. Not that I liked the cURL https
> transport, given that it was fairly stupid^Wstraightforward. But at
                                              ^^^ you must be joking…¹
> least it used a library. So I guess I'm curious if you'd be open to
> picking a library and then rewriting everything to use that single one
> for all of HTTP(S) 1.1/2. (Assuming such a library exists by now.)

Well, curl wasn't a good choice for implementing https for us nowadays,
as it is a big library supporting lots of things we don't want (apt
downloading over IMAPS after an http redirect, Transfer-Encoding: gzip, …)
and not supporting many things we do (SRV records, working with hashes,
…).  Many improvements to https were only possible by sidestepping curl
– and that isn't how you want to interact with a library. If there is
a library which would give us what we ended up with by combining our
http with a TLS wrapping, we could change to that (assuming it's not too
obscure, linux-only, and so on).

There is still a lot of code to be written for the actual implementation
of the transport, as we want to be in control of a lot of things.
Redirects, for example, can't be handled internally by the library: we
want to limit what is possible (e.g. no https→http redirects) and want
to keep an instance single-connection – a transport can't change the
server it talks to at runtime, as it has no access to the configuration
potentially applying to that server (auth, settings, …).  Besides, it
isn't very realistic that a library would play with hashes as much as we
do – from terminating the connection if the content length doesn't match
what we expect, to fixing pipelining problems at runtime – simply
because that isn't how the rest of the internet works…
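To illustrate the kind of policy a transport has to enforce itself, here
is a minimal sketch of the https→http downgrade and same-server checks
described above. The function name and the rules as written are my
illustration of the constraints, not apt's actual code:

```python
from urllib.parse import urlparse

def redirect_allowed(current_url: str, target_url: str) -> bool:
    """Illustrative redirect policy (hypothetical helper, not apt code):
    forbid downgrading from https to http, and forbid leaving the
    server this transport instance was configured for."""
    cur, tgt = urlparse(current_url), urlparse(target_url)
    if cur.scheme == "https" and tgt.scheme != "https":
        return False  # no https -> http downgrade
    if tgt.netloc != cur.netloc:
        return False  # an instance stays single-connection, single-server
    return True

print(redirect_allowed("https://deb.example.org/pool/a.deb",
                       "http://deb.example.org/pool/a.deb"))  # False
```

A library that handles redirects internally would follow both of these
silently, which is exactly why the transport has to do it itself.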

Of course, that sounds like NIH, but on the other hand we didn't want to
implement TLS by hand, nor am I particularly interested in hand-rolling
the binary parts of http2 if it can be helped, so from my viewpoint it's
not NIH but "just" the guesstimate that there will be no off-the-shelf
solution. A transport is "just" a lot more than a wget wrapped in
C code.

¹ yes, I am allowed to "bash" curl as, due to a very minor
  contribution relating to SOCKS² error reporting, I am considered
  a contributor, as I recently found out. So apt leaving curl had
  at least some advantage for both sides I guess. ;)

² something I thought I wouldn't implement myself either, so
  perhaps there is some hope for me doing http2 NIH-style still ;)

> > [HTTP/2 has an unencrypted variant aka "h2c" btw. At least on paper
> > – I know that browsers aren't going to implement it.]

[That was just a comment on "requires TLS" – on paper it doesn't. I am well
aware that this might never really exist in practice and I am not concerned
either way.]


> >> Happy to hear your thoughts on how to solve this.
> > You could do something similar to what httpredir.d.o was doing or work
> > with the (now reimplemented) -mirror method. That hard-depends on your
> > "behind load-balancer" servers to be reachable directly through. But it
> > gets you parallel connections without changing clients (at least in the
> > first case, in the second you will need to wait a few years until this
> > becomes true I guess).
> 
> That's actually an excellent idea. We'll try that out. I actually wrote
> code for it already and it's a pretty straightforward approach, but
> we'll take another stab at attempting to reduce time to first byte
> latency first.

Julian commented on IRC a while ago that I wasn't explaining the second
option – using the mirror method – very well. So for the record:
assuming you have 3 mirrors and a load-balancer, you could place
a mirrorlist file containing the 3 mirrors on the balancer, point your
clients to it and be done. In pre-1.6 versions apt will consistently
choose one of the mirrors [based on the hostname of the client] – so you
get more or less what you have at the moment. From 1.6 onwards apt will
download each file from a randomly picked server³, so in practice you
will end up with 3 parallel downloads [except if you are randomly
unlucky, or if you run 'update', as that will stick to the same mirror
by default to avoid sync problems]. See the apt-transport-mirror
manpage (in 1.6) for advanced details.
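As a concrete sketch of that setup (all hostnames made up for
illustration), the mirrorlist served from the balancer is just one
mirror URI per line:

```
# mirrorlist.txt, served from the load-balancer:
http://mirror1.example.org/debian/
http://mirror2.example.org/debian/
http://mirror3.example.org/debian/
```

The clients would then use a sources.list line along the lines of
"deb mirror://balancer.example.org/mirrorlist.txt stretch main"
(suite and components being whatever you actually use, of course).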

³ more exactly: the list of three mirrors is shuffled randomly for
each file and the download is then attempted in this order, so it isn't
exactly load-balanced, but close enough for now.
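In code, that per-file shuffle boils down to something like the
following sketch (names and structure are mine, not apt's actual
implementation):

```python
import random

mirrors = ["http://mirror1.example.org/debian/",
           "http://mirror2.example.org/debian/",
           "http://mirror3.example.org/debian/"]

def download_order(mirrorlist):
    """Return a freshly shuffled copy of the mirror list.
    A new order is drawn for each file; the download is attempted
    against the first entry, falling back to the next on failure –
    so the load spreads out statistically rather than exactly."""
    order = list(mirrorlist)
    random.shuffle(order)
    return order
```

Drawing a new order per file is what gives the parallelism across
clients; it is not a strict round-robin, hence "close enough".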



Anyhow: what are we going to do with this bug report now? Close, retitle
for http2 support, or is there some other action we can take to fix the
initial problem?


Best regards

David Kalnischkies
