
Re: apt: http vs. ftp?



On Thu, 7 Dec 2000, David Teague wrote:

> I never pretended to know anything, but I find your response
> amusing, and your discussion and that of others enlightening. I
> suspected that ftp might not be faster, but did not know why.

The true answer is that there is no way one is faster than the other for
infinite-sized files. They both use TCP, they both use the same kernel, and
they both use the same TCP options. Any perceived difference is merely
statistical error. In practice this is a bit of a lie: HTTP servers tend to
be more optimized and use things like sendfile and pay attention to
latency. That kernel http server (TUX?) will likely lay waste to any FTP
server when you start talking about real servers with real load.
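
For a feel of what "more optimized" means, here is a minimal sketch of
that sendfile() path in Python; the function and paths are invented for
the example, not anything APT or TUX actually does:

  import os, socket

  def serve_file(conn: socket.socket, path: str) -> None:
      # sendfile() has the kernel copy file pages straight into the
      # socket buffer; the server process never touches the data.
      with open(path, "rb") as f:
          size = os.fstat(f.fileno()).st_size
          conn.sendfile(f, 0, size)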

Now you can start talking about non-infinite file sizes, and lots of them.
At this point FTP starts spending less time transferring data and more time
waiting around to decide what to do. In contrast, HTTP -never- stops: the
socket buffers remain full throughout the entire session and it never
sends partial packets.
 
This is because APT's HTTP implementation is fully pipelined and optimized
for this; FTP cannot be. You can see this by running apt-get update twice
with http and with ftp. The second run, when it prints out 'Hit:', is almost
instant for HTTP, while it is agonizing for FTP.
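
If you want to reproduce the test, point two sources.list lines at the
same mirror (the mirror name here is only an example):

  deb http://ftp.debian.org/debian stable main
  deb ftp://ftp.debian.org/debian stable main

Enable one line at a time, run apt-get update twice for each, and watch
how quickly the second run's 'Hit:' lines scroll past.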

A typical FTP request may look like this:

c: SIZE /foo
s: 123
c: MDTM /foo
s: 12313
c: RETR /foo
s: Port 123, etc
c: <send first TCP conn packet>
s: <reply>
c: <finish>
s: <data>

Sure is a damn lot of round trips. Each one might take about 25ms to
complete on a fast link, so with four of them it takes about 100ms to even
begin transferring the data.

HTTP/1.1 doesn't have ANY round trips once you start going. APT will send
5 requests in a single large packet, the remote will queue them up and
start blasting out data, and each finished request results in another going
back to the server. There is never a case where the two have to stop
sending and synchronize.
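
Here is a bare-bones sketch of that pipelining in Python, with the host
and paths invented for the example (real servers vary in how gracefully
they handle pipelined requests):

  import socket

  HOST = "www.example.org"
  PATHS = ["/a", "/b", "/c", "/d", "/e"]

  reqs = ""
  for i, path in enumerate(PATHS):
      reqs += "GET %s HTTP/1.1\r\nHost: %s\r\n" % (path, HOST)
      if i == len(PATHS) - 1:
          reqs += "Connection: close\r\n"  # hang up after the last reply
      reqs += "\r\n"

  sock = socket.create_connection((HOST, 80))
  sock.sendall(reqs.encode())  # all five requests leave in one write
  # The server answers them back to back on the same connection; the
  # client only ever reads from here on, with no per-request round trip.
  data = b""
  while True:
      chunk = sock.recv(65536)
      if not chunk:
          break
      data += chunk
  print("%d bytes for %d pipelined responses" % (len(data), len(PATHS)))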

Finally, you can consider HTTP proxy servers and HTTP QoS munging. Sadly
they exist :< HTTP proxies are often poorly implemented, buggy and slower
than a router - you rarely get good rates from them. Some ISPs will
actually use QoS on HTTP streams to encourage their routers to favor mail
and FTP traffic, which of course has a negative effect too.

So, in the real world some people may measure a significant deviation;
this is always due to their ISP messing with their traffic and not any
particular issue with the protocols.

Jason


