
Re: I am proposing a feature in apt-get



Hi,

Quoting Rani Ahmed (2015-02-13 09:42:00)
> I live in Beirut, Lebanon.
> 
> So the connection speed is usually small. The bandwidth that an individual in
> Europe can subscribe to or buy is, in Lebanon, distributed to hundreds of
> people. They use the squid web caching proxy and tcng (the traffic shaping and
> control tool).
> 
> In the configuration of squid and tcng, there is something called the burst. I
> hope you know it. What I know about it is that it allows you to download or
> read some kilobytes at full speed. Then, if a connection downloads more than
> the burst size, it will be slowed down. And that's what the ISPs usually do. I
> used to work in a small ISP before the DSL system was rolled out all over
> Lebanon by the government, but there are still people who use the small ISP
> method because they don't want to pay a lot.

Okay.
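
For reference, the kind of burst you describe sounds like a squid "delay pool".
As far as I understand it (and this is only a rough sketch with made-up
numbers, not your ISP's actual configuration), it would look roughly like this
in squid.conf:

    # one aggregate delay pool for all clients
    delay_pools 1
    delay_class 1 1
    # restore rate / burst size, both in bytes: full speed until about
    # 100 kB have been used, then limited to 16 kB/s
    delay_parameters 1 16000/100000
    delay_access 1 allow all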

>     > So, this is what I think that apt-get should do as long as it displays
>     > the total sum of megabytes to download before the download starts.
> 
>     What has "displaying the total sum of megabytes" to do with the problem?
> 
> Because it gives you the sum/total download size of all packages to be
> downloaded, this means that apt-get knows the size of each package to be
> downloaded. Therefore you can sort the files according to their sizes and
> download the small ones first.

Yes, apt knows the individual sizes, so theoretically this is possible.

I still do not understand how "downloading the small ones first" influences the
shaping that is done by your ISP.

Imagine apt had to download five packages A, B, C, D and E, of which A, C and E
are below 100 kB. If it downloaded them sorted by size, it would first download
A, C and E and then B and D. If you are right, then your ISP would not throttle
the downloads of A, C and E because they are each smaller than 100 kB. But then
it would throttle the downloads of B and D because they are larger. This also
assumes that apt opens a new TCP connection for each download and does not use
HTTP keep-alive. I'm not sure which one apt does these days.
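
The sorting itself would of course be trivial; roughly something like this
(a Python sketch with made-up package names and sizes, not apt code):

    # "download the small packages first": sort by size, ascending
    # (names and sizes are made up for illustration)
    packages = {
        "A": 40_000,     # bytes
        "B": 2_500_000,
        "C": 80_000,
        "D": 5_000_000,
        "E": 60_000,
    }

    for name, size in sorted(packages.items(), key=lambda item: item[1]):
        print(f"would fetch {name} ({size} bytes)")

The interesting question is not the sorting but whether your ISP actually
rewards that order.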

My confusion about the sorting is because of the following. Imagine A, B, C, D
and E were downloaded in exactly that order. Would your ISP then not download A
at full speed, throttle during B, download C at full speed again because it's
small, throttle during D and download E at full speed again because it's small?

Or is there a time window within which you must not have done any "big"
downloads for the throttling to be lifted again?

Is it really the case that if you opened a new TCP connection for every 100 kB
(or whatever the exact value is), you could download at a higher speed than
when using a single TCP connection?

I assume that what is actually happening is that, due to TCP's slow start,
downloading a 100 kB file never gets you to full speed, so if you do multiple
100 kB downloads you will never max out your connection and thus will never be
throttled. But overall you will be just as fast as if you had used a single TCP
connection and been throttled.
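
Just to illustrate that slow-start argument with a back-of-the-envelope
calculation (a sketch with made-up numbers: an initial window of 10 segments,
a doubling congestion window per round trip, a 100 ms RTT and a 1 MB/s line
rate; real stacks and links will differ):

    # How long does a 100 kB transfer stay in slow start, and how close
    # does it get to the assumed line rate? All numbers are assumptions.
    MSS = 1460              # bytes per segment
    INIT_CWND = 10          # initial congestion window in segments
    RTT = 0.1               # assumed round-trip time in seconds
    LINE_RATE = 1_000_000   # assumed bottleneck rate in bytes per second

    remaining = 100 * 1000  # a 100 kB file
    cwnd = INIT_CWND
    rtts = 0
    while remaining > 0:
        sent = min(remaining, cwnd * MSS)
        rate = sent / RTT
        print(f"RTT {rtts}: {sent} bytes, ~{rate:.0f} B/s "
              f"({100 * rate / LINE_RATE:.0f}% of line rate)")
        remaining -= sent
        cwnd *= 2           # slow start doubles the window each round trip
        rtts += 1

With those numbers the whole 100 kB is gone after three round trips while
still below the line rate, which would explain why small downloads never trip
a burst limit, but also why they are not faster overall.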

>     > apt-get has to sort the package file sizes and download the smaller
>     > package files first.
> 
>     What does the order in which downloads are done have to do with your ISP
>     throttling them? You say your ISP throttles "downloads" bigger than 100 kB.
>     But what you say now also means that it puts each "download" within the
>     context of the "downloads" that happened before and after it. So what
>     *exactly* is your ISP's throttling policy?
> 
> 
> NOO! It speeds up the downloads that are smaller than 100 kB, got me? The
> 100 kB is just an estimate; I am not sure how much. All I am saying is to
> download small files before downloading big files. Just try it and you'll
> know what I mean.

Unfortunately I cannot try because my ISP has no such policy.

I suggest that you run the following test:

Put 1000 files, each 100 kB, on a server, download them one after another
using a script, and note down the time it took to download them.

Then put a single file of the same total size (100000 kB) on the same server,
download it in one go, and note down the time that took.

You can repeat this experiment, changing the 100 kB to different values, to see
how that affects your speed, and then create a graph that shows you the optimal
chunk size.
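
To make that concrete, the measurement could look roughly like this (a sketch,
assuming Python 3 with the requests library and a hypothetical
http://example.com/test/ URL layout; adjust it to whatever your server
actually serves):

    #!/usr/bin/env python3
    # Time N small downloads versus one big download of the same total size.
    # The URLs, file names and sizes are assumptions for illustration.
    import time
    import requests

    BASE = "http://example.com/test/"   # hypothetical test server
    CHUNK_KB = 100                      # size of each small file in kB
    COUNT = 1000                        # number of small files

    def timed(label, urls):
        start = time.monotonic()
        total = 0
        for url in urls:
            # One requests.get() per URL means a fresh connection per file,
            # so a per-connection burst allowance would apply to each one.
            with requests.get(url, stream=True) as r:
                r.raise_for_status()
                for block in r.iter_content(64 * 1024):
                    total += len(block)
        elapsed = time.monotonic() - start
        print(f"{label}: {total} bytes in {elapsed:.1f}s "
              f"({total / elapsed / 1000:.1f} kB/s)")

    # e.g. small-0000.bin ... small-0999.bin, each CHUNK_KB kilobytes
    timed("many small files",
          [f"{BASE}small-{i:04d}.bin" for i in range(COUNT)])

    # one file of COUNT * CHUNK_KB kilobytes
    timed("one big file", [f"{BASE}big-{COUNT * CHUNK_KB}kB.bin"])

Running that with different values for CHUNK_KB and plotting the elapsed times
against the chunk size would give exactly the kind of graph I mean.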

Using this hard evidence I think you could then convince an apt developer to
look into it. But this is only me - maybe some others on this list find this
interesting?

Maybe you can instead find an online resource which exactly describes the
caching and throttling policy used by your ISP?

I just think that you first need some harder evidence than your estimation
before anybody puts time into adding more complexity to apt. Maybe you can also
point to others having the same problem to show that you are not the only one?
This would also convince people that this is a real problem for many people.

If you know some C++ you could also write a patch for apt and then show that
with your patch you get a higher download rate than without it. That would also
be very convincing.

Also note that I'm by no means speaking for all apt developers here. This is
only my own point of view and my own personal opinion.

cheers, josch
