
Re: netselect - choosing the best FTP server automatically



On Thu, Jun 25, 1998 at 09:18:24PM -0600, Jason Gunthorpe wrote:

> On Thu, 25 Jun 1998, Avery Pennarun wrote:
[netselect   http://www.worldvisions.ca/~apenwarr/netselect-0.1.tar.gz]

> This is pretty good, but it seems to lose much of its meaning for me with
> several equal servers,
> 
> ftp1.us.debian.org                     111 ms   16 hops   91% ok (31/34)
> llug.sep.bnl.gov                        83 ms   15 hops  100% ok (110/110)
> ftp.debian.org                          80 ms   19 hops   93% ok (40/43)
> ftp.cdrom.com                           87 ms   13 hops   92% ok (39/42)
> --
> ftp1.us.debian.org                     115 ms   16 hops   93% ok (31/33)
> llug.sep.bnl.gov                        79 ms   15 hops   85% ok (18/21)
> ftp.debian.org                         100 ms   19 hops   90% ok (28/31)
> ftp.cdrom.com                          105 ms   13 hops   90% ok (27/30)
> 
> I'm sitting on a high speed LAN connection (ie I get ~100k/s from llug).
> Presumably this program pushes the network a bit hard and that is why
> there is packet loss; a straight ping to any of these sites will get 0%
> loss.

There are two big problems I've noticed with the first version (I wrote it
all last night and have no more time to work on it for at least three weeks;
sorry):

- it stops as soon as the hop count of all sites is determined, AND
  the "minimum time" has elapsed.  (The default minimum isn't very strict.)
  Requests that it sends out might not be back before it gives up!

- because of random delays (and because if the TTL attempt is too
  low, it isn't valid for a timing check) the number of packets sent to each
  host varies wildly.  Too-low packet counts can make this really stand out,
  since getting 4/5 packets is much worse than 29/30, even though only one
  packet was lost.
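
To put numbers on that, here is a quick Python sketch.  The function name
percent_ok is made up for illustration and is not part of netselect; it
just recomputes the "NN% ok (received/sent)" column shown above.

```python
def percent_ok(received, sent):
    """Percentage of probes answered, as in the 'NN% ok (r/s)' column."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * received / sent


# Exactly one packet is lost in both cases, but the scores differ a lot.
small_sample = percent_ok(4, 5)     # 80.0
large_sample = percent_ok(29, 30)   # roughly 96.7

print(f"4/5   -> {small_sample:.1f}% ok")
print(f"29/30 -> {large_sample:.1f}% ok")
```

With only a handful of packets, one unlucky drop dominates the result,
which is why hosts need comparable (and reasonably large) packet counts.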

I don't consider the ping time variance a big deal -- because it sends out
lots of packets at once, they will get queued up a bit, but (theoretically,
anyway) since each host is equally affected this effect should mostly
average out.  I'm not surprised by +/- 50 ms, though.

If the two bugs above were fixed, I think this would be usable.  After all,
comparing equal hosts isn't all that useful anyway.  The ones we want to
filter out are the ones with really large lag times (> 500 ms) and large
packet losses (> 10%).
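
As a rough illustration of that kind of filtering, a Python sketch might
look like the following.  The thresholds come from the paragraph above;
the function name, the tuple layout and the host slow.example.org are
assumptions for the example, not anything netselect actually does.

```python
MAX_LAG_MS = 500.0     # "really large lag times (> 500 ms)"
MAX_LOSS_PCT = 10.0    # "large packet losses (> 10%)"


def usable(lag_ms, received, sent):
    """True if a host is under both the lag and the packet loss limits."""
    loss_pct = 100.0 * (sent - received) / sent
    return lag_ms <= MAX_LAG_MS and loss_pct <= MAX_LOSS_PCT


# (name, lag in ms, packets received, packets sent)
hosts = [
    ("llug.sep.bnl.gov", 83, 110, 110),
    ("ftp1.us.debian.org", 111, 31, 34),
    ("slow.example.org", 650, 20, 20),   # hypothetical far-away mirror
]

keep = [name for name, lag, rcvd, sent in hosts if usable(lag, rcvd, sent)]
print(keep)
```

Here the two real hosts survive (ftp1 loses about 8.8% of its packets)
and the hypothetical 650 ms host is dropped.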

Bandwidth comparison a la bing would also be nice, but I definitely don't
have time to implement that anytime soon.

> My question is how can we build a single weighted score for each site? Let
> the user pick one of the top two or so. But how do you weight?

Assuming that the packet loss and timing statistics were made more
reliable...

I would say that packet loss and delay are of almost equal importance, and
hop count considerably less.  With bugfixes we should be able to attribute
about +/- 5% of packet loss and 50ms of delay to random chance, so we
need a nice binary comparison routine.  Here's a random one:

	- pull each site's connectivity toward the average of the two by,
	  say, 3 percentage points.  ie. if A has 85% connectivity and B has
	  95% connectivity, adjust to 88% and 92%.  If the average is less
	  than 3 points away, just set both to the average.
	  
	- similarly with 25ms of lag time.  Given 200ms and 300ms, we get
	  225ms and 275ms.
	  
	- Take the percentage difference from the average for each measure,
	  then subtract the lag figure from the connectivity figure (note
	  that smaller lag is better, and larger connectivity is better).
	  
	  Hmm...
	  
	                    - A -                   - B -
	    Connectivity:   (88-90)/90 = -2.22%     (92-90)/90 = +2.22%
	    Lag:            (225-250)/250 = -10%    (275-250)/250 = +10%
	    Subtract:       +7.78%                  -7.78%
	    
	  Note that because both differences are taken relative to the
	  same average, the two scores are always equal and opposite, so
	  we only actually need to compute one of these.

	- Higher score wins.
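
For anyone who wants to play with it, the steps above can be sketched in
Python roughly like this.  It is only an illustration of the idea: the
names gravitate, pct_diff and score, and the default slack values of 3
points and 25 ms, are my own picks for the example.

```python
def gravitate(a, b, amount):
    """Pull a and b toward their mean by `amount`; snap to the mean if closer."""
    mean = (a + b) / 2.0

    def pull(x):
        if abs(x - mean) <= amount:
            return mean
        return x + amount if x < mean else x - amount

    return pull(a), pull(b), mean


def pct_diff(value, mean):
    """Percentage difference of value from mean (0 if the mean is 0)."""
    return 100.0 * (value - mean) / mean if mean else 0.0


def score(conn_a, lag_a, conn_b, lag_b, conn_slack=3.0, lag_slack=25.0):
    """Score of site A relative to site B; positive means A looks better."""
    adj_conn, _, conn_mean = gravitate(conn_a, conn_b, conn_slack)
    adj_lag, _, lag_mean = gravitate(lag_a, lag_b, lag_slack)
    # Higher connectivity is good, higher lag is bad, hence the subtraction.
    return pct_diff(adj_conn, conn_mean) - pct_diff(adj_lag, lag_mean)


# The worked example: A = 85% / 200 ms, B = 95% / 300 ms.
a_score = score(85, 200, 95, 300)
b_score = score(95, 300, 85, 200)
print(f"A: {a_score:+.2f}%   B: {b_score:+.2f}%")
```

Swapping the arguments just flips the sign, so comparing two sites only
takes one call; A wins if the result is positive.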
	  
Disclaimer:  IANAS.  (I Am Not A Statistician :))

Anyone is invited to work on this since I have very little time to do
so.  Please forward me any improvements or ideas.

Have fun,

Avery



