
question about snapshot.debian.org

Hello Peter, nice to virtually meet you!
I'm Luigi, a sysadmin working for Sysdig. I saw that you are the developer and maintainer of snapshot.debian.org. I'm writing a crawler to fetch all the old Debian linux-image and linux-kernel deb packages so that we can pre-compile a kernel probe for the sysdig project.

I noticed that the crawler is really slow, so I did some profiling with cProfile (I'm using Python).

Most of the time is spent in the open function that grabs the HTML from the website.
I was wondering whether there is anything you could do on your side to improve the performance of the website, such as adding a CDN or a Varnish cache, or spotting a bottleneck you may have on your side.
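
For context, a profile of this kind can be collected roughly as follows. This is only a minimal sketch, not the actual crawler: `fetch_page`, `parse_page`, `crawl` and `profile_crawl` are hypothetical names, and the download step is simulated with a short sleep so the example runs without touching the network.

```python
import cProfile
import io
import pstats
import time


def fetch_page(url):
    """Hypothetical stand-in for the crawler's download step.

    A real crawler would open the URL and read the HTML here;
    the sleep only simulates a slow network round trip.
    """
    time.sleep(0.05)
    return "<html><body></body></html>"


def parse_page(html):
    """Hypothetical stand-in for the HTML parsing step."""
    return html.count("<")


def crawl(urls):
    """Fetch and parse every URL in order."""
    return [parse_page(fetch_page(url)) for url in urls]


def profile_crawl(urls):
    """Run crawl() under cProfile and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    crawl(urls)
    profiler.disable()
    out = io.StringIO()
    # Sort by cumulative time so the slowest call chains come first.
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_crawl(["http://snapshot.debian.org/package/linux/4.6~rc3-1~exp1/"]))
```

In a report like this, a download-bound crawler shows the fetch function near the top of the cumulative-time column, well above the parsing step.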

Here is an example of the time it takes to grab a page from snapshot.debian.org from an AWS instance in the us-east-1 region (as you can see, it took 20 seconds):
[root@ip-10-10-1-128 ~]# curl -o /dev/null http://snapshot.debian.org/package/linux/4.6~rc3-1~exp1/

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1255k    0 1255k    0     0  61728      0 --:--:--  0:00:20 --:--:--  337k
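
The same measurement could be repeated from Python, for example to compare several pages or regions. The following is a rough sketch under stated assumptions: `timed_fetch` and `throughput` are hypothetical helper names, and the `opener` parameter exists only so that the download function can be swapped out; by default it uses the standard `urllib.request.urlopen`.

```python
import time
import urllib.request


def timed_fetch(url, opener=urllib.request.urlopen):
    """Download url and return (size in bytes, elapsed seconds)."""
    start = time.perf_counter()
    with opener(url) as response:
        body = response.read()
    elapsed = time.perf_counter() - start
    return len(body), elapsed


def throughput(nbytes, seconds):
    """Average download speed in bytes per second."""
    return nbytes / seconds if seconds > 0 else float("inf")


if __name__ == "__main__":
    url = "http://snapshot.debian.org/package/linux/4.6~rc3-1~exp1/"
    size, seconds = timed_fetch(url)
    print(f"{size} bytes in {seconds:.1f}s, {throughput(size, seconds):.0f} B/s")
```

For the transfer shown above, roughly 1255 KiB in about 20 seconds works out to an average in the region of 60 KB/s, which matches the average download speed curl reports.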

Looking forward to your reply.
“The only way to get smarter is by playing a smarter opponent.”
