
Re: Debian mirrors and search engines policy


On Wed, Feb 08, 2012 at 10:40:44PM +0100, Jogi Hofmüller wrote:
> While browsing the stats of our small debian mirror [1] today I noticed
> that according to awstats, search engines produce more load (size-wise)
> than users of the mirror.  Interestingly the ratio between search
> engines and users is totally different if you compare IPv4 and IPv6.
> The IPv6 side of the mirror has no search engine traffic worth
> mentioning, not even google pays a visit ;)

Are you sure AWStats reports the amount of data actually transferred? My
guess is that Google starts downloading a file and aborts after a few
bytes, which might still be logged as a full download. At least that is
the behaviour I remember from using AWStats (not on my mirror, though).
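
One rough way to cross-check the AWStats numbers would be to sum the size
field from the raw access log yourself. The following is only a sketch: it
assumes the mirror writes the Apache-style "combined" log format, and the
names LINE_RE, BOT_HINTS and bytes_by_client_class are made up for this
example. Whether the logged size reflects an aborted transfer depends on
the server and its log configuration, so treat the result as a second
opinion, not as ground truth.

```python
import re
from collections import defaultdict

# Assumption: Apache-style "combined" log format, i.e.
# host ident user [time] "request" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical, deliberately crude list of substrings that mark a robot.
BOT_HINTS = ("bot", "crawler", "spider", "slurp")


def bytes_by_client_class(lines):
    """Sum the logged response sizes, split into 'robots' and 'others'."""
    totals = defaultdict(int)
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        # A size of "-" means no body was logged; count it as zero bytes.
        size = 0 if match.group("size") == "-" else int(match.group("size"))
        agent = match.group("agent").lower()
        kind = "robots" if any(hint in agent for hint in BOT_HINTS) else "others"
        totals[kind] += size
    return dict(totals)
```

You would feed it an open log file, e.g. bytes_by_client_class(open(path))
with path pointing at your own access log, and compare the two totals with
what AWStats shows.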

> Not only do I think this overhead in traffic in favor of search engines
> is annoying, I also think it's useless (at least from my personal
> experience in using debian).  So I am curious how others handle this
> issue and/or if anyone has a policy regarding search engines.  I know
> that not all robots do respect the robots.txt file, but maybe it's of
> use here.

We block:

Carsten Otto           otto@informatik.rwth-aachen.de
LuFG Informatik 2      http://verify.rwth-aachen.de/otto/
RWTH Aachen            phone: +49 241 80-21211
