
Re: Traffic control



On Fri, Dec 13, 2002 at 06:03:38PM +0800, Bernard Blackham wrote:
> On Thu, Dec 12, 2002 at 10:08:23AM -0500, Marco Antonio wrote:
> > Now we are facing a problem: some people are making 'automated
> > searches' on our www server -an ugly IIS5 :), and we intend to
> > block this kind of search.
> 
> Check your logs and see what the User-Agent header is for those
> requests - you may be lucky and find just a handful like "WebSpider"
> or "Googlebot" or similar.
> 
> If this is the case, you can drop squid (or another proxy if you
> prefer) on your firewall, set it up to transparently proxy for the
> web server, and tell squid to deny requests with those User-Agent
> headers. If you're lucky! ;)
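
For reference, the squid side of that could look roughly like the
fragment below. This is only a sketch: the ACL name is made up, and the
User-Agent patterns are just the examples mentioned above - check your
own logs for the real ones.

```
# Hypothetical squid.conf fragment. The "browser" ACL type matches
# against the User-Agent request header (case-insensitive with -i).
acl badbots browser -i WebSpider Googlebot

# Deny matching requests before the general allow rule.
http_access deny badbots
http_access allow all
```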

If that is the case, putting up a proxy server is overkill. Every
reasonable bot checks for a robots.txt file in the server's root
directory. By creating one and asking the bots not to catalogue your
website, you should be able to keep them away.
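
For example, a robots.txt in the document root that asks all compliant
bots to stay out of the entire site would be just:

```
User-agent: *
Disallow: /
```

Of course this only helps against well-behaved bots; anything that
ignores robots.txt would still need blocking at the proxy or firewall.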

Ciao, Arne.
-- 
 ,``o. OpenBSD        -        Debian GNU/Linux        -        Solaris  >o)
>( ,c@ GPG 1024D/913C2F81 2000-10-11  Arne P. Boettger <apb@createx.de>  /\\
 ',,,' Fingerprint = 6ED9 9A64 CD8A EB6F D841  0391 2F08 8F86 913C 2F81 _\_V


