Re: rogue Chinese crawler
On Sat, 24 Nov 2001, Hereward Cooper wrote:
> > Despite what I put in any robots.txt, this one disregards all rules and
> > just jams up my system, downloading every damn' thing in sight.
> > Mails to the owners are totally disregarded.
> Have you actually seen:
> It talks about the robot and how to get it to stop accessing your site.
Hereward -- did you read my first paragraph?
(But to answer your question -- yes -- _of course_ I've actually seen
the robot.html page.)
The site tacitly admits that people are having difficulty getting rid of
the 'bots -- and DESPITE applying every fix so far suggested to me, this
is from tonight's access.log:
robot12.openfind.com - - [24/Nov/2001:18:38:27 +0000] "GET
/familycentury/twins/ HTTP/1.0" 200 3054 "-" "Openfind data gatherer,
(Mind you, I may slowly be working my way through killing off 16
different 'bots, but I'm still leery.) I seem to have reduced it to 5
hits per attack -- instead of 45 minutes' continuous lockup, as
experienced yesterday -- but it seems to be ignoring all attempts to
keep it out of my system.
From off-list correspondence I know I'm not the only victim, either.
Degree of annoyance varies from site to site (possibly dependent on
overall site setup).
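
For anyone who wants to put a number on "hits per attack" from their own access.log, here is a rough Python sketch, not a tested tool. It assumes the log uses the usual Common Log Format prefix (host, ident, user, timestamp, request, status, size) and that the server records hostnames rather than bare IP addresses, as in the entry quoted above. The function name hits_per_host and every sample line except the first are made up for illustration.

```python
import re
from collections import Counter

# Common Log Format prefix: host ident user [timestamp] "request" status size
# Anything after the size field (referer, user agent) is ignored.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def hits_per_host(lines, domain_suffix):
    """Count requests per client host for hosts inside domain_suffix.

    hits_per_host is a hypothetical helper name, not part of any library.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like log entries
        host = match.group("host").lower()
        # Match the domain itself or any subdomain, but not look-alikes
        # such as "notopenfind.com".
        if host == domain_suffix or host.endswith("." + domain_suffix):
            counts[host] += 1
    return counts


# The first entry is the one quoted in this thread (user agent omitted);
# the remaining entries are invented sample data.
SAMPLE = [
    'robot12.openfind.com - - [24/Nov/2001:18:38:27 +0000] '
    '"GET /familycentury/twins/ HTTP/1.0" 200 3054',
    'robot03.openfind.com - - [24/Nov/2001:18:38:29 +0000] '
    '"GET /index.html HTTP/1.0" 200 1200',
    'robot12.openfind.com - - [24/Nov/2001:18:38:31 +0000] '
    '"GET /robots.txt HTTP/1.0" 200 64',
    'client.example.org - - [24/Nov/2001:18:39:02 +0000] '
    '"GET /index.html HTTP/1.0" 200 1200',
    'not a log line',
]

if __name__ == "__main__":
    for host, count in hits_per_host(SAMPLE, "openfind.com").most_common():
        print(host, count)
```

To run it against a real log, pass an open file instead of SAMPLE, e.g. hits_per_host(open("access.log"), "openfind.com"). If the server logs IP addresses only, the suffix match finds nothing and the addresses would need resolving first.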