
Re: swamp rat bots Q



On Friday 04 December 2020 16:14:29 Tixy wrote:

> On Fri, 2020-12-04 at 14:51 -0500, Gene Heskett wrote:
> > On Friday 04 December 2020 12:39:24 Reco wrote:
> > >       Hi.
> > >
> > > On Fri, Dec 04, 2020 at 08:39:42AM -0500, Gene Heskett wrote:
> > > > But I asked specifically how to enable it for one bot, and I've
> > > > asked that question several times, getting smoke and mirror
> > > > answers you all assume are helpful, but which are useless to a
> > > > new user installing the now 7 years old and long out of date
> > > > package that in effect has no "how it works" docs. I asked 3
> > > > questions in a previous day or so timeline, and no one has
> > > > actually attempted to actually answer even one of them. Here is
> > > > one line from that log: and that I just blocked:
> > > >
> > > > coyote.coyote.den:80 192.99.6.226 - -
> > > > [04/Dec/2020:07:18:20 -0500] "GET
> > > > /gene/toolshed/c3/build/win32/prep/?C=S;O=D HTTP/1.1" 200 673
> > > > "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.8;
> > > > http://mj12bot.com/)"
> > >
> > > Taken directly from the link.
> > >
> > > Bot Type         Good crawler (always identifies itself)
> > > IP Range         Distributed, Worldwide
> > > Obeys Robots.txt *Yes*
> >
> > Sorry, they do not, they've read it and ignored it 428 times in the
> > life of that log which I zeroed out around 1 July of this year.
>
> Why would they read it if they were going to just ignore it? Perhaps
> your robots.txt is broken? Hint: it is, in 2 or 3 different ways I can
> see (if it's http://geneslinuxbox.net:6309/robots.txt we're talking
> about). That file doesn't have any syntactically correct entry in
> there for blocking that bot.

And what might that look like? I'll fix it right now.

> I don't know why you seem set on blaming malice on part of a bot whose
> front web page has sections like:

The evidence I have collected so far indicates they don't care who they
DDoS with their repeated suckage.  So I never considered allowing their
site anything like direct access by going to it.  Their actions speak 
MUCH louder than the words below.

>    How can I block MJ12bot?
>    How can I slow down MJ12bot?
>    What commands in robots.txt does MJ12bot support?
>    Why did my robots.txt block not work on MJ12bot?
>
> The URL for that page is in the user-agent string from the log snippet
> you posted above.
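
For reference, a minimal sketch of how a robots.txt entry for this bot could
be checked locally before trusting it. The robots.txt text below is an
assumption: it is the conventional per-bot block form, not the contents of
the file discussed above and not a confirmed copy of what the bot's own page
recommends. The helper name is_allowed is made up for illustration. It uses
Python's standard urllib.robotparser, which matches on the bot's product
token ("MJ12bot"), not on the full user-agent string from the log.

```python
from urllib.robotparser import RobotFileParser

# Assumed example file: one record that blocks MJ12bot from the whole site.
ROBOTS_TXT = """\
User-agent: MJ12bot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    path = "/gene/toolshed/c3/build/win32/prep/"  # path taken from the log line
    print("MJ12bot allowed:", is_allowed(ROBOTS_TXT, "MJ12bot", path))
    print("SomeOtherBot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", path))
```

This only shows how one parser reads the file. It says nothing about whether
a given crawler actually honors it, which still has to be judged from the
access log.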


Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Genes Web page <http://geneslinuxbox.net:6309/gene>

