Re: swamp rat bots Q
On 2020-12-03 13:35, Gene Heskett wrote:
I've had it with a certain bot that ignores my robots.txt and
proceeds to mirror my site several times a day, burning up my upload
bandwidth. They've moved it to 5 different addresses since midnight.
I want to nail the door shut on the first attempted access by these AH's.
Does anyone have a ready-made script that can watch my httpd "other" log
and, if a certain name is at the end of the line, grab the IPv4 src
address (field 3 of the line) and apply it to iptables DROP rules?
Or do I have to invent a new wheel for this?
Basic rules that simplify it somewhat.
1. this is IPv4-only country and not likely to change in the future
2. the list of offending bot names will probably never go beyond 50, if
that many. 5 would be realistic.
3. the src address in the log is at a fixed offset, obtainable with
bash's substring expansion (its analogue of BASIC's MID$), but the dns
return will need some acrobatics along the lines of BASIC's RIGHT$.
4. it should track the number of hits, and after so many in a /24 block,
autoswitch to a /16 block in order to keep the rules file from growing
out of hand.
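For what it's worth, the watch-and-block loop described above can be sketched in a few lines of bash. This is a minimal sketch only: the log path and the bot names are placeholders, and the only detail taken from the post is that the source IP sits in field 3 of the log line. The /24-to-/16 escalation in rule 4 is not sketched here.

```shell
#!/usr/bin/env bash
# Sketch: follow the httpd "other" log, block offending bots via iptables.
# LOG path and BADBOTS names are assumptions -- adjust to your setup.

LOG=/var/log/apache2/other_vhosts_access.log   # assumed log path
BADBOTS='MJ12bot|SemrushBot|AhrefsBot'         # hypothetical offender names

# Pull the source address out of one log line (field 3, per the post).
extract_ip() {
    awk '{print $3}' <<<"$1"
}

# Insert a DROP rule, but only once per address (-C tests for a duplicate).
block_ip() {
    iptables -C INPUT -s "$1" -j DROP 2>/dev/null ||
        iptables -I INPUT -s "$1" -j DROP
}

# Follow the log and block the source of every line naming an offender.
# Call this from the script's entry point; it runs until interrupted.
watch_log() {
    tail -Fn0 "$LOG" | grep --line-buffered -E "$BADBOTS" |
        while read -r line; do
            block_ip "$(extract_ip "$line")"
        done
}
```

It would need to run as root for the iptables calls, and the rules vanish on reboot unless saved with iptables-save or equivalent.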
Any help will be much appreciated. PMs in this case welcome, as I can't
see broadcasting our armament against these MF'ers on a public list.
Cheers, Gene Heskett
Let me offer you an alternative option. (Most) bots work by following
the links on each page of your website. Right? So, why not add a
link to a page that normal users will never visit (e.g. because they do
not see the link and thus will never click on it), but which will show
up in a bot's crawl? That way you can monitor your logs for requests
for that page. Every entity requesting that specific URL is blocked.
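To make that concrete: the trap could be an invisible link such as '<a href="/do-not-crawl.html" style="display:none"></a>' on each page, with the same path Disallow'ed in robots.txt so only rule-ignoring bots ever fetch it. A sketch of the matching log check, assuming a combined-format access log where the client IP is field 1 and the request path is field 7; the trap URL and log path are hypothetical:

```shell
#!/usr/bin/env bash
# Honeypot-check sketch. TRAP and LOG are placeholders.

TRAP=/do-not-crawl.html            # hypothetical hidden trap page
LOG=/var/log/apache2/access.log    # assumed combined-format access log

# Print the client IP if the given log line requested the trap URL,
# nothing otherwise (combined format: IP is $1, request path is $7).
trap_hit_ip() {
    awk -v trap="$TRAP" '$7 == trap {print $1}' <<<"$1"
}

# Example loop: feed new log lines through the check and drop any hits.
watch_trap() {
    tail -Fn0 "$LOG" |
        while read -r line; do
            ip=$(trap_hit_ip "$line")
            [ -n "$ip" ] && iptables -I INPUT -s "$ip" -j DROP
        done
}
```

Same caveats as any iptables approach: needs root, and rules should be persisted if you want them to survive a reboot.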