
Re: new, not nice web bots disposal



On Wednesday 26 February 2020 04:05:53 john doe wrote:

> On 2/26/2020 9:57 AM, Gene Heskett wrote:
> > Over the last 90 days or so, we seem to have been plagued with a new
> > breed of bots scanning our web pages, and they are not just indexing
> > our web pages, which I don't mind; they are ignoring our
> > robots.txt and mirroring anything apache2 can reach, including
> > stuff that's there but not reachable by a normal browser just looking
> > around and clicking on links.  It's annoying as hell, and when you're
> > out in the pucker-brush on a 10 megabit ADSL, it eats up one's
> > available upload bandwidth of about 275 kbytes/s.  According to my
> > cable billing, these A-H's used over 100Gb of my bandwidth in Nov 2019.
> > In printable language, that is a DDOS in my vocabulary.
> >
> > So I asked a few questions and wrote some little 2-3 line scripts
> > after putting a tail on /var/lib/httpd/other_vhosts_access.log,
> > which logs enough info you can generally identify the bots with it.
> >
> > I have since generated 49 iptables rules that have blocked 99%
> > of them.
> >
> > Those scripts I've placed in /etc/iptables and are owned by root.
> > To start iptables after a reboot, you might run this first one
> > from /etc/rc.local:
> >
> > root@coyote:iptables$ cat start-iptables
> >
> > #!/bin/bash
> > cd /etc/iptables
> > iptables-restore <rules.v4
> >
> > To add a new rule, covering that whole 256 address block because
> > they seem to have a random address, changed about weekly, in that
> > block:
> >
> > root@coyote:iptables$ cat iptables-add
> >
> > #!/bin/bash
> > iptables -I INPUT -s add.ress.to.block/24 -j DROP
> >
> > Substitute the address of the offender for add.ress.to.block in
> > the last line above.
> >
> > To save the rules:
> > root@coyote:iptables$ cat iptables-saveem
> >
> > #!/bin/bash
> > iptables-save >rules.v4
> >
> > To see what you've got so far:
> > root@coyote:iptables$ cat iptables-status
> >
> > #!/bin/bash
> > iptables -L -nv --line-numbers
> >
> > Which will output the rules in effect plus the hits accumulated in
> > this uptime so far, in this format:
> > lnum   hits  bytes fate
> > 24     846   50760 DROP  all  --  *  *   66.249.64.0/24   0.0.0.0/0
> >
> > Be my guest folks, reclaim the net, we are paying for the bandwidth
> > these jerks are burning up.
>
> The above is the way the OP has chosen to go about it, but configuring
> apache properly and using fail2ban, in addition to the robots.txt file,
> should also be considered.

That was also suggested and tried, for about a week, but there were no
failures to initiate the ban from. So I got out a bigger hammer.  This
works for us little folks. If those bots wanted to be nice and just index
my pages, I'd not have a problem, as it would help advertise that I might
have something they'd want. But they want to mirror the whole site, then
each one wants to do it again half a week later, before they've got the
first full copy, and there are several hundred of them doing it. That is a
DDOS when they burn up all my upload bandwidth on a 24/7 basis. As an
old friend of mine, JoAnne Dow, has said, trying to be a lady, in this
case years before Linux was written: screewwuum.  We were still using
Amigas and a web browser called Miami in those days; before that, POTS
and a 300 baud modem making calls from a TRS-80 Color Computer 1 with a
grand total of 64k for memory.

We were talking about a WD hard drive whose firmware could not handle a
write bigger than one 256-byte sector at a time. ISTR it was a 15
megabyte drive, then state of the art. WD's response? Works on
DOS, shrug.

What I fail to understand is why there is such universal reluctance on the
part of all the know-it-alls here to use the almost 20 year old tools given
us to control these jerks, particularly when it looks like it's the only
tool that actually and effectively works.
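
For anyone who would rather not pick the offenders out of the tail by eye,
the following is a minimal sketch, not the scripts described above. It
assumes the access log is in Apache's vhost_combined format, where the
client address is the second field, and that the clients are IPv4. The
function name top_blocks and the threshold of 500 hits are made up for
illustration. It only counts hits per /24 block; the commented usage line
prints candidate iptables commands and runs none of them.

```shell
#!/bin/bash
# Hypothetical helper (not from the original scripts): read an Apache
# access log on stdin and print "hits block" for every /24 block with
# at least $1 requests (default 100), busiest first.
# Assumption: vhost_combined log format, so field 2 is the client IPv4.
top_blocks() {
    local threshold="${1:-100}"
    awk -v min="$threshold" '
        # Only count lines whose second field looks like a dotted quad.
        $2 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
            split($2, o, ".")
            count[o[1] "." o[2] "." o[3] ".0/24"]++
        }
        END {
            for (b in count)
                if (count[b] >= min)
                    printf "%d %s\n", count[b], b
        }
    ' | sort -rn
}

# Example usage (prints the commands for review, executes nothing):
# top_blocks 500 < /var/lib/httpd/other_vhosts_access.log |
#     while read -r hits block; do
#         echo "iptables -I INPUT -s $block -j DROP"
#     done
```

Printing the commands instead of running them leaves a chance to check each
block by hand before it goes into the rules, since a /24 covers 256
addresses and may catch visitors who have done nothing wrong.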

> See past threads from this OP for an history of this.
>
> --
> John Doe

Who is John Doe, and how fast was your first modem? Mine was 300 baud.
And it was a long distance call 18 miles to a Delphi access point. Long
before Judge Greene.

Cheers, Gene Heskett, who is not hiding behind an alias.
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Genes Web page <http://geneslinuxbox.net:6309/gene>

