
Re: Stopping webcrawlers.



On Sun, Nov 03, 2019 at 10:04:46AM -0500, Gene Heskett wrote:
> Greetings all
> 
> I am developing a list of broken webcrawlers who are repeatedly 
> downloading my entire web site including the hidden stuff.
> 
> These crawlers/bots are ignoring my robots.txt

$ wget -O - https://www.shentel.com/robots.txt
--2019-11-03 15:22:35--  https://www.shentel.com/robots.txt
Resolving www.shentel.com (www.shentel.com)... 45.60.160.21
Connecting to www.shentel.com (www.shentel.com)|45.60.160.21|:443...  connected.
HTTP request sent, awaiting response... 403 Forbidden
2019-11-03 15:22:36 ERROR 403: Forbidden.

Allowing said bots to *see* your robots.txt would be a step in the
right direction.
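
For what it's worth, a well-behaved crawler fetches robots.txt first and
checks each URL against it before downloading. A minimal sketch with
Python's stdlib parser, using made-up rules and an example.com URL purely
for illustration:

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only, not taken from any real site.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /hidden/",
])

# A compliant crawler asks before fetching each URL.
print(rp.can_fetch("SomeBot", "https://example.com/hidden/page"))  # False
print(rp.can_fetch("SomeBot", "https://example.com/public/page"))  # True
```

Of course, none of this helps against crawlers that never look at
robots.txt in the first place; compliance is entirely voluntary.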

Reco
