
Re: Recent page visit statistics



* Thomas Lange <lange@cs.uni-koeln.de> [250723 10:56]:
I prepared a list of bot user-agents when I analysed the
www.debian.org logs. It will not cover all bots, but most of those
that send their user-agent string. I did not try to exclude IP addresses,
but there's a list of good bots: https://github.com/AnTheMaker/GoodBots

This is my regex file for grep -vf to exclude some bots:


Wget
curl/
[..]
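
For anyone who would rather not shell out to grep, the same kind of
filtering can be sketched in Python. This is only a rough illustration,
not the script used on the www.debian.org logs: the function names and
the sample log lines are made up, only the two patterns visible above
are included (the rest of the list is elided in the post), and Python's
re syntax is not identical to grep's basic regular expressions, so more
complex patterns might need adjusting.

```python
import re

# Patterns as they would appear in the pattern file, one regex per line.
# Only the two entries shown in the post; the remainder is elided there.
BOT_PATTERNS = ["Wget", "curl/"]


def compile_patterns(lines):
    """Join the non-empty pattern lines into one alternation (like grep -f)."""
    parts = [line.strip() for line in lines if line.strip()]
    return re.compile("|".join(f"(?:{p})" for p in parts))


def drop_bot_lines(log_lines, patterns):
    """Keep only the lines that match none of the patterns (the -v part)."""
    bot_re = compile_patterns(patterns)
    return [line for line in log_lines if not bot_re.search(line)]


# Hypothetical access-log lines, invented for this example.
SAMPLE_LOG = [
    '192.0.2.1 - - [23/Jul/2025:10:56:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Wget/1.21.3"',
    '192.0.2.2 - - [23/Jul/2025:10:56:01 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/8.5.0"',
    '192.0.2.3 - - [23/Jul/2025:10:56:02 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0"',
]

if __name__ == "__main__":
    for line in drop_bot_lines(SAMPLE_LOG, BOT_PATTERNS):
        print(line)
```

Blank lines in the pattern list are skipped here on purpose: with
grep -f an empty pattern matches every line, so grep -vf would then
drop everything.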

From what I've seen in other places, the AI scrapers send (semi-)real User-Agents, mimicking Chrome, Firefox, MSIE, etc.

Chris

