Re: robots.txt (was Re: Download a whole gopherhole using wget/curl?)
Greetings.
On Fri, 29 Nov 2019 06:19:46 +0100 Sean Conner <sean@conman.org> wrote:
> I have a question about this, and it relates to the section I added last
> night to my gopherhole. I read the document given above, and in there, I
> read:
>
> Now put into this file:
>
> User-agent: eomyidae/0.3
> Disallow: /
>
> Or to disallow all crawlers:
>
> User-agent: *
> Disallow: /
>
> That follows directly from the standard for HTTP, but gopher isn't HTTP.
> I'm asking because very few selectors in my gopherhole start with a '/' [1],
> so this doesn't really work for me if I wanted to block all crawlers from my
> site (which I don't do [2]).
>
> But if I wanted to block robots from crawling the black hole I created,
> would the following actually work?
>
> User-agent: *
> Disallow: BlackHole:
Good point. In eomyidae you have two possibilities:
User-Agent: *
Disallow: *
and
User-Agent: *
Disallow:
I have changed the eomyidae hole to clarify this.
But eomyidae already special-cases »/« to mean: do not crawl anything.
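As a minimal sketch (not eomyidae's actual code), this is how a gopher
crawler might interpret Disallow rules as plain prefix matches on the
selector, with »*« and »/« treated as "block everything" and an empty
value as "block nothing". The function name and rule handling here are
illustrative assumptions, not taken from eomyidae:

```python
def is_disallowed(selector, rules):
    """Return True if any Disallow value blocks the given selector.

    rules: the Disallow values for the matching User-Agent record.
    Gopher selectors need not start with "/", so each rule is
    treated as a plain prefix match on the selector.
    """
    for rule in rules:
        if rule == "":
            continue              # "Disallow:" with no value blocks nothing
        if rule in ("*", "/"):
            return True           # block the whole hole
        if selector.startswith(rule):
            return True           # plain prefix match on the selector
    return False

# Sean's black-hole case: block selectors starting with "BlackHole:"
print(is_disallowed("BlackHole:0", ["BlackHole:"]))  # True
print(is_disallowed("Phlog/2019", ["BlackHole:"]))   # False
```

Under these assumptions, »Disallow: BlackHole:« would indeed keep a
conforming crawler out of the black hole while leaving the rest of the
hole crawlable.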
Sincerely,
Christoph Lohmann
💻 https://r-36.net
💻 gopher://r-36.net
☺ https://r-36.net/about
🔐 1C3B 7E6F 9805 E5C8 C0BD 1F7F EA21 7AAC 09A9 CB55
🔐 http://r-36.net/about/20h.asc
📧 20h@r-36.net