
Re: robots.txt (was Re: Download a whole gopherhole using wget/curl?)



On 11/28/19 3:33 PM, Sean Conner wrote:
It was thus said that the Great Christoph Lohmann once stated:
Greetings.

For crawling, I already proposed some robots.txt for gopherspace,
for eomyidae and whatever follows it.

See gopher://gopherproject.org/1/eomyidae for the details. All
automatic crawlers or archivers should keep to this standard.

   I have a question about this, and it relates to the section I added last
night to my gopherhole.  I read the document given above, and in it I
found:

	Now put into this file:

		User-agent: eomyidae/0.3
		Disallow: /

	Or to disallow all crawlers:

		User-agent: *
		Disallow: /

   That follows directly from the standard for HTTP, but gopher isn't HTTP.
I'm asking because very few selectors in my gopherhole start with a '/' [1],
so this doesn't really work for me if I wanted to block all crawlers from my
site (which I don't do [2]).

   But if I wanted to block robots from crawling the black hole I created,
would the following actually work?

		User-agent: *
		Disallow: BlackHole:
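
Whether that works depends entirely on how a given crawler matches
Disallow values against selectors.  Here is a minimal sketch in Python,
assuming a crawler does plain prefix matching of Disallow values against
the raw gopher selector (an illustration only, not eomyidae's actual code;
the agent names and selectors are made up):

	def parse_robots(text):
	    # Collect Disallow prefixes per User-agent group
	    # (simplified: one agent name per group).
	    rules = {}
	    agent = None
	    for line in text.splitlines():
	        line = line.split('#', 1)[0].strip()
	        if not line:
	            continue
	        field, _, value = line.partition(':')
	        field, value = field.strip().lower(), value.strip()
	        if field == 'user-agent':
	            agent = value
	            rules.setdefault(agent, [])
	        elif field == 'disallow' and agent is not None:
	            rules[agent].append(value)
	    return rules

	def allowed(rules, agent, selector):
	    # Blocked if any Disallow prefix of the matching group is a
	    # prefix of the raw selector; an empty Disallow blocks nothing.
	    prefixes = rules.get(agent, rules.get('*', []))
	    return not any(p and selector.startswith(p) for p in prefixes)

	rules = parse_robots("User-agent: *\nDisallow: BlackHole:\n")
	print(allowed(rules, 'somebot', 'BlackHole:0000'))  # False -> blocked
	print(allowed(rules, 'somebot', 'Phlog:2019.11'))   # True  -> crawlable

Under that kind of prefix matching the answer is yes: selectors need not
start with '/', so "Disallow: BlackHole:" blocks anything whose selector
begins with "BlackHole:".  Whether a real crawler treats Disallow as a raw
selector prefix is exactly the open question.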

   -spc (Who doesn't have a conventional gopherhole ... )

[1]	The only two are:

		/robots.txt
		/caps.txt

[2]	My robots.txt file:

		User-agent: *
		Disallow:


I used this:

User-agent: *
Disallow: /
User-agent: veronica
Allow: /
User-agent: eomyidae/0.3
Allow: /
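
That relies on per-agent groups: a crawler is supposed to use the group
that names it and fall back to "*" only otherwise, so veronica and
eomyidae/0.3 see "Allow: /" while everything else falls under
"Disallow: /".  A rough sketch of that resolution in Python (assuming the
site's selectors start with '/'; note that Allow and this resolution order
are not in the original 1994 convention, so support varies by crawler):

	GROUPS = {
	    '*':            {'Disallow': ['/'], 'Allow': []},
	    'veronica':     {'Disallow': [], 'Allow': ['/']},
	    'eomyidae/0.3': {'Disallow': [], 'Allow': ['/']},
	}

	def may_fetch(agent, selector):
	    g = GROUPS.get(agent, GROUPS['*'])   # a named group wins over "*"
	    allow = max((len(p) for p in g['Allow'] if selector.startswith(p)), default=-1)
	    block = max((len(p) for p in g['Disallow'] if selector.startswith(p)), default=-1)
	    return allow >= block                # longest match wins, Allow on a tie

	print(may_fetch('veronica', '/phlog/2019'))  # True  -> crawl everything
	print(may_fetch('somebot', '/phlog/2019'))   # False -> blocked by "*"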

--
Nathaniel Leveck
gopher://1436.ninja
https://leveck.us

