
robots.txt (was Re: Download a whole gopherhole using wget/curl?)



It was thus said that the Great Christoph Lohmann once stated:
> Greetings.
> 
> For the crawling I already proposed for eomyidae and what if
> follows some robots.txt for gopherspace.
> 
> See gopher://gopherproject.org/1/eomyidae for the details. All
> automatic crawlers or archivers should keep to this standard.

  I have a question about this, and it relates to the section I added last
night to my gopherhole.  I read the document given above, and in there, I
read:

	Now put into this file:
  
        	User-agent: eomyidae/0.3
        	Disallow: /
  
	Or to disallow all crawlers:
  
        	User-agent: *
        	Disallow: /

  That follows directly from the standard for HTTP, but gopher isn't HTTP. 
I'm asking because very few selectors in my gopherhole start with a '/' [1],
so this doesn't really work for me if I wanted to block all crawlers from my
site (which I don't do [2]).

  But if I wanted to block robots from crawling the black hole I created,
would the following actually work?

		User-agent: *
		Disallow: BlackHole:
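
  For what it's worth, here is a minimal sketch of the matching I'm assuming
a crawler would do: treat each Disallow value as a literal prefix of the raw
selector, with an empty value meaning allow everything.  The selector names
below are just made-up examples, not anything taken from the eomyidae
document:

	# Assumption: Disallow values are literal prefixes of the raw
	# gopher selector; an empty Disallow means "allow all".
	def parse_robots(text):
	    """Collect Disallow prefixes that apply to User-agent: * (simplified)."""
	    prefixes = []
	    applies = False
	    for line in text.splitlines():
	        line = line.split('#', 1)[0].strip()
	        if not line:
	            continue
	        field, _, value = line.partition(':')
	        field = field.strip().lower()
	        value = value.strip()
	        if field == 'user-agent':
	            applies = (value == '*')
	        elif field == 'disallow' and applies and value:
	            prefixes.append(value)
	    return prefixes

	def allowed(selector, prefixes):
	    """True if no Disallow prefix matches the raw selector."""
	    return not any(selector.startswith(p) for p in prefixes)

	rules = parse_robots("User-agent: *\nDisallow: BlackHole:\n")
	print(allowed("BlackHole:2024/06/01", rules))  # False -> blocked
	print(allowed("Phlog:2024/06/01", rules))      # True  -> crawled
	print(allowed("/robots.txt", rules))           # True  -> crawled

  Under that assumption the BlackHole: rule above would do what I want, but I
don't know whether that is how eomyidae (or any other gopher crawler)
actually interprets robots.txt.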

  -spc (Who doesn't have a conventional gopherhole ... )

[1]	The only two are:

		/robots.txt
		/caps.txt

[2]	My robots.txt file:

		User-agent: *
		Disallow:

