
Re: Download a whole gopherhole using wget/curl?



It was thus said that the Great kiwidevelops once stated:
> Hi everyone,
> 
> I want to archive as many gopherholes as I can, just in case any of them
> one day shut down or their server stops running and would like to know how
> I can download a gopherhole recursively. 

  As I'm wont to do on the Gemini protocol mailing list, I often create
server content that makes a point [1].  I've done the same here, to show
that there are traps for the unwary.  If you attempt to crawl this link
[2]:

	gopher://gopher.conman.org/1BlackHole:

You'll enter a space with an infinite number of pages.  That is, until
1) your system runs out of space, or (most likely) 2) the client errors out
because it can't handle selectors beyond a certain size [3].  This is just
one example of dynamic content generation.  I don't think there's much of
this in gopher (excluding search output) but it does exist. [4]
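
  For anyone writing their own crawler, here's a rough sketch of the guards
that keep you out of a trap like the one above: bound the selector length,
cap the total page count, and never fetch the same selector twice.  (A
sketch only, assuming Python 3 and nothing beyond the standard library; the
host and the limits are placeholders, not a finished archiving tool.)

	import socket

	HOST, PORT   = "gopher.example.org", 70   # placeholder host
	MAX_SELECTOR = 255                        # refuse suspiciously long selectors
	MAX_PAGES    = 1000                       # hard cap on pages fetched

	def fetch(selector):
	    # One gopher transaction: send selector + CRLF, read until close.
	    with socket.create_connection((HOST, PORT), timeout=10) as s:
	        s.sendall(selector.encode("latin-1") + b"\r\n")
	        data = b""
	        while (chunk := s.recv(4096)):
	            data += chunk
	        return data

	def crawl(start=("1", "")):
	    seen, queue = set(), [start]
	    while queue and len(seen) < MAX_PAGES:
	        itype, sel = queue.pop()
	        if sel in seen or len(sel) > MAX_SELECTOR:
	            continue                      # the length guard
	        seen.add(sel)
	        body = fetch(sel)                 # a real archiver saves this to disk
	        if itype != "1":                  # only menus list further links
	            continue
	        for line in body.decode("latin-1", "replace").splitlines():
	            # menu line: Xdisplay<TAB>selector<TAB>host<TAB>port
	            parts = line.split("\t")
	            if len(parts) >= 4 and parts[2] == HOST and parts[0][:1] in "01":
	                queue.append((parts[0][:1], parts[1]))
	    return seen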

  And as some others have pointed out, some gopherholes are rather large in
size.

> Does anyone know how to properly back up a whole gopherhole? Thank you!

  Ask the site owner politely for a copy of the content?

  -spc

[1]	Or tortures client programs, take your pick.

[2]	If your client attempts to get "/BlackHole:", it's doing things
	wrong.

[3]	And these selectors grow ... up to 16,400 or so bytes long.  RFC-1436 says
	nothing about the length of selectors.

[4]	My site:

		gopher://gopher.conman.org/

	is nearly entirely dynamically generated.  For instance:

		gopher://gopher.conman.org/1Bible:
		gopher://gopher.conman.org/0Bible:Genesis
		gopher://gopher.conman.org/0Bible:Genesis.1
		gopher://gopher.conman.org/0Bible:Genesis.1:1
		gopher://gopher.conman.org/0Bible:Genesis.1-3
		gopher://gopher.conman.org/0Bible:Genesis.1:1-3:24
		gopher://gopher.conman.org/0Bible:Genesis.2:5-17

	All of these are valid pages.  And given I have the entire King
	James Bible here, that's a ton of potential pages.
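
	To make the URL-to-request mapping concrete (a sketch, assuming
	Python 3 and the standard library): the character right after the
	"/" in those URLs is the item type ("0" for a text file, "1" for a
	menu), and only the rest is sent to the server as the selector.

		import socket

		# gopher://gopher.conman.org/0Bible:Genesis.1:1 -> item type "0",
		# selector "Bible:Genesis.1:1"; the type is never sent on the wire.
		with socket.create_connection(("gopher.conman.org", 70), timeout=10) as s:
		    s.sendall(b"Bible:Genesis.1:1\r\n")
		    data = b""
		    while (chunk := s.recv(4096)):
		        data += chunk
		print(data.decode("latin-1", "replace"))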

