Re: [gopher] Spidering the gopherspace
On 2010-04-14 17:41, John Goerzen wrote:
> You may be interested in
> http://git.complete.org/z-old/gopherbot/
> though my goal was somewhat different; I aimed to archive all of
> gopherspace, not search it.
I'm actually doing both, searching and caching (not really archiving).
After giving it some thought, I figured that indexing offline content is
easier - I can reindex as many times as I want without redownloading.
I'd take a look at your code, but it's in Haskell and uses a database...
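
A minimal sketch of the cache-then-reindex idea described above, assuming Python and a plain directory of files as the cache. All names here (gopher_fetch, cache_path, cached_fetch) are hypothetical and come from neither gopherbot nor the search engine discussed in this thread. It only shows that an item is downloaded once and every later read, such as a reindexing pass, is served from disk.

```python
import hashlib
import socket
from pathlib import Path


def gopher_fetch(host, port, selector, timeout=30):
    """Fetch one item over the network: send the selector, read until close."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(selector.encode("utf-8") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def cache_path(cache_dir, host, port, selector):
    """Map (host, port, selector) to a file name (assumed layout: one file per item).

    Hashing the key avoids selectors that would make unsafe file paths.
    """
    key = f"{host}:{port}\t{selector}".encode("utf-8")
    return Path(cache_dir) / hashlib.sha256(key).hexdigest()


def cached_fetch(host, port, selector, cache_dir, fetcher=gopher_fetch):
    """Return the item from the cache, downloading it only on a cache miss."""
    path = cache_path(cache_dir, host, port, selector)
    if path.exists():
        return path.read_bytes()
    body = fetcher(host, port, selector)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(body)
    return body
```

An indexer built on this would call cached_fetch for every item, so changing the indexing code and running it again costs no network traffic. The sketch never expires entries; a real cache would need some refresh policy.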
Actually, the reason I'm doing the search engine is that I've ALWAYS
wanted to do my own search engine. And spidering the gopherspace is
actually fun, unlike spidering the web...
> I believe it was actually successful; I
> burned some DVDs and shared them with a few people on this mailing list
> at the time. IIRC it was about 5 DVDs.
Thanks for that info - I was wondering how much disk space I
should allocate for the cache...
- Kim