
Re: please use robots.txt for your gopher apps



Cameron Kaiser <spectre@floodgap.com> wrote:
> ..
>
> I still don't have a good idea what to do about hosts with user menus
> and scripts.

There aren't really that many multiuser sites currently, so it would
be an easyish hack to make a list of them and throw together an
internal meta robots.txt for those sites.  Alternatively, the meta
map could be incorporated into whatever community standard emerges.
I think you'd just need to see that something like MUSERS has been
set, then use that to dynamically create a hit list:

ex1) set MUSERS var for host sdf.org:

 MUSERS='/users/*' # user gopherspace URLs are of the form 'sdf.org/1/users/username/'

ex2) set MUSERS var for host grex.org:

 MUSERS='/~*'   # user gopherspace URLs are of the form 'grex.org/~username/'

 where '*' is limited to what's currently acceptable in gopher URLs
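
For example, the crawler could treat MUSERS as a plain glob and skip
anything that matches.  Just a rough sketch assuming a shell-based
crawler; the should_crawl name and the sample selectors are made up:

 #!/bin/sh
 # Sketch: skip selectors on a multiuser host when they match MUSERS.
 MUSERS='/users/*'             # pattern for sdf.org, as in ex1 above

 should_crawl() {
     # case does glob matching, so MUSERS can be used as-is
     case "$1" in
         $MUSERS) return 1 ;;  # user gopherspace, leave it alone
         *)       return 0 ;;  # everything else is fair game
     esac
 }

 for sel in '/users/jdoe/phlog' '/software/gopher'; do
     if should_crawl "$sel"; then
         echo "crawl: $sel"
     else
         echo "skip:  $sel"
     fi
 done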

A static robots.txt requires the sysadmin to keep up with any
changes to users with active gopher spaces.  Most of these sites
are now running Gophernicus, so I suppose its author could
incorporate something automated once a community standard emerges.
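
Automated generation wouldn't have to be fancy either.  Something
like this might do it (a sketch only; it assumes user gopherspace
lives under /home/<user>/gopher and is served as /users/<user>/,
which will obviously vary from site to site):

 #!/bin/sh
 # Sketch: emit a robots.txt disallowing every user gopher directory.
 # The /home/*/gopher layout and the /users/ selector prefix are
 # assumptions here, not what Gophernicus actually does.
 {
     echo "User-agent: *"
     for dir in /home/*/gopher; do
         [ -d "$dir" ] || continue
         user=$(basename "$(dirname "$dir")")
         echo "Disallow: /users/$user/"
     done
 } > robots.txt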

I guess one could also simply check whatever directories are allowed
to be crawled by the root robots.txt for additional robots.txt files.
The files are pretty small, and the little extra bandwidth would
invariably cost much less than the alternative of crawling spaces
that asked to be skipped.
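
Fetching those is cheap enough to sketch in a couple of lines
(assuming curl was built with gopher support; the host and the
allowed directories here are just examples):

 #!/bin/sh
 # Sketch: look for a robots.txt inside each directory that the root
 # robots.txt allows.  The leading 0 in the URL is the gopher item
 # type for a text file.
 HOST='sdf.org'
 ALLOWED='/software /phlogs'
 for dir in $ALLOWED; do
     extra=$(curl -s "gopher://$HOST/0$dir/robots.txt")
     # a real crawler would need to tell an error menu from real rules
     [ -n "$extra" ] && echo "found additional rules under $dir"
 done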

Jeff W.

