
Questions about URLs for Gopher search items



Greetings Gophernauts,

I have two questions regarding the correct way to format URLs for Gopher
search engines when the search term is to be specified.

The first is: to what extent is RFC 4266 ("The gopher URI Scheme") being
adhered to by the modern Gopher community?  Are we trying to follow it,
or is it being actively ignored in the same way that Gopher+ is being
actively ignored?

RFC 4266 says very clearly in section 2.2 that:

> If the URL refers to a search to be submitted to a Gopher search
> engine, the selector is followed by an encoded tab (%09) and the
> search string. 

This is consistent with the earlier syntax from section 2.1:

> A Gopher URL takes the form:
>
> gopher://<host>:<port>/<gopher-path>
>
> where <gopher-path> is one of:
>
> <gophertype><selector>
> <gophertype><selector>%09<search>
> <gophertype><selector>%09<search>%09<gopher+_string>

However, if I use Lynx to navigate to the Veronica 2 search engine at
Floodgap and do a search for "cheese", then use the = button to get Lynx
to show me the URL of my current location, it tells me I am at:

gopher://gopher.floodgap.com/7/v2/vs?cheese

Note the use of ? instead of %09 to separate the search term from the
selector.

I tried to see what other clients do here to see if there was a rough
consensus, but was surprised to find that very few clients actually
provide a way to get the URL of the current Gopher item!  VF-1 does, but
it doesn't include search terms at all, which is something I'd like to
fix.

Most clients of course do include a way to navigate to a URL, so I was
able to test visiting Veronica result pages using both kinds of syntax.
It seems that most clients work with the Lynx-style URL where the search
term is separated from the selector by a ?, and fail with %09 URLs.
It's not clear to me whether clients are replacing the ? with an
unencoded tab when they send the request to Veronica, or whether Veronica
recognises such requests as an alternative syntax and treats them as
equivalent to RFC 1436 compliant requests using a tab separator.
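
For illustration, the first possibility (the client doing the translation) could look something like the sketch below. This is only my guess at the behaviour, not something I have confirmed in any client's source; request_bytes is a made-up name, and its argument is the URL path with the leading item type character already removed.

```python
from urllib.parse import unquote

def request_bytes(path_after_type):
    """Build the line a client would send for a search URL.

    Hypothetical helper: accepts either the RFC 4266 "%09" separator or the
    Lynx-style "?" separator and emits an RFC 1436 style request, i.e.
    selector, TAB, search string, CR LF.
    """
    if "%09" in path_after_type:
        selector, _, search = path_after_type.partition("%09")
    elif "?" in path_after_type:
        # Assumption: a literal "?" always introduces a search term.
        # A selector that legitimately contains "?" would be misread here.
        selector, _, search = path_after_type.partition("?")
    else:
        selector, search = path_after_type, None
    line = unquote(selector)
    if search is not None:
        line += "\t" + unquote(search)
    return line.encode("utf-8") + b"\r\n"
```

With this, both "/v2/vs?cheese" and "/v2/vs%09cheese" produce the same bytes on the wire, so the server never sees the difference between the two URL styles.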

My second question is: what is the correct item type to use for a URL
which includes a search term?  As far as I can see, RFC 4266 is silent
on this matter.  Lynx uses item type 7.  This might initially seem
obviously correct, but I think it's not actually so clear-cut.

The only purpose of the item type is to provide a hint to the client as
to how it should interpret the response it gets back from the server
(the server of course is not sent any item type at all).  When a Gopher
search engine is provided with a tab-separated query, it is supposed to
return "the equivalent of a directory listing for documents matching
the search criteria" (RFC 1436 section 3.7).  Thus one could actually
make a case for saying URLs with search queries included in them should
have item type 1.

Assuming the search query is represented with the %09 separator as per
RFC 4266, and the client is smart enough to undo URL encoding before
sending the selector to the server, then using item type 1 would mean
that such URLs "just work" without any special consideration on the
part of the client.  In contrast, using item type 7 means that a client
asked to visit such a URL needs to examine the path for either a ? or a
%09 separator in order to decide how to proceed: prompt the user to
input a search term, or use the term already included in the URL.  The
item type 1 approach seems a bit more elegant to me.
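
To show the difference in client logic, here is a minimal sketch. It assumes the client has already stripped the scheme, host and port and is looking at just the gopher-path; plan_visit and the "send"/"prompt" labels are hypothetical names of mine, not from any real client.

```python
from urllib.parse import unquote

def plan_visit(gopher_path):
    """Decide what a client does with a gopher-path such as "7/v2/vs?cheese".

    Returns ("send", request_line) or ("prompt", selector).
    Hypothetical sketch for illustration only.
    """
    item_type, rest = gopher_path[:1], gopher_path[1:]
    if item_type == "7":
        # Type 7 needs special handling: is a search term already embedded?
        if "%09" in rest:
            return ("send", unquote(rest))
        if "?" in rest:
            selector, _, term = rest.partition("?")
            return ("send", unquote(selector) + "\t" + unquote(term))
        return ("prompt", unquote(rest))
    # Any other type, including 1: undo the URL encoding and send as-is.
    # A %09 in the path simply becomes the tab the server expects.
    return ("send", unquote(rest))
```

Here "1/v2/vs%09cheese" falls straight through the generic branch, while the type 7 branch has to inspect the path for both separators before it knows whether to prompt.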

Looking forward to hearing what others think about this apparently
little-explored corner of Gopher!

Cheers,
Solderpunk

