
Re: Questions about URLs for Gopher search items

> From my point of view, there are three separate questions:
> 1. If the URL includes %09, should we send a selector+\t+search string to the server?
> 2. In the same conditions, how should we treat the returned data?
> 3. Should I do anything with a ?
> Once separate, it's more clear that the right thing to do is to send whatever is in the URL, and to treat the return value as a directory if the gophertype is either 1 or 7. And the "?" isn't part of the gopher syntax at all.

I agree that the answers to 2 and 3 are very clear.  RFC 1436 leaves
no doubt that the response to a request for a type 7 item is a
directory, i.e. the same thing as type 1.  I don't think this has ever
been in doubt and seemingly all clients already do this.  And no special
syntax regarding ? is mentioned in RFC 1436 - in fact there are only two
?s in the whole RFC, and both times they are part of standard natural
language question marking.  Lynx gave me the wrong impression that maybe
the community had adopted ? as a notation for including search terms in
URLs instead of %09, but I was mistaken and this use of ? is actually to
make use of the CGI implementations of various servers.  Clients should
pay no special attention to it.
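
One practical consequence, as a rough sketch: a general-purpose URL parser
will still split a gopher URL at the ?, so a client that wants to treat ?
as an ordinary character probably has to cut the URL up by hand.  The
helper name gopher_path below is made up for illustration, and the example
assumes Python's standard urllib.parse behaviour.

```python
from urllib.parse import urlsplit


def gopher_path(url: str) -> str:
    """Return everything after 'host[:port]/' untouched, including any '?'."""
    _scheme, _, rest = url.partition("://")
    _hostport, _, path = rest.partition("/")
    return path


url = "gopher://gopher.floodgap.com/1/v2/vs?cheese"

# The generic parser treats '?' as the start of a query component ...
generic = urlsplit(url)
print(generic.path, generic.query)

# ... whereas splitting by hand keeps '?cheese' as part of the gopher path.
print(gopher_path(url))
```

The first character of the returned path is the item type and the rest is
the (still percent-encoded) selector.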

Regarding question 1, even though I previously argued that using item
type 1 for URLs for search engines when a search term is included made a
lot of sense (arguing mostly from a perspective of client implementation
complexity), I have done more careful reading and thinking and now I
have changed my mind on this.

RFC 1436 is very clear that selectors cannot include tabs:

> Selector  ::= {UNASCII}.
> UNASCII   ::= ASCII - [Tab CR-LF NUL].
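
As a minimal sketch of what that grammar means for a client, here is a
check that a string is a legal selector.  The function name is
hypothetical, and I am assuming that "CR-LF" in the grammar excludes CR
and LF individually, which the RFC does not spell out.

```python
# Characters excluded from UNASCII by the grammar quoted above.
# Assumption: CR and LF are each forbidden on their own, not only as a pair.
FORBIDDEN = {"\t", "\r", "\n", "\x00"}


def is_valid_selector(selector: str) -> bool:
    """Return True if selector is pure ASCII with no Tab, CR, LF or NUL."""
    return selector.isascii() and not any(ch in FORBIDDEN for ch in selector)


print(is_valid_selector("/v2/vs"))           # plain selector
print(is_valid_selector("/v2/vs?cheese"))    # '?' is an ordinary character
print(is_valid_selector("/v2/vs\tcheese"))   # contains a tab
```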

If a Gopher client is sending a tab in its request to a server, then it
is by definition sending a selector *plus something else*.  This is
legal in the context of item type 7:

> Full-Text Search Transaction (Type 7 item)
> ...snip...
> C: Opens Connection.
> C: Sends Selector String, Tab, Search String.
> S: Sends Menu Entity.

But not anywhere else, and in particular not for item type 1, where
*only* a selector is sent:

> Menu Transaction  (Type 1 item)
> C: Opens Connection
> S: Accepts Connection
> C: Sends Selector String
> S: Sends Menu Entity

This seems to me to mean that a URL with a %09 in the path but any item
type other than 7 should, strictly, be considered invalid, as it maps to
a gopher transaction not covered by RFC 1436.  Of course, it's obvious
"how to do" such a transaction, and some clients may choose to support
them as a Postel's Law practice, but if we are being strict such URLs
should not be permitted.  RFC 4266 does not state any restrictions on
item type for URLs with search terms, but I think it goes without saying
that where RFCs 1436 and 4266 are in conflict, 1436 wins.
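
To make the strict reading concrete, here is a rough sketch of how a
client might turn a URL into a request line under it.  The function name
gopher_request is invented for this example, Gopher+ strings (a second
%09) are ignored, and leaving out the tab when the search string is empty
is a judgment call rather than something either RFC requires.

```python
from urllib.parse import unquote


def gopher_request(url: str) -> bytes:
    """Build the request line for a gopher URL under the strict reading:
    a %09 in the path is only accepted for item type 7."""
    prefix = "gopher://"
    if not url.startswith(prefix):
        raise ValueError("not a gopher URL")
    # Split by hand so that '?' stays an ordinary character.
    _hostport, _, path = url[len(prefix):].partition("/")
    # RFC 4266: an empty path means item type 1 with an empty selector.
    item_type, raw = (path[0], path[1:]) if path else ("1", "")
    selector, tab, search = raw.partition("%09")
    if tab and item_type != "7":
        raise ValueError(f"search string not allowed for item type {item_type!r}")
    selector, search = unquote(selector), unquote(search)
    # Judgment call: an empty search string is sent as a bare selector.
    line = selector + ("\t" + search if search else "")
    return line.encode("utf-8") + b"\r\n"


print(gopher_request("gopher://gopher.floodgap.com/7/v2/vs%09cheese?head"))
print(gopher_request("gopher://gopher.floodgap.com/1/v2/vs?cheese"))
```

A more lenient client could catch the ValueError and send the tab anyway,
which would be the Postel's Law behaviour mentioned above.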

Does anybody see a flaw in this reasoning or find anything in an RFC
which contradicts it?

> Some tests:
> A URL like gopher://gopher.floodgap.com/1/v2/vs%09cheese?head (note the ?) will do a search for cheese?head. 

Based on the above reasoning, this URL is invalid, but for the sake of
what I imagine you were actually aiming to test with this one, yes,
searching for "cheese?head" would be correct here as the ? has no
special meaning.

> A URL like gopher://gopher.floodgap.com/1/v2/vs?cheese is not a search from my POV; it's just a plain gopher request for /v2/vs?cheese . If the server wants to treat a selector like /v2/vs?cheese as a search, it's free to.

I completely agree.

> A URL like gopher://gopher.floodgap.com/0/v2/vs%09cheese  (note the 0 instead of a 1) will do the search but will treat the results like a file instead of like a directory.

Again, I would now argue that this URL is invalid since for a type 0
transaction the client is supposed to send only a selector and a
selector cannot contain a tab.

> Lastly, a url like gopher://gopher.floodgap.com/1/v2/vs%09 (note that there's no actual search string) will send up to the server the /v2/vs but will not include a tab.

Hmm.  I guess this URL (if given item type 7) defines a search for an empty
string.  I don't see that RFC 1436 or 4266 take a clear stance on
whether this is permitted.  But not sending a tab seems like a pretty
sensible course of action.

