
Re: Questions about URLs for Gopher search items



On Mon, Oct 28, 2019 at 03:11:16PM -0700, Cameron Kaiser wrote:
> > > A URL like gopher://gopher.floodgap.com/0/v2/vs%09cheese  (note the 0
> > > instead of a 1) will do the search but will treat the results like a file
> > > instead of like a directory. 
> > 
> > Again, I would now argue that this URL is invalid since for a type 0
> > transaction the client is supposed to send only a selector and a
> > selector cannot contain a tab.
> 
> I'm not sure I'd agree with this. A CGI/mole/etc. on the other end may well
> want arguments, especially for documents that update dynamically. Is this
> in RFC 1436?

Well, not explicitly: RFC 1436 makes no mention of URLs or URIs, of
course, because it predates the concepts.  But it's a fairly short chain
of argument.  RFC 4266 says this in section 2.1:

> A Gopher URL takes the form:
> 
> gopher://<host>:<port>/<gopher-path>
> 
> where <gopher-path> is one of:
> 
> <gophertype><selector>
> <gophertype><selector>%09<search>
> <gophertype><selector>%09<search>%09<gopher+_string>

So there is no question that the URL
gopher://gopher.floodgap.com/0/v2/vs%09cheese points to an item of type
0.  And RFC 1436 is pretty clear about how you access an item of type 0
(from the appendix):

> TextFile Transaction (Type 0 item)
> 
> C: Opens Connection.
> S: Accepts connection
> C: Sends Selector String.
> S: Sends TextFile Entity.
> 
> Connection is closed by either client or server (typically server).

i.e. the only thing the client sends is a "Selector String".  And RFC
1436 is pretty clear about what a Selector is:

> Selector  ::= {UNASCII}.
> UNASCII   ::= ASCII - [Tab CR-LF NUL].
> Tab       ::= ASCII Tab character.

i.e. a selector does not include a Tab.

So, if a client is sending a Tab anywhere in its request, then by
definition it is not sending *only* a Selector, which means that by
definition it is not performing a "TextFile Transaction", which means
that by common sense it had better not be doing this in an effort to
access an item whose URL encodes an item type of 0.  The only consistent
conclusion seems to be that URLs with item codes other than 7 with a %09
in the path are invalid.
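
To make that conclusion concrete, here is a minimal Python sketch of the
check it implies.  The function names are made up for illustration, the
parsing is deliberately simplistic (it ignores the port and any Gopher+
string), and the only rule it encodes is the one argued above: a decoded
Tab is acceptable only when the item type is 7.

```python
from urllib.parse import unquote

def split_gopher_path(url):
    """Return (item_type, decoded_rest) for a gopher:// URL.

    Hypothetical helper: a simplified parse, not a full RFC 4266 parser.
    """
    prefix = "gopher://"
    if not url.lower().startswith(prefix):
        raise ValueError("not a gopher URL")
    # Everything after the first "/" following the host is the gopher-path.
    _, _, gopher_path = url[len(prefix):].partition("/")
    if not gopher_path:
        # RFC 4266: an empty gopher-path defaults to item type 1.
        return "1", ""
    # The first character is the item type; the rest is percent-encoded.
    return gopher_path[0], unquote(gopher_path[1:])

def strictly_valid(url):
    """Apply the strict reading: a Tab is only allowed for type 7 items."""
    item_type, decoded = split_gopher_path(url)
    return item_type == "7" or "\t" not in decoded
```

Under this sketch, strictly_valid() rejects
gopher://gopher.floodgap.com/0/v2/vs%09cheese but accepts the same path
with a leading 7 in place of the 0.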

Of course, as you say, it's clear that not following this strictly could
be useful in practice, and I suspect that there are probably servers
and/or clients in use where dynamic type 0 items of exactly the kind you
described work.  This just seems to me to be what the strictest possible
reading of the RFCs implies.  I don't necessarily think it's a "nice"
result: a principle of "undo URL encoding and send what you get to the
server", as espoused earlier by Comrade Lohmann, is simpler and more
elegant.  If we want to adhere very strictly to RFCs, it seems that
dynamic type 0 items need to rely on the use of ? in the selector to
specify arguments.
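
For comparison, here is a rough Python sketch of what a client following
the "undo URL encoding and send what you get" principle would put on the
wire.  The function name is hypothetical, and the second selector in the
usage note (/v2/vs?cheese) is an invented example of the ?-style
alternative, not something the server is known to support.

```python
from urllib.parse import unquote_to_bytes

def request_line(gopher_path):
    """Bytes a client would send for a gopher-path such as '0/v2/vs%09cheese'.

    Illustrative only: drop the leading item type character, undo the
    percent-encoding, and terminate the request with CR-LF.
    """
    encoded_rest = gopher_path[1:]
    return unquote_to_bytes(encoded_rest) + b"\r\n"

def sends_only_selector(gopher_path):
    """True if the request contains no Tab, i.e. is a bare RFC 1436 Selector."""
    return b"\t" not in request_line(gopher_path)
```

With these, request_line("0/v2/vs%09cheese") contains a Tab and so is not
a bare Selector, whereas request_line("0/v2/vs?cheese") carries its
argument inside the selector and stays within the TextFile Transaction
as written.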

Cheers,
Solderpunk

