[gopher] gopher++ (gopher1) protocol




I think I promised to write down my thoughts about the gopher++ extensions to the original gopher0 protocol. I was going to implement it first, make sure it works, and only then document everything and send the explanations to this list... but I'm in the middle of a major rewrite of kgopherd, so the implementation has to wait for a week or so. Besides, even if I got kgopherd up and running with gopher++, I'd still be missing a client (which I'll write, eventually).

So here goes. These are mostly untested thoughts (that I WILL try out as soon as possible). This is written in a somewhat RFC-like official format... but only somewhat.

===============================

Gopher++ protocol
Kim Holviala <kim@holviala.com>


== A primer to the original gopher0 ==

The original gopher0 protocol from rfc1436 is as follows (C for client traffic, S for server replies):

C: <opens the connection>
C: /path/to/resource
S: <dumps the resource to the client>
S: <closes the connection>

The "resource" above can be either a file, or a specially formatted gopher menu (see rfc1436). Menus are just simple text files which describe the file resources that can be downloaded. Menus also tell the client what type of a file the resource is (text, image etc).
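
As a minimal sketch of the exchange above (not taken from any existing client), the function below sends a selector on an already connected socket and reads until the server closes the connection. The name gopher0_fetch is made up for this illustration; the CRLF after the selector follows rfc1436.

```python
import socket

def gopher0_fetch(sock, selector):
    """Send a gopher0 selector on a connected socket, return everything the server dumps."""
    # gopher0 request: the selector followed by CRLF, nothing else.
    sock.sendall(selector.encode("ascii") + b"\r\n")
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:  # empty read means the server closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)
```

A real client would pass in something like socket.create_connection((hostname, 70)), 70 being the usual gopher port. Note that nothing in the returned bytes says whether they are a menu or a file, which is exactly problem 2 in the list below.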

There are a few major problems with that simplistic approach, mainly:

1) the server doesn't know if the client wants a file, or a menu
2) the client doesn't know if it's getting a file, or a menu
3) the client has no knowledge of the charset of the file/menu
4) the client doesn't know what type of a file it's getting

Problems 1 & 2 don't seem that important at first, but consider this: the server has a menu and a file. Both are (accidentally) removed from the server but the clients keep requesting them. What kind of an error page will the server generate? A menu, or just plain text? The server cannot know since it doesn't know what the client expects.

Problem number 3 is a common one across all protocols. Server uses charset A while the client wants charset B. That's all fine, except that gopher0 cannot transfer the encoding information between the server and the client.

Problem number 4 is an interesting one. If the resource (file) is tagged by the server as type g, the client can be fairly certain that it's getting a GIF image. That certainty breaks down when the image has been removed and the server sends its error message in menu format instead. If the resource is tagged as I (generic image), the server can send out pretty much anything and the client has no idea what it's getting.


== How HTTP solves these problems ==

Gopher's rival, HTTP, has solved these problems, in a way. In HTTP the client asks for a resource, and the server sends back a description of the data along with the actual data. It is then the client's responsibility to figure out whether the data it got back has anything to do with what it requested.

This is really simple for the server, as it can dump pretty much anything to the client as long as it's described (with Content-Type et al.). But it's a pain for the client, which needs to understand every file format and charset in the world (as it has no idea what it's getting). Hence the size of modern web browsers.


== gopher++ (gopher1) protocol ==

To solve these problems with gopher I'm suggesting the following extensions which I call gopher++ (or gopher1).

A gopher++ transaction goes like this (again, C for client and S for server):

1 C: <opens the connection to gopher.holviala.com>
2 C: /path/to/resource
3 C: Host: gopher.holviala.com
4 C: Accept-Charset: UTF-8
5 C: Accept: text/plain
6 C: Referer: gopher://gopher.holviala.com/t/path/to/menu
7 C: User-Agent: gopher++/0.1
8 S: <dumps the resource to the client>
9 S: <closes the connection>

Lines 1 & 2 are identical to the original gopher0 protocol, and so are lines 8 & 9. This makes gopher++ 100% backwards- and forwards-compatible with gopher0. A gopher0 server never reads the additional headers the gopher++ client sends. A gopher0 client connecting to a gopher++ server gets back the resource just as it would from an older gopher0 server.
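
A rough sketch of how a client might assemble lines 2 to 7 of the transaction above. The function name build_request is invented for this example, and terminating every header line with CRLF (as gopher0 does for the selector) is an assumption, since line endings for the headers are not defined in this draft.

```python
def build_request(selector, host, charset=None, accept=None,
                  referer=None, user_agent=None):
    """Build the bytes a gopher++ client sends after connecting."""
    # First line is the plain gopher0 selector, so old servers still understand it.
    lines = [selector, "Host: " + host]  # Host is the one mandatory header
    if charset is not None:
        lines.append("Accept-Charset: " + charset)
    if accept is not None:
        lines.append("Accept: " + accept)
    if referer is not None:
        lines.append("Referer: " + referer)
    if user_agent is not None:
        lines.append("User-Agent: " + user_agent)
    # Assumption: CRLF after every line, including the last one.
    return ("\r\n".join(lines) + "\r\n").encode("ascii")
```

Because the selector always comes first and on its own line, a gopher0 server that stops reading after the first CRLF sees an ordinary gopher0 request.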

If a gopher++ client is talking to a gopher++ server then the extra headers come into effect.

The Host: header tells the server the original hostname the client was connecting to. This header allows the server to serve multiple hostnames under one IP address (virtual hosting). A gopher++ client MUST always send the Host header.

The Accept-Charset: header tells the server what charset the client can handle. If a client sends the Accept-Charset header, the server MUST send its reply using the charset specified. If the resource has no notion of charset, the server can ignore this header. If the client does not send the Accept-Charset header, or if the server doesn't recognize the charset the client requested, the server MUST serve the resource using the 7bit US-ASCII charset (if applicable).
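
The fallback rule can be sketched as below. This assumes a server that knows only US-ASCII, Latin-1 and UTF-8, and it uses Python's codecs module to normalise the spelling of the requested name; choose_charset and the headers dictionary are illustrative names, not part of any existing server.

```python
import codecs

# Python's canonical codec names for US-ASCII, Latin-1 and UTF-8.
SUPPORTED = {"ascii", "iso8859-1", "utf-8"}
FALLBACK = "ascii"

def choose_charset(headers):
    """Return the codec name to use for a textual reply."""
    requested = headers.get("Accept-Charset")
    if requested is None:
        return FALLBACK  # header missing: 7bit US-ASCII
    try:
        name = codecs.lookup(requested.strip()).name
    except LookupError:
        return FALLBACK  # charset not recognized: 7bit US-ASCII
    # Known to Python but not offered by this hypothetical server: fall back too.
    return name if name in SUPPORTED else FALLBACK
```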

The Accept: header tells the server which type of content the client expects. If the client sends the Accept header, the server MUST send its reply in the format the client requested. If the client does not send the Accept header, if the server doesn't recognize the content-type the client requested, or if the client requests a content-type of "application/octet-stream", the server must serve the resource in its original format, or in the format it thinks the client expects.

The optional Referer: header tells the server which URL the client came from. This header is purely for the server's benefit, and the client can refuse to send it for privacy reasons.

The User-Agent: header contains the client application name (and possibly version). Clients SHOULD send this header as it helps servers track down misbehaving clients.
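
On the server side, splitting a request into the selector and its headers might look like the sketch below. How the server knows that the client has finished sending headers is not defined in this draft, so the sketch simply assumes the whole request is already in a buffer and that lines end in CRLF. parse_request is a hypothetical helper.

```python
def parse_request(raw):
    """Split raw request bytes into (selector, headers)."""
    lines = raw.decode("ascii", errors="replace").split("\r\n")
    selector = lines[0]  # identical to a gopher0 request line
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep:
            continue  # blank or malformed line: skip it
        # Normalise header names so "accept-charset" and "Accept-Charset" match.
        headers[name.strip().title()] = value.strip()
    return selector, headers
```

A gopher0 client sends only the selector, so it yields an empty header dictionary and the server can fall back to plain gopher0 behaviour.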


== Accept-headers and transcoding ==

The Accept-Charset and Accept headers in gopher++ require some more explanation. As said above, if those headers exist, the server MUST obey them. As the client cannot know what kind of data it's getting back from the server, it must rely on the server to send exactly what was requested.

For Accept-Charset, if the server does not have the resource in the correct charset the server MUST transcode the textual information to the charset the client requested (if applicable). This moves the burden of charset conversions from the client to the server. In gopher++ the client never has to transcode textual information from one charset to another.
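
Server-side charset transcoding can be as small as the sketch below. What to do with characters that do not exist in the target charset is not covered by this draft; replacing them with "?" is an assumption made here, and transcode is an illustrative name.

```python
def transcode(data, source_charset, target_charset):
    """Re-encode stored text from the charset on disk to the one the client asked for."""
    text = data.decode(source_charset)
    # Assumption: unrepresentable characters become "?" instead of aborting the reply.
    return text.encode(target_charset, errors="replace")
```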

For the Accept-header, if the server does not have the resource in the correct content-type, if at all possible the server MUST transcode the content to fit client requirements. This requirement makes clients small and fast as they do not have to carry support for all possible resource formats, nor do clients have to be recoded to recognize completely new formats.

The server MUST be able to offer all plain text information (text files and gopher menus) in the US-ASCII, Latin-1 and UTF-8 charsets. A client SHOULD NOT request any charset other than those three.

The server MUST be able to convert all image resources to GIF, PNG and JPEG formats. A client SHOULD NOT request any format other than those three.

The server SHOULD be able to convert PDF and PostScript resources to any of the above three image formats and to plain text. A client SHOULD NOT ask for anything else.

The server SHOULD be able to convert all audio resources to WAV, MP3 or OGG Vorbis. A client SHOULD NOT ask for any other format.

The server SHOULD be able to convert all video resources to either MPEG or OGG Theora. As video transcoding is CPU-intensive and video formats are a moving target, the server is not obligated to obey client requests for video formats. A client SHOULD NOT ask for anything other than MPEG or OGG Theora, or "application/octet-stream" if it wants the original video stream.


== gopher0 filetypes and request content-types ==

The table below maps the old gopher0 filetypes to their matching gopher++ content-types. Note that the video filetype is "v" instead of the commonly used ";".

For example, if a gopher0 menu specifies that a resource is of type "1", a client SHOULD NOT ask for anything other than application/gopher-menu. If the resource is of type "p", the server must be prepared to convert the PDF file to a static image or plain text. In all cases the client can ask for "application/octet-stream", in which case the server sends the resource as is.

gopher0  content-types
=======  =============
  0      text/plain
  1      application/gopher-menu
  7      application/gopher-menu
  9      application/octet-stream
  g      image/gif, image/png, image/jpeg
  h      text/html, text/plain
  I      image/gif, image/png, image/jpeg
  p      application/pdf, image/*, text/plain
  s      audio/wav, audio/mpeg, audio/ogg
  v      video/mpeg, video/ogg
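
One way a client could use the table: given the type character from a menu and the list of formats it can display, pick the value for its Accept header. This is only a sketch; pick_accept is a made-up name, and preferring the table's left-to-right order is an assumption, not something the table itself prescribes.

```python
# The table above, as a lookup from gopher0 type to acceptable content-types.
CONTENT_TYPES = {
    "0": ["text/plain"],
    "1": ["application/gopher-menu"],
    "7": ["application/gopher-menu"],
    "9": ["application/octet-stream"],
    "g": ["image/gif", "image/png", "image/jpeg"],
    "h": ["text/html", "text/plain"],
    "I": ["image/gif", "image/png", "image/jpeg"],
    "p": ["application/pdf", "image/*", "text/plain"],
    "s": ["audio/wav", "audio/mpeg", "audio/ogg"],
    "v": ["video/mpeg", "video/ogg"],
}

def pick_accept(item_type, supported):
    """Choose an Accept value for a menu item, given formats the client can show."""
    for candidate in CONTENT_TYPES.get(item_type, []):
        if candidate.endswith("/*"):
            prefix = candidate[:-1]  # "image/*" -> "image/"
            for mine in supported:
                if mine.startswith(prefix):
                    return mine
        elif candidate in supported:
            return candidate
    # Nothing matched: ask for the resource as is.
    return "application/octet-stream"
```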


== Examples ==

These examples lack the optional Referer: and User-Agent: headers for clarity.

Client requests the root menu:
C: <opens the connection>
C:
C: Host: foo.bar
C: Accept-Charset: UTF-8
C: Accept: application/gopher-menu
S: <sends the menu in UTF-8>
S: <closes the connection>

Client uses an external PDF reader:
C: <opens the connection>
C: /doc/document.pdf
C: Host: foo.bar
C: Accept-Charset: US-ASCII
C: Accept: application/octet-stream
S: <dumps the original pdf, will NOT transcode to US-ASCII>
S: <closes the connection>

Client wants to show the PDF in its own window as text:
C: <opens the connection>
C: /doc/document.pdf
C: Host: foo.bar
C: Accept-Charset: US-ASCII
C: Accept: text/plain
S: <converts the pdf to US-ASCII text and dumps it>
S: <closes the connection>

Client doesn't know how to handle jpeg images:
C: <opens the connection>
C: /images/image.jpeg
C: Host: foo.bar
C: Accept: image/gif
S: <converts the jpeg to gif and dumps the result to the client>
S: <closes the connection>

Client is just being stupid:
C: <opens the connection>
C: /doc/rfc1436.txt
C: Host: foo.bar
C: Accept-Charset: Latin-1
C: Accept: image/png
S: <dumps the original rfc and ignores the image conversion request>
S: <closes the connection>

Client is for blind people:
C: <opens the connection>
C: /doc/rfc1436.txt
C: Host: foo.bar
C: Accept-Charset: Latin-1
C: Accept: audio/mpeg
S: <may convert the document to audio and then dump the mp3>
S: <closes the connection>

Client is requesting a resource that doesn't exist:
C: <opens the connection>
C: /doc/
C: Host: foo.bar
C: Accept-Charset: UTF-8
C: Accept: application/gopher-menu
S: <sends an error message as a gopher0 menu>
S: <closes the connection>

Client is requesting an image that doesn't exist:
C: <opens the connection>
C: /images/missing.jpeg
C: Host: foo.bar
C: Accept: application/octet-stream
S: <either sends the error as a JPEG image, or sends nothing>
S: <closes the connection>
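
The last two examples show why the Accept header matters for errors: the server can finally format the error the way the client expects. A sketch of that decision follows. The type "3" error item and the trailing "." line come from rfc1436; the "error.host" and "1" filler fields, the choice to send a menu when there is no Accept header, and the name error_response are assumptions for this illustration.

```python
def error_response(accept, message):
    """Format an error so that it matches what the client asked for."""
    if accept is None or accept == "application/gopher-menu":
        # gopher0 menu line: type + text TAB selector TAB host TAB port,
        # followed by the "." line that ends a menu.
        menu = "3" + message + "\t\terror.host\t1\r\n.\r\n"
        return menu.encode("ascii", errors="replace")
    if accept == "text/plain":
        return (message + "\r\n").encode("ascii", errors="replace")
    # Images, audio, octet-stream and so on: send nothing rather than
    # text the client would try to decode as binary data.
    return b""
```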








_______________________________________________
Gopher-Project mailing list
Gopher-Project@lists.alioth.debian.org
http://lists.alioth.debian.org/mailman/listinfo/gopher-project



