
Re: [Nbd] tech documentation

On Thu, Oct 19, 2006 at 01:09:25PM +0200, Wouter Verhelst wrote:

| On Thu, Oct 19, 2006 at 12:15:42PM +0200, Wouter Verhelst wrote:
| > On Wed, Oct 18, 2006 at 02:45:20PM -0500, Phil Howard wrote:
| > > Also, are there any plans to implement NBD over SCTP?
| > 
| > Not at this time. Patches won't be refused, though, provided I
| > understand what the benefit would be (which I don't right now).
| Err, SCTP is that protocol that provides connectionless reliable
| networking, right? Okay, that would be a good idea, but it would still
| require patches from someone :)

No.  That's RDP.

SCTP is not connectionless.  They just call it an association because
the semantics are different from a TCP connection's.  But in the
important ways, an association still behaves like a connection.  For
example, when every process holding the file descriptors closes them
or otherwise goes away, the system network stack shuts the association
down.  That is something UDP cannot do.

SCTP enjoys the principal advantages of TCP, but has other advantages
as well.  Where TCP blends all sends/writes into a logically continuous
stream of octets, losing the boundaries between the chunks of data once
they are queued together, SCTP retains those boundaries.  They can be
ignored in applications where an octet stream is sufficient, but they
can also be used where the boundaries have semantic significance,
without layering some kind of record or chunk framing scheme on top of
a TCP stream.
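The boundary loss described above is easy to demonstrate without any
SCTP support at all: push two writes through a local stream socket (a
stand-in for TCP) and read once.  The sketch below is plain Python and
purely illustrative; the message contents are made up, and the
length-prefix framing shown is exactly the kind of scheme SCTP's
preserved message boundaries would make unnecessary.

```python
import socket
import struct

# A connected local stream pair stands in for a TCP connection.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Two separate sends...
a.sendall(b"NBD_READ")
a.sendall(b"NBD_WRITE")

# ...come back as one fused read: the stream keeps no boundaries.
fused = b.recv(4096)

# So TCP applications must layer on a framing scheme, e.g. length
# prefixes, to recover the boundaries by hand:
def frame(msg: bytes) -> bytes:
    return struct.pack("!I", len(msg)) + msg

def unframe(buf: bytes) -> list:
    out = []
    while buf:
        (n,) = struct.unpack("!I", buf[:4])
        out.append(buf[4:4 + n])
        buf = buf[4 + n:]
    return out

a.sendall(frame(b"NBD_READ") + frame(b"NBD_WRITE"))
msgs = unframe(b.recv(4096))
a.close()
b.close()

print(fused)  # b'NBD_READNBD_WRITE' -- boundary lost
print(msgs)   # [b'NBD_READ', b'NBD_WRITE'] -- recovered via framing
```

With SCTP's SOCK_SEQPACKET semantics, each send would arrive as its own
message and the framing layer would simply not be needed.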

An SCTP association can also be set up redundantly between multiple
IP addresses, allowing it to recover gracefully from loss of some
network paths.  This is a bit more complicated to use, though.

SCTP also supports multiple data streams.  The advantage is the stack
knows about these data streams and performs lost/misordered packet
recovery independently for each stream.  A good example of how this can
benefit an application is a multi-channel chat network like IRC.
Placing each channel of conversation in a separate SCTP stream means
that a lost packet carrying messages for a few channels will not hold
up the messages for all the other channels.  Since there is no ordering
relationship between channels, this avoids needless head-of-line
blocking.
This is one of the reasons TCP is a poor transport choice for things
like IP tunnels (it can and does work, but the entire tunnel pauses
when any packet is lost because the peer stacks are enforcing the
exact order of the whole octet stream).
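The head-of-line effect just described can be sketched without an SCTP
stack at all.  The toy reassembly below is plain Python and purely
illustrative (packet contents and the "lost" packet are invented); it
compares a single enforced global order, as TCP imposes, with
SCTP-style independent per-stream ordering.

```python
from collections import defaultdict

# Arrival order at the receiver.  Packet (stream 0, seq 0) is "lost"
# and arrives last as a retransmission, as in the IRC example above.
arrivals = [
    (0, 1, "chan0-msg1"),
    (1, 0, "chan1-msg0"),
    (1, 1, "chan1-msg1"),
    (0, 0, "chan0-msg0"),  # retransmission finally fills the gap
]

def run(packets, per_stream):
    """Deliver packets in order; return a snapshot of everything
    delivered so far after each arrival."""
    delivered, timeline = [], []
    if per_stream:
        nxt = defaultdict(int)   # next expected seq, per stream
        buf = defaultdict(dict)  # stream -> seq -> payload
    else:
        order = [(0, 0), (0, 1), (1, 0), (1, 1)]  # one global send order
        got = {}
    for sid, seq, payload in packets:
        if per_stream:
            buf[sid][seq] = payload
            while nxt[sid] in buf[sid]:  # release in-order data per stream
                delivered.append(buf[sid].pop(nxt[sid]))
                nxt[sid] += 1
        else:
            got[(sid, seq)] = payload
            while order and order[0] in got:  # everything waits on the gap
                delivered.append(got.pop(order.pop(0)))
        timeline.append(list(delivered))
    return timeline

tcp_like = run(arrivals, per_stream=False)
sctp_like = run(arrivals, per_stream=True)

# After three arrivals, before the retransmission:
print(tcp_like[2])   # [] -- all traffic stuck behind the lost packet
print(sctp_like[2])  # ['chan1-msg0', 'chan1-msg1'] -- stream 1 unaffected
```

Both variants deliver all four messages in the end; the difference is
that the per-stream receiver never held stream 1 hostage to a loss on
stream 0.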

It _MAY_ be of benefit to NBD (one of the reasons I wanted to know more
about the protocol).  It could allow multiple order-independent
requests to be queued from the client to the server without a single
lost packet slowing all of them down while the peer stacks recover it.
Order is retained within a stream, but is not enforced between streams.
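If NBD were ever carried over SCTP, one hypothetical way to exploit the
streams would be to spread order-independent requests across them, for
instance by hashing the request handle, so that each request/reply pair
stays ordered while unrelated requests never block one another.
Nothing below is part of the NBD protocol; the stream count, names, and
policy are invented purely for illustration.

```python
# Hypothetical stream-assignment policy; N_STREAMS would be whatever
# limit the association negotiated at setup time.
N_STREAMS = 8

def stream_for_request(handle: int) -> int:
    """Map an NBD request handle to an SCTP stream id.  Replies would
    use the same id, keeping each request/reply exchange ordered while
    exchanges on different streams proceed independently."""
    return handle % N_STREAMS

print(stream_for_request(0x1000))  # 0
print(stream_for_request(0x1003))  # 3
```

Any real design would also have to decide how many streams to
negotiate, which ties into the configuration question below.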

SCTP emerged from the need to carry voice telephony trunks over the IP
stack.  Wisely, it was created as a general-purpose layer rather than
being telephony-specific.  It simply incorporated the features that
telephony needed, which include minimizing latency and offering a more
reliable transport.

For those interested in more on SCTP:


| If anyone is going to do this, make sure it can be disabled at compile
| time as well as at run time. Not every operating system supports SCTP
| yet, and I imagine you may also want to have a server that supports both
| SCTP and TCP clients at the same time. Also, patches would have to be
| against svn trunk, not against whatever is the latest "stable" version
| out there.

By all means there should be an option, not only in compiling the tools,
but also in the kernel configuration.  There may also need to be an
option at kernel config time, compile time, and/or run time to place an
upper limit on the number of SCTP streams.  I cannot judge, yet, what
a reasonable default would be.

FYI, I am planning to look into implementing an NBD client directly into
the QEMU emulator in system mode.  This would allow an emulated system
to have one or more block devices mapped onto the network block devices
served from nbd-server anywhere without using NBD in the host kernel.
And the nbd-server can even be on the same machine running QEMU without
the deadlock issues (nbd-server running in the guest OS under QEMU would,
of course, still have those issues).  This is another reason I wanted to
see technical protocol details.

| Phil Howard KA9WGN       | http://linuxhomepage.com/      http://ham.org/ |
| (first name) at ipal.net | http://phil.ipal.org/   http://ka9wgn.ham.org/ |
