[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [Nbd] libnbd



Hi,

On 25-04-13 16:49, Goswin von Brederlow wrote:
> On Thu, Apr 25, 2013 at 02:51:21PM +0200, Wouter Verhelst wrote:
>> Hi Goswin,
[...]
>>>> I'm undecided whether the state machine should be primarily linked to
>>>> the socket or primarily linked to a request. In the former case,
>>>> select() would just need to ensure that a state machine is moved from
>>>> the "waiting for data"  to the "ready to read" state (or not touched if
>>>> it isn't in the "waiting for data" state), which would be fairly easy to
>>>> implement and shouldn't have a lot of performance issues, but would make
>>>> handling requests in parallel fairly complicated. In the latter case,
>>>> handling requests in parallel should be fairly trivial (we just read
>>>> requests from a socket and create a new state machine instance), but
>>>> doing so quickly might be an issue.
>>>
>>> Maybe you need more than one. I would think there should be 2 "state
>>> machines" for server and 2 for clients:
>>
>> I should note that while I'm not opposed, in principle, to implementing
>> the client side in this library, it's not where my focus will lie, at
>> least not initially. After all, most useful nbd client implementations
>> will just use the kernel and don't need a state machine of their own
>> (though I'm sure there will be exceptions).
> 
> Isn't the initial negotiation done solely in the userspace client and
> only the requests in kernel? I think at least that part should be in a
> lib.

No. nbd-client is, and will always be, a very small footprint program,
since it needs to be able to run from initrd and similar things. I
intend to use libglib's helper functions extensively inside libnbd; that
would incur a massive footprint increase. That simply is Not An Option(TM).

(I might add much of what nbd-client does as separate calls in libnbd
for the benefit of others, but again, that won't be where my initial
focus would lie)

>>> -----------------------------------------------------------------------
>>>
>>> - server / listener
>>>
>>> This accepts new connects and handles feature negotiation with the client.
>>> After negotiation a new device "state machine" is created.
>>>
>>> - device
>>>
>>> This handles the recieving and replying to requests and the state of a
>>> running connection. A device has a queue of requests and functions to
>>> handle them.
>>
>> I think you misunderstand what I meant when I said "the socket" above.
>> Yes, we'll need multiple state machines at any rate, one per client and
>> one for the listening socket. But in that case, each state machine would
>> be linked to one specific socket; the "server / listener" state machine
>> that you're talking about above is just linked to the "main" server
>> socket, whereas the "device" ones are linked to individual client's
>> sockets. In theory, that could be implemented with two different state
>> machine implementations, but I'm not sure we actually need that.
> 
> Ok.
> 
> And yes, the generic state machine implementation could be the same.
> Just the data used to create the machine would differ.

Or it could be the same state machine, just at a different state. It
doesn't matter all that much at this point.

>> What I meant with a state machine linked to a request is that we'd have
>> some piece of code which loops over select() calls, and reads in data
>> from a socket. If that looks like a request, it creates a *new* state
>> machine just for that request, ensures it has all the data that it needs
>> (which may involve reading more from the socket than just the request
>> header), and starts it. This state machine is then responsible for
>> handling all the "figure out where data is", "read data from disk", and
>> "write data back to socket" function calls. Doing so would make it much
>> easier for us to handle requests in parallel (if we're not finished
>> reading from a particular backend file yet, we can just go do something
>> else in the mean time).
> 
> But wouldn't that all be the server implementation?

Yes, well, I want to move almost everything from the server into the
library. Pretty much the only things that would stay would be main()
(for obvious reasons), command line parsing, and parse_cfile().
Everything else would move to the library.

> I don't see where
> that would interact with the libnbd at all. The libnbd hands the
> server code the request after it was received. The server then does
> whatever with the request and at some points returns it to the libnbd
> for sending back. Any state changes in the request inbetween those two
> would be by the server and for the server.

That would result in a much lighter library. While fine, it's not where
I want to take this.

> The server could use a state machine or threads or do one request at a
> time or something else. I don't think the libnbd should dictate what
> that is.

Oh, certainly not. I want to make it so that you can completely ignore
the state machine if you choose; all the functions that are usually
called by the state machine would be exported to library users.

But there's no reason why everyone should have to write their own main
loop, if they're not interested.

> If the server uses a state machine then the alloc_request() callback
> would create a new one. It could be declared as
> 
> struct my_state_machine {
>   struct nbd_request; // must be first
>   int my_state;
>   void * foo;
> };
> 
> The libnbd would then will in the struct nbd_request and eventually
> call handle_request(). That then starts the state machine.

I'm not sure I understand what you're on about here. Can you elaborate?

> libnbd could provide a default server implementation that is state
> machine based like that.
> 
> The idea should be that libnbd provides a number of parts that you can
> put together to make an NBD server.

Yes, that's the idea.

> And you can replace some of the
> parts with your own implementation. So that one can easily extend or
> replace existing parts.
> 
> My idea would be more obejct oriented. If you don't want the default
> write function you supply your own. If you want to extend it your own
> implementation can call the original function inbetween doing its own
> thing.
> 
> Your idea seem to be that you reprogramm the state machine to include
> new states. Maybe it is just a different way of saying the same.

Probably :-)

[...]
> I just wouldn't require that requests are handled as state machine.
> The interface between receiving and sending requests and handling them
> don't interact much. It's just create, handle, reply, free. 4 little
> functions that will be called in that order.

Well, here's the thing: "handle" in current nbd-server consists of:
- Check the type of the request (read/write/trim/flush/disconnect)
- read data from the socket, if this is a write request
- figure out if we're using copy-on-write; if so, redirect the relevant
parts of the request to the diff file
- figure out if we're using the multiple file option; if so, distribute
the request to the relevant files from our "multiple" export.
- Read data from the correct file into a buffer

I suppose the "write data to socket" part would go into the "reply" above.

That's a lot of stuff to do for a "little" function. The current code
just has the function which implements the first bullet point in the
above list call the next function. That works, I suppose, but it's
extremely non-flexible.

If we're going to allow people to replace (or disable?) some of that
functionality, they can't be directly calling eachother anymore. I think
a state machine there is the most flexible way to make that possible
(although I could certainly be wrong about that).

> So I don't think it would
> be hard to make this flexible enough to allow any kind of
> implementation of the handling.

Well, that's the goal -- the big question is how to do that :-)

>>> -----------------------------------------------------------------------
>>>
>>> - connecter
>>>
>>> This is the other side for the server / listener and handles feature
>>> negotiation with the server. After negotiation a client "state
>>> machine" is created.
>>>
>>> - client
>>>
>>> This is handles sending requests to the server and recieving replies
>>> for a running client connection. A client has a queue for requests and
>>> callbacks for events (request finsihed / failed).
>>>
>>> -----------------------------------------------------------------------
>>>
>>> I'm not sure a state machine is the right thing for device / client
>>> for the user.
>>
>> I'm not saying I will certainly expose the state machine. But to do
>> things right, I'll need some better way of keeping state than what we're
>> doing now, and doing that right requires some design. That's what I'm
>> trying to do here :-)
>>
>> A proper state machine should give us the required flexibility to allow
>> calls that would do something along the lines of "call this function
>> after reading from the socket but before writing to disk, so I can
>> implement my own copy-on-write scheme" or some such. Such a call should
>> indeed not need to know about the internals of the whole system, it
>> should just be something along the lines of a
>> "nbd_add_hook_before(handle, NBD_WRITE, funcptr)", where handle could be
>> something that points to a specific export, the system as a whole, or
>> something in between.
> 
> Or
> 
> struct nbd_request_ops ops = {
>   .data_read = nbd_default_ops.data_read;
>   .data_write = cow_write;

That can't work. The "default" read would read from the original file,
the write would write to the diff file. The client will now ask for its
data, and get the original data back.

>   ...
> };
> 
> void cow_write(...) {
>   if (need_cow) {
>     // do COW
>   }
>   nbd_default_ops.data_write(...);
> }

So now we're back to "first function calls the next". Makes it pretty
inflexible.

> I'm pretty sure hook_before/after isn't enough. A
> nbd_replace_hook(handle, NBD_WRITE, funcptr) would be needed too.

The idea would be that the COW function wouldn't do the actual reading
or writing; it would only change data in the request so that the next
function, which does the actual writing, redirects its writes to the
copy-on-write file. Additionally, it would change data so that
subsequent reads would *get* the data from the copy-on-write file if
we're trying to read from a region that was written to before in this
session, but get it from the original file if not.

[...]

-- 
Copyshops should do vouchers. So that next time some bureaucracy
requires you to mail a form in triplicate, you can mail it just once,
add a voucher, and save on postage.



Reply to: