Re: [Nbd] nbd-client and SIGSTOP/SIGCONT
On Fri, Apr 15, 2011 at 5:50 PM, Ian <coughlan@...866...> wrote:
> However, when I want to shut it down, things get messy. The shutdown
> scripts call killall5, and I have populated the appropriate directory with
> the correct "omit" pids, and I see that killall5 doesn't try to kill either
> the server or the clients. But the nbd-client for the mounted filesystem
> dies anyway, which means the subsequent file IO is broken, and the remaining
> scripts are not executed. I have added debug output to nbd-client, and the
> ioctl(nbd, NBD_DO_IT) call is returning after running killall5. Then I
> wrote a simple test file that doesn't kill any processes at all, but does
> the kill(-1, SIGSTOP) followed by kill(-1, SIGCONT), just as killall5
> does, and the nbd-client still dies. Delving deeper into the nbd kernel
> driver, the wait_event_interruptible() call in nbd_find_request() is
> returning -ERESTARTSYS after the SIGSTOP/SIGCONT sequence has been run.
> This return value is returned from nbd_ioctl() as it should be. I would
> think that the ioctl(nbd, NBD_DO_IT) would be restarted, but the ioctl()
> still returns to the nbd-client. And, of course, the socket and nbd_thread
> have been shut down by the time nbd_ioctl() returns.
Ah, I see...what errno does nbd-client see? Is it ERESTARTSYS or EINTR?
I think we need to quit the ioctl without shutting everything down in
the SIGSTOP case...
$ git diff drivers/block/nbd.c
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index e6fc716..0c426a4 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -667,6 +667,10 @@ static int __nbd_ioctl(struct block_device *bdev, struct nb
 		kthread_stop(thread);
 		mutex_lock(&lo->tx_lock);
+		/* if SIGSTOP occurred, exit here and allow nbd-client to retry
+		   the nbd_do_it ioctl */
+		if (lo->harderror == -ERESTARTSYS)
+			return -ERESTARTSYS;
 		if (error)
 			return error;
 		sock_shutdown(lo, 0);
I think that with this patch, plus userland nbd-client reissuing the
ioctl, the retry would work...
Could you try this patch? I'm not sure whether you also have to patch
nbd-client, or whether the kernel already retries the ioctl for you
automatically.
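If nbd-client does turn out to need a change, the userland half could be
a retry wrapper roughly like the one below. This is only a sketch, not
code from nbd-client: retry_on_eintr, blocking_op, fake_dev and
fake_do_it are names made up for illustration, and it assumes the
interrupted ioctl shows up in userland as -1 with errno set to EINTR
(ERESTARTSYS itself is internal to the kernel and should never reach
userspace). The fake_do_it stand-in only simulates an interrupted call
so the loop can be exercised without an nbd device.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: performs one blocking call and returns -1
 * with errno set on failure, like ioctl() does. */
typedef int (*blocking_op)(void *ctx);

/*
 * Call op(ctx) until it succeeds, fails with an errno other than
 * EINTR, or has already been retried max_retries times.
 * Returns whatever the last call returned.
 */
static int retry_on_eintr(blocking_op op, void *ctx, int max_retries)
{
	int attempts = 0;
	int ret;

	do {
		ret = op(ctx);
	} while (ret < 0 && errno == EINTR && attempts++ < max_retries);

	return ret;
}

/* Stand-in for the device (assumption, for illustration only). */
struct fake_dev {
	int interruptions_left;	/* how many calls will fail with EINTR */
	int calls;		/* how many times fake_do_it was called */
};

/* Stand-in for ioctl(nbd, NBD_DO_IT): interrupted a fixed number of
 * times, then returns 0. */
static int fake_do_it(void *ctx)
{
	struct fake_dev *dev = ctx;

	dev->calls++;
	if (dev->interruptions_left > 0) {
		dev->interruptions_left--;
		errno = EINTR;
		return -1;
	}
	return 0;
}
```

In nbd-client the callback would wrap the real ioctl(nbd, NBD_DO_IT)
instead of fake_do_it. The retry cap is there so a socket that is
really gone does not turn into a busy loop.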
The long-term answer might be to make nbd_do_it its own kernel thread
so this SIGSTOP problem is avoided, but that's a major change to how
nbd works.
--
Paul