
[Nbd] transparent handling of nbd reconnections at kernel level



( This is my first post to this list; apologies for my poor English and/or
for breaking any list rules I don't know about )

Hi, all

I'm trying to build an NBD server cluster from several servers
(same image on every one :-) with the keepalived daemon providing the
virtual server IP and load balancing.

Everything works fine: connection works, server failover works, load
balancing works... but the client side fails when it has to reconnect
to the new nbd server.

I use Ubuntu 12.04 with the latest Ubuntu kernel (3.8.21) and the nbd
server and client from git (version 3.3).

To isolate the problem I've used this scenario:
* NBD server
* nbd-client launched with this command line:
 jantonio$ sudo /sbin/nbd-client -N ltsp_i386 nbdserver /dev/nbd0 -t 6 -persist -nofork
* On the client:
 mount -r -t squashfs /dev/nbd0 /mnt/nbd

By issuing "service nbd-server stop" and then restarting the service, I
can see that nbd-client detects the server failure and reconnects
without problems:

........
jantonio@...1346...:~$  sudo /sbin/nbd-client -N ltsp_i386
binubuntu2 /dev/nbd0 -t 6 -persist -nofork
Negotiation: ..size = 5953MB
bs=1024, sz=6243049472 bytes
timeout=6
nbd,4768: Kernel call returned: 32 Reconnecting
Error: Socket failed: Connection refused
Exiting.
 Reconnecting
[...]
Error: Socket failed: Connection refused
Exiting.
 Reconnecting
Negotiation: ..size = 5953MB
bs=1024, sz=6243049472 bytes
timeout=6
...............

But the mounted squashfs fails if I try to perform any operation
(e.g. "ls /mnt/nbd") while the reconnection is in progress.

I've also tested with "dd" instead of "mount": as soon as the server
socket closes, dd aborts regardless of the "conv=noerror" dd option,
instead of waiting for the reconnect.

In the first test, the mounted filesystem becomes unusable; in the
latter, dd aborts and cannot complete the operation. In both cases,
nbd-client detects the loss of connection and reconnects successfully,
but the kernel module just closes the device, which is no longer
available, if anything tries to use it during the reconnection.
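
For the dd case, a user-space workaround might be a read loop that
retries instead of aborting. The following is only a minimal sketch: it
assumes that reads fail with EIO while the socket is down and that the
device becomes readable again once nbd-client has reconnected, neither
of which is verified here. The function name read_with_retry and its
parameters are hypothetical.

```python
import errno
import os
import time


def read_with_retry(path, offset, length, retries=10, delay=1.0,
                    sleep=time.sleep):
    """Read `length` bytes at `offset` from `path`, retrying on EIO.

    The device is reopened on every attempt, on the assumption that a
    descriptor opened before the disconnect may keep failing afterwards.
    """
    if retries < 1:
        raise ValueError("retries must be >= 1")
    last_error = None
    for _ in range(retries):
        try:
            fd = os.open(path, os.O_RDONLY)
            try:
                return os.pread(fd, length, offset)
            finally:
                os.close(fd)
        except OSError as exc:
            # Assumption: EIO is the error seen while the server is away.
            # Anything else is treated as a real failure and re-raised.
            if exc.errno != errno.EIO:
                raise
            last_error = exc
            sleep(delay)
    raise last_error
```

For example, read_with_retry("/dev/nbd0", 0, 4096) would keep trying
for roughly ten seconds before giving up. This only hides the problem
from one reader; it does nothing for a mounted filesystem.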

So my question:
Is there any way to make the nbd kernel module wait until the client
executes the finish_sock() routine, which tells it the new socket to
talk to, instead of immediately returning an I/O error?

Perhaps a new ioctl() or nbd flag option to tell the kernel that the
client is in "persist mode" and that it should wait instead of
returning?
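
To make clear where the gap is, here is a rough sketch of the hand-off
a persistent client performs, as far as I understand it. It is not
copied from nbd-client. The ioctl names come from <linux/nbd.h>; the
numeric values are computed assuming _IO(type, nr) == (type << 8) | nr,
which holds on x86 but not on every architecture. serve_persistently
and the connect callback are hypothetical names.

```python
import fcntl

# _IO(0xab, nr) from <linux/nbd.h>, assuming the x86 ioctl encoding.
NBD_SET_SOCK = (0xAB << 8) | 0
NBD_DO_IT = (0xAB << 8) | 3


def serve_persistently(nbd_fd, connect, ioctl=fcntl.ioctl):
    """Keep handing freshly connected sockets to the kernel.

    `connect` is a hypothetical callback that connects and negotiates
    with a server and returns a socket, or None to give up.
    Returns the number of sessions that were served.
    """
    sessions = 0
    while True:
        # Requests issued while we are in connect() are the ones that
        # fail in the tests above: the kernel has no socket to use.
        sock = connect()
        if sock is None:
            return sessions
        ioctl(nbd_fd, NBD_SET_SOCK, sock.fileno())
        try:
            # Blocks until the connection drops or the device is
            # disconnected.
            ioctl(nbd_fd, NBD_DO_IT)
        except OSError:
            pass  # connection lost; loop around and reconnect
        sock.close()
        sessions += 1
```

The proposal is that, in a "persist mode", the kernel would queue
requests between the return of NBD_DO_IT and the next NBD_SET_SOCK
instead of failing them.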

Thanks in advance
Juan Antonio
