Hi Steven,

many thanks for your info.

On 04/12/2013 11:41 PM, Steven Chamberlain wrote:
Yes, however it would help me to see the io activities of the root disk (SSD).

[...]

>> the rc "nfsd start" refuses to start because the nfsd is already running.
>
> The rc initscript started it both times. Once from rcS.d and again from rc2.d - that is a known bug #700245 but the fix hasn't been accepted into wheezy.
>
> I thought the error message is harmless - nfsd should still be running after that error? And it should always be using the options from /etc/default/nfsd?

Yes, you are right, it starts all the daemons from the rc scripts with all the options correctly - my presumption about why the nfs server did not start correctly was wrong. And it really does not start correctly in a default setup.

I made a clean reboot and compared the running nfs-related daemons:

before the reboot (correct behavior):

  root  7592     1  0 Apr10 ?  00:00:02 /sbin/rpcbind -w
  root  6918     1  0 Apr10 ?  00:00:00 /usr/sbin/rpc.statd
  root  6877     1  0 Apr10 ?  00:00:00 /usr/sbin/rpc.lockd
  root  6949  6948  0 Apr10 ?  00:04:10 /usr/sbin/nfsd -o -t -u
  root  6948     1  0 Apr10 ?  00:00:00 /usr/sbin/nfsd -o -t -u
  root  7516     1  0 Apr10 ?  00:00:02 /usr/sbin/mountd

after the reboot (wrong behavior - cannot mount on clients):

  root  1870     1  0 00:49 ?  00:00:00 /usr/sbin/rpc.statd
  root  1587     1  0 00:49 ?  00:00:00 /usr/sbin/rpc.lockd
  root  1560     1  0 00:49 ?  00:00:00 /usr/sbin/rpc.statd
  root  1384     1  0 00:49 ?  00:00:00 /sbin/rpcbind -w
  root  1615  1614  0 00:49 ?  00:00:00 /usr/sbin/nfsd -o -t -u
  root  1614     1  0 00:49 ?  00:00:00 /usr/sbin/nfsd -o -t -u
  root  1506     1  0 00:49 ?  00:00:00 /usr/sbin/mountd

This looks ok, however the rpcinfo output is more important in this case:

before the reboot (correct behavior):

  # rpcinfo |grep "tcp "
  100000  4  tcp  0.0.0.0.0.111  portmapper  superuser
  100000  3  tcp  0.0.0.0.0.111  portmapper  superuser
  100000  2  tcp  0.0.0.0.0.111  portmapper  superuser
  100021  0  tcp  0.0.0.0.2.186  nlockmgr    superuser
  100021  1  tcp  0.0.0.0.2.186  nlockmgr    superuser
  100021  3  tcp  0.0.0.0.2.186  nlockmgr    superuser
  100021  4  tcp  0.0.0.0.2.186  nlockmgr    superuser
  100024  1  tcp  0.0.0.0.2.240  status      superuser
  100003  2  tcp  0.0.0.0.8.1    nfs         superuser
  100003  3  tcp  0.0.0.0.8.1    nfs         superuser
  100005  1  tcp  0.0.0.0.3.191  mountd      superuser
  100005  3  tcp  0.0.0.0.3.191  mountd      superuser

after the reboot:

  # rpcinfo |grep "tcp "
  100000  4  tcp  0.0.0.0.0.111  portmapper  superuser
  100000  3  tcp  0.0.0.0.0.111  portmapper  superuser
  100000  2  tcp  0.0.0.0.0.111  portmapper  superuser
  100005  1  tcp  0.0.0.0.3.205  mountd      superuser
  100005  3  tcp  0.0.0.0.3.205  mountd      superuser
  100024  1  tcp  0.0.0.0.3.60   status      superuser

I tried to install bootlogd to get all the boot messages, but got:

  bootlogd: ioctl(/dev/ttyp0, TIOCCONS): Bad address

Nevertheless, I realized that the following services need to be restarted after every reboot to make the nfs server fully working:

  /etc/init.d/nfsd restart
  /etc/init.d/mountd restart
  /etc/init.d/rpc.lockd restart

So I assumed that most probably the order in which the daemons are started during boot is just not ideal. I therefore reviewed the rc scripts and found a slight logical dependency issue. I made these changes:

  rpc.statd -> add "rpcbind" into "Required-Start:"
  mountd    -> add "rpcbind" into "Required-Start:"

Another issue was that mountd was started before the zfs filesystems were mounted. So I added +zfs into $local_fs (/etc/insserv.conf) and then ran insserv, but it did not help - zfs still did not appear in /etc/init.d/.depend.boot.
I needed to change /etc/init.d/zfs:

  < # Provides: zvol zfs
  > # Provides: zfs zvol

I do not fully understand why it did not work without this change (bug in insserv?), however it was necessary for me in order to generate the dependencies in .depend.boot correctly.

However, it still did not work after a reboot. I found that starting the rc scripts rpc.lockd, rpc.statd, and nfsd led to calling the corresponding daemon binary regardless of whether it was already running. Unfortunately, the second run initiated from rc2.d unregistered the services (nfs, nlockmgr) from rpcbind! And this was the main problem.

There are tests in the rc scripts checking whether the daemon is already running, e.g.:

  start-stop-daemon --start --quiet --pidfile /var/run/nfsd.pid --exec /usr/sbin/nfsd --test

however there is no such pid file, therefore the test does not detect the running daemon and allows it to be started again in the following line:

  start-stop-daemon --start --quiet --pidfile /var/run/nfsd.pid --exec /usr/sbin/nfsd -- -o -t -u

So I amended the test lines in those 3 rc scripts to:

  < start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $DAEMON --test > /dev/null \
  > start-stop-daemon --start --quiet --exec $DAEMON --test > /dev/null \

After another reboot, everything works fine. So 2 main changes were done to make the nfs server work with zfs:

1. define correct mountd dependencies during boot
2. prevent starting nfs related daemons twice

Regards
Vaclav
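
P.S. In case it helps anyone reproducing this, here is a minimal sketch of how the rpcinfo comparison above could be checked automatically after a reboot. It is only an illustration: the function name missing_rpc_services and the EXPECTED list are assumptions of mine (the list is simply the set of services registered in the working state), and the function only reads rpcinfo-style text on stdin without restarting anything.

```shell
#!/bin/sh
# Sketch: report which expected RPC services are NOT registered over TCP.
# Input: rpcinfo output on stdin, with the columns
#   program version netid address service owner
# EXPECTED is an assumed list, copied from the working rpcinfo output.
EXPECTED="portmapper nlockmgr status nfs mountd"

missing_rpc_services() {
    # Collect the unique service names registered with netid "tcp"
    # (tcp6 and udp rows are ignored on purpose).
    registered=$(awk '$3 == "tcp" { print $5 }' | sort -u)
    missing=""
    for svc in $EXPECTED; do
        # grep -x matches whole lines only, so "nfs" cannot match "nfs_acl".
        if ! printf '%s\n' "$registered" | grep -qx "$svc"; then
            missing="$missing $svc"
        fi
    done
    # Print the missing names separated by spaces (an empty line if none).
    printf '%s\n' "${missing# }"
}
```

It would be used as "rpcinfo | missing_rpc_services". Fed with the rpcinfo output taken after the reboot, it should list nlockmgr and nfs, which are the two services that were missing there.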