Re: Wheezy experience

Hi Steven,
many thanks for your info.

On 04/12/2013 11:41 PM, Steven Chamberlain wrote:

* iostat (dstat, etc.) does not show any disk activity. Since the main

On ZFS maybe 'zpool iostat' helps.

Yes, but I would also like to see the I/O activity of the root disk (SSD).

[...] the rc "nfsd
start" refuses to start because the nfsd is already running.

The rc initscript started it both times.  Once from rcS.d and again from
rc2.d - that is a known bug #700245 but the fix hasn't been accepted
into wheezy.

I thought the error message is harmless - nfsd should still be running
after that error?

And it should always be using the options from /etc/default/nfsd?

Yes, you are right: the rc scripts start all the daemons with the correct options. My assumption about why the NFS server did not start correctly was wrong. In a default setup it really does not work, though. I did a clean reboot and compared the running NFS-related daemons:

before the reboot (correct behavior):
root      7592     1 0 Apr10 ?        00:00:02 /sbin/rpcbind -w
root      6918     1 0 Apr10 ?        00:00:00 /usr/sbin/rpc.statd
root      6877     1 0 Apr10 ?        00:00:00 /usr/sbin/rpc.lockd
root      6949 6948 0 Apr10 ?        00:04:10 /usr/sbin/nfsd -o -t -u
root      6948     1 0 Apr10 ?        00:00:00 /usr/sbin/nfsd -o -t -u
root      7516     1 0 Apr10 ?        00:00:02 /usr/sbin/mountd

after the reboot (wrong behavior - cannot mount on clients):
root      1870     1 0 00:49 ?        00:00:00 /usr/sbin/rpc.statd
root      1587     1 0 00:49 ?        00:00:00 /usr/sbin/rpc.lockd
root      1560     1 0 00:49 ?        00:00:00 /usr/sbin/rpc.statd
root      1384     1 0 00:49 ?        00:00:00 /sbin/rpcbind -w
root      1615 1614 0 00:49 ?        00:00:00 /usr/sbin/nfsd -o -t -u
root      1614     1 0 00:49 ?        00:00:00 /usr/sbin/nfsd -o -t -u
root      1506     1 0 00:49 ?        00:00:00 /usr/sbin/mountd

This looks OK; however, the rpcinfo output is more telling in this case:

before the reboot (correct behavior):
# rpcinfo |grep "tcp "
    100000    4    tcp       0.0.0.0.0.111          portmapper superuser
    100000    3    tcp       0.0.0.0.0.111          portmapper superuser
    100000    2    tcp       0.0.0.0.0.111          portmapper superuser
    100021    0    tcp       0.0.0.0.2.186          nlockmgr   superuser
    100021    1    tcp       0.0.0.0.2.186          nlockmgr   superuser
    100021    3    tcp       0.0.0.0.2.186          nlockmgr   superuser
    100021    4    tcp       0.0.0.0.2.186          nlockmgr   superuser
    100024    1    tcp       0.0.0.0.2.240          status     superuser
    100003    2    tcp       0.0.0.0.8.1            nfs        superuser
    100003    3    tcp       0.0.0.0.8.1            nfs        superuser
    100005    1    tcp       0.0.0.0.3.191          mountd     superuser
    100005    3    tcp       0.0.0.0.3.191          mountd     superuser

after the reboot (nfs and nlockmgr are no longer registered):
# rpcinfo |grep "tcp "
    100000    4    tcp       0.0.0.0.0.111          portmapper superuser
    100000    3    tcp       0.0.0.0.0.111          portmapper superuser
    100000    2    tcp       0.0.0.0.0.111          portmapper superuser
    100005    1    tcp       0.0.0.0.3.205          mountd     superuser
    100005    3    tcp       0.0.0.0.3.205          mountd     superuser
    100024    1    tcp       0.0.0.0.3.60           status     superuser
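The comparison above can be automated. This is a minimal sketch, assuming the column layout shown in the rpcinfo listings (netid in column 3, service name in column 5); the function name missing_rpc_services is made up for illustration and is not part of any package.

```shell
#!/bin/sh
# Sketch: report which of the wanted RPC services are not registered over TCP.
# Reads `rpcinfo` output on stdin; the wanted service names are the arguments.
missing_rpc_services() {
    # Column 3 is the netid and column 5 the service name (layout as above).
    registered=$(awk '$3 == "tcp" { print $5 }' | sort -u)
    missing=""
    for svc in "$@"; do
        if ! printf '%s\n' "$registered" | grep -qxF -- "$svc"; then
            missing="$missing $svc"
        fi
    done
    # Unquoted on purpose: prints the names separated by single spaces.
    echo $missing
}
```

Usage would be something like `rpcinfo | missing_rpc_services nfs nlockmgr mountd status`, which prints the names that are not registered, or an empty line if all of them are.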

I tried to install bootlogd to get all the boot messages, but got:
bootlogd: ioctl(/dev/ttyp0, TIOCCONS): Bad address

Nevertheless, I realized that the following services need to be restarted after every reboot to get a fully working NFS server:

/etc/init.d/nfsd restart
/etc/init.d/mountd restart
/etc/init.d/rpc.lockd restart

So I assumed that the order in which the daemons are started during boot is most likely not ideal. I therefore reviewed the rc scripts and found a slight logical dependency issue. I made these changes:

rpc.statd -> add "rpcbind" into "Required-Start:"
mountd -> add "rpcbind" into "Required-Start:"
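To verify such a change, one can check the LSB header of an init script for a given facility. A minimal sketch, assuming the usual "# Required-Start:" comment line; the helper name lsb_requires is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed if an init script's LSB header lists a facility
# in its Required-Start line.
lsb_requires() {
    script=$1
    facility=$2
    # Extract the Required-Start value, put one word per line, and look
    # for the facility as an exact, literal match.
    sed -n 's/^# Required-Start:[[:space:]]*//p' "$script" |
        tr -s '[:space:]' '\n' |
        grep -qxF -- "$facility"
}
```

For example, `lsb_requires /etc/init.d/mountd rpcbind && echo ok` should print "ok" once the header has been edited.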

Another issue was that mountd was started before the ZFS filesystems were mounted. So I added +zfs to $local_fs in /etc/insserv.conf and then ran insserv, but it did not help: zfs still did not appear in /etc/init.d/.depend.boot. I had to change /etc/init.d/zfs:

< # Provides:          zvol zfs
> # Provides:          zfs zvol

I do not fully understand why it did not work without this change (a bug in insserv?), but it was necessary for the dependencies in .depend.boot to be generated correctly.
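A quick way to inspect the generated ordering is to print what a given service depends on. This is only a sketch: it assumes the dependency file uses makefile-style "target: dep dep" lines, and the helper name deps_of is made up.

```shell
#!/bin/sh
# Sketch: print the services a target is ordered after, according to an
# insserv-generated dependency file (assumed format: "target: dep1 dep2").
deps_of() {
    depfile=$1
    target=$2
    # Split each line at the first ": " and print the right-hand side
    # when the left-hand side is exactly the requested target.
    awk -F': *' -v t="$target" '$1 == t { print $2 }' "$depfile"
}
```

With the change in place, `deps_of /etc/init.d/.depend.boot mountd` would be expected to list zfs among the dependencies, provided mountd is ordered in that file.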

However, it still did not work after a reboot. I found that the rc scripts rpc.lockd, rpc.statd, and nfsd call the corresponding daemon binary regardless of whether it is already running. Unfortunately, the second run, initiated from rc2.d, unregistered the services (nfs, nlockmgr) from rpcbind! This was the main problem.

There are tests in the rc scripts checking whether the daemon is already running or not, e.g.:

start-stop-daemon --start --quiet --pidfile /var/run/nfsd.pid --exec /usr/sbin/nfsd --test

however, no such pid file exists, so the test does not detect the running daemon and lets it be started again on the following line:

start-stop-daemon --start --quiet --pidfile /var/run/nfsd.pid --exec /usr/sbin/nfsd -- -o -t -u

So I amended the test lines in those 3 rc scripts to:

< start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $DAEMON --test > /dev/null \
> start-stop-daemon --start --quiet --exec $DAEMON --test > /dev/null \

After another reboot, everything works fine. So two main changes were needed to make the NFS server work with ZFS:

1. define correct mountd dependencies during boot
2. prevent starting nfs related daemons twice
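The second change boils down to a guard that matches a running daemon by its executable rather than by a pid file. A minimal sketch of that logic follows; the function name start_once and the SSD variable are illustrative only and do not come from the actual rc scripts.

```shell
#!/bin/sh
# Sketch: start a daemon only if no process is already running its executable.
# SSD is a variable only so that the logic can be tried with a stand-in command.
SSD=${SSD:-start-stop-daemon}

start_once() {
    daemon=$1
    shift
    # Without --pidfile, the --test run matches processes by --exec and
    # exits non-zero when one is already running.
    if "$SSD" --start --quiet --exec "$daemon" --test > /dev/null; then
        "$SSD" --start --quiet --exec "$daemon" -- "$@"
    else
        echo "$daemon already running, not starting it again"
    fi
}
```

Called as `start_once /usr/sbin/nfsd -o -t -u`, the second invocation from rc2.d would then leave the already running nfsd alone.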

Regards
Vaclav