
Re: nfs4 mount options rsize wsize



Hello,

On Mon, May 20, 2013 at 09:45:35AM +0200, Giorgio Pioda wrote:
> On Mon, May 20, 2013 at 08:44:18AM +0200, Petter Reinholdtsen wrote:
> > [Andreas B. Mundt]
> > > Running a test without defining rsize,wsize on 3 different setups, I
> > > got the following (remove rsize,wsize in LDAP and check with 'mount'
> > > after mounting the directory):
> > > 
> > >     virtual machine setup:   rsize=wsize=131072
> > >     real hardware 1      :   rsize=wsize=262144
> > >     real hardware 2      :   rsize=wsize=524288
> > > 
> > > All values are considerably larger than the values defined manually.
> > > It would be nice to understand the reasons why such a small value
> > > has been chosen in debian-edu.
> > 
> > It was chosen because it has been the recommended _large_ value for
> > NFS mounts for probably 20 years. :) The default was very small, 1k or
> > 2k if I remember correctly, so we use a larger value to increase the
> > throughput without causing too much fragmentation in each NFS packet.
> > The last part is most relevant when using UDP-based NFS.
> > 
> > Very interesting to see how the default seems to have changed.  Sounds
> > to me like we can drop the setting now.

We have been running NFS over WLAN, and experienced problems that turned
out to be related to bufferbloat
(http://en.wikipedia.org/wiki/Bufferbloat) in combination with low
bandwidth.

In general, with a single thread, large read and write buffers are good,
and this is why most distros use rather large chunks to be read or
written at once.

The situation changes completely when you have many concurrent reads and
writes on a shared medium. Requests and data in a buffer get delayed
until all read and write operations queued ahead of them are finished.
In our case, this meant that after 10-20 students had logged in,
desktops froze and even connecting to the LDAP database was no longer
possible. Everything happened except productive work: programs
terminated randomly with timeouts, desktops loaded incompletely, and
students were logged out spontaneously. In most cases it was no longer
possible to teach, because logging in successfully already consumed half
of the lesson time.
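To put rough numbers on this (an illustrative back-of-the-envelope
calculation, not a measurement from our network: the link rate and
client count are assumed), consider a shared 54 Mbit/s WLAN with 20
clients each having one full write buffer queued. The head-of-line
delay the last client sees scales directly with the buffer size:

```python
# Illustrative sketch: head-of-line delay on a shared medium.
# Link rate and client count are assumptions, not measurements.
LINK_BPS = 54e6      # nominal 802.11g WLAN rate
CLIENTS = 20         # concurrent NFS clients

def queue_delay(wsize_bytes, clients=CLIENTS, link_bps=LINK_BPS):
    """Time until the last queued client's buffer is on the wire."""
    return clients * wsize_bytes * 8 / link_bps

# One 512 KiB buffer per client: the last client waits about 1.5 s ...
print(round(queue_delay(524288), 2))
# ... while with 4 KiB buffers the whole queue drains in about 12 ms.
print(round(queue_delay(4096) * 1000, 1))
```

Real WLANs are worse than this idealized model (collisions, rate
adaptation), but it shows why small buffers keep the medium responsive
for everyone.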

The solution, after many tests, was at first somewhat surprising: we
reduced rsize and wsize to a very small value (4096) and added the
"sync" mount option, which is known to be very slow on local file
systems but resulted in a big performance boost when used over NFS.
After the changes, bandwidth was shared equally among all clients, with
no more timeouts or sudden logouts.
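For reference, the resulting client mount options looked roughly like
the following /etc/fstab entry (a sketch only: the server name and
export path are placeholders, and in debian-edu the mount options
actually live in LDAP rather than fstab):

```
# sketch of an fstab entry with the reduced buffer sizes and sync
server:/srv/export  /mnt/export  nfs4  rw,sync,rsize=4096,wsize=4096  0  0
```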

While a single workstation surely sees somewhat lower data throughput,
the entire class of 20+ desktops connected as NFS clients was
operational again.

From this experience, we created a HOWTO which you can still find at
https://rp.skolelinux.de/rlp-wiki/bin/view/RlpSkolelinuxPublic/NetworkPerformanceTuning
(I sent this link before in a different context).

Also, we used a local NFS cache (mount option fsc), which is only
possible with newer kernels and xattr file system support. This option
lowers network bandwidth peaks somewhat when a client reads parts of
files that were just written. But the "sync" option and the smaller
rsize and wsize were the client options that yielded the biggest
performance boost.
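For completeness, using fsc on a Debian client also requires the
cachefilesd package to be installed and its daemon enabled; a sketch of
the client-side setup (the export name is a placeholder, and details may
differ between Debian releases):

```
# /etc/default/cachefilesd -- enable the cache daemon
RUN=yes

# then add fsc to the NFS mount options, e.g. in fstab:
server:/srv/export  /mnt/export  nfs4  rw,sync,fsc,rsize=4096,wsize=4096  0  0
```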

Regards
-Klaus

