Bug#364607: marked as done (horrible nfs read performance)

To: Ben Hutchings <ben@decadent.org.uk>
Subject: Bug#364607: marked as done (horrible nfs read performance)
From: owner@bugs.debian.org (Debian Bug Tracking System)
Date: Wed, 26 Aug 2009 19:03:08 +0000
Message-id: <handler.364607.D364607.125131325716713.ackdone@bugs.debian.org>
References: <1251313254.4429.65.camel@localhost> <1145885694.9181.97.camel@localhost>

Your message dated Wed, 26 Aug 2009 20:00:54 +0100
with message-id <1251313254.4429.65.camel@localhost>
and subject line Re: horrible nfs read performance
has caused the Debian Bug report #364607,
regarding horrible nfs read performance
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
364607: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=364607
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems

--- Begin Message ---

To: submit@bugs.debian.org
Subject: horrible nfs read performance
From: Arthur de Jong <arthur@west.nl>
Date: Mon, 24 Apr 2006 15:34:54 +0200
Message-id: <1145885694.9181.97.camel@localhost>

Subject: nfs-common: horrible nfs read performance
Package: nfs-common
Version: 1:1.0.7-3
Severity: important

If this is a kernel problem rather than an nfs-common problem, please
reassign this bug.

There is a performance problem in the NFS client code when doing reads
over UDP. The problem surfaced after a hardware problem with a network
adapter: the adapter is 1Gbit, but we needed to scale it down to 100Mbit
on the switch. After some investigation it turned out that read
performance over NFS was horrible (NFS writes were OK and raw TCP
traffic scaled down as expected). We have been able to reproduce this
problem on other machines with a different make of Gbit card.

Test system overview:
        OS              kernel            nfs-common HW
 ------ --------------- ----------------- ---------- -------------------
 host1  Debian/testing  2.6.15-1-686-smp  1:1.0.7-3  Dell Optiplex SX280
 host2  Debian/testing  2.6.15-1-686-smp  1:1.0.7-3  Dell Optiplex SX280
 host3  Debian/testing  2.6.14-local-p4   1:1.0.7-9  Dell Optiplex SX280
 host4  Debian/testing  2.6.15-1-686-smp  1:1.0.7-3  Dell Optiplex SX270
 host5  Solaris 8                                    Sun Netra t1
 ------ --------------- ----------------- ---------- -------------------

host3 had the initial problems; the kernel was downgraded there in an
attempt to fix them, as the issue was thought to be related to
http://www.ussg.iu.edu/hypermail/linux/kernel/0603.3/2368.html and
http://www.ussg.iu.edu/hypermail/linux/kernel/0604.0/0381.html. The
SX280s have a Broadcom Corporation NetXtreme BCM5751 Gigabit Ethernet PCI
Express controller; the SX270 has an Intel Corporation 82540EM Gigabit
Ethernet Controller.

The server is a Debian/stable system running a 2.4.32 kernel with
nfs-kernel-server 1:1.0.6-3.1.

The network was in normal use and no attempts were made to flush
buffers etc., so these are not 100% clean results but should be accurate
enough to show the problem.

The results:
 host  speed discard   nfs  nfs write  nfs read
      Mbit/s    sec.  prot       sec.      sec.
 ----- ----- ------- ----- ---------- ---------
 host1  1000       9   udp         33         0
         100      92   udp        101       151
 host2   100      92   udp        101       151
         100           tcp        101         3
 host3   100      91   udp        113        95
         100           tcp        111         3
 host4  1000      12   udp         30         0
         100      91   udp        100       192
        1000           tcp         31         2
 host5   100           tcp        133         2
 ----- ----- ------- ----- ---------- ---------

The discard test writes 1 GByte of data with netcat to the discard port
of the NFS server:
  dd if=/dev/zero bs=1024k count=1024 | nc -q 0 oostc discard
The nfs write test writes 1 GByte of data to a mountpoint:
  time dd if=/dev/zero of=/mnt/tmpfile bs=8k count=131072
The nfs read test reads a file of 30 MByte:
  time dd of=/dev/null if=/mnt/tmpfile bs=8k count=3840
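
As a rough cross-check of the sizes quoted for these commands, here is a
minimal Python sketch. It assumes dd's convention that the "k" suffix
means 1024 bytes; the helper name dd_bytes is purely illustrative and not
part of any tool used in the tests.

```python
# Sizes implied by the dd invocations above (assumption: "k" = 1024 bytes).
KIB = 1024
MIB = 1024 * KIB


def dd_bytes(block_size: int, count: int) -> int:
    """Total number of bytes dd transfers for a given bs and count."""
    return block_size * count


discard_bytes = dd_bytes(1024 * KIB, 1024)  # bs=1024k count=1024
write_bytes = dd_bytes(8 * KIB, 131072)     # bs=8k count=131072
read_bytes = dd_bytes(8 * KIB, 3840)        # bs=8k count=3840

# Print each size in MiB: 1024, 1024 and 30 respectively.
print(discard_bytes // MIB, write_bytes // MIB, read_bytes // MIB)
```

So the discard and write tests both move 1 GByte and the read test moves
30 MByte, matching the descriptions above.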

From the tests it can be seen:
- discard performance goes down from about 100 MByte/s to 11 MByte/s
  with a network downgrade to 100 Mbit (no surprise here)
- NFS write performance goes down from 33 MByte/s to 10 MByte/s
  with the same network downgrade (also no surprise)
- Linux NFS write performance is similar to Solaris (also as expected)
- NFS read performance (over UDP) goes down from about 100+ MByte/s to
  0.2 MByte/s with a network downgrade, while Solaris stays at 15
  MByte/s (unexpected and a problem!!!)
- NFS read performance over TCP is as can be expected and does not
  suffer from the bad performance
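
The read figures in this list follow from dividing the 30 MByte file size
by the "nfs read" times in the results table. A minimal sketch of that
arithmetic, assuming 1 MByte = 1024 * 1024 bytes; the helper name
mib_per_sec and the choice of rows are illustrative only, and a reported
time of 0 sec. is treated as below the timer's resolution rather than as
infinite throughput.

```python
from typing import Optional

READ_MIB = 30  # size of the file read in the nfs read test


def mib_per_sec(size_mib: float, seconds: float) -> Optional[float]:
    """Average throughput in MiB/s, or None if the time rounds to zero."""
    if seconds <= 0:
        return None
    return size_mib / seconds


# (host, link speed in Mbit/s, protocol, nfs read time in sec.),
# copied from the results table.
read_rows = [
    ("host1", 1000, "udp", 0),
    ("host1", 100, "udp", 151),
    ("host2", 100, "tcp", 3),
    ("host4", 100, "udp", 192),
    ("host5", 100, "tcp", 2),
]

for host, speed, proto, secs in read_rows:
    rate = mib_per_sec(READ_MIB, secs)
    shown = "too fast to measure" if rate is None else f"{rate:.1f} MiB/s"
    print(f"{host} {speed:>4} Mbit {proto}: {shown}")
```

The same helper applied to the discard test, mib_per_sec(1024, 92), gives
roughly 11 MiB/s, in line with the figure quoted for the 100 Mbit link.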

We have switched to NFS over TCP and are currently evaluating it (we're
happy with that solution for now).

-- System Information:
Debian Release: testing/unstable
  APT prefers testing
  APT policy: (500, 'testing'), (50, 'unstable')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.15-1-686-smp
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_GB (charmap=ISO-8859-1)

Versions of packages nfs-common depends on:
ii  debconf       1.4.72                     Debian configuration management sy
ii  libc6         2.3.6-3                    GNU C Library: Shared libraries an
ii  libcomerr2    1.38+1.39-WIP-2005.12.31-1 common error description library
ii  libevent1     1.1a-1                     An asynchronous event notification
ii  libkrb53      1.4.3-6                    MIT Kerberos runtime libraries
ii  libnfsidmap1  0.13-1                     An nfs idmapping library
ii  libwrap0      7.6.dbs-9                  Wietse Venema's TCP wrappers libra
ii  portmap       5-18                       The RPC portmapper
ii  sysvinit      2.86.ds1-13                System-V-like init utilities

nfs-common recommends no packages.

-- no debconf information

-- 
-- arthur de jong - arthur@west.nl - west consulting b.v. --

--- End Message ---

--- Begin Message ---

To: 364607-done@bugs.debian.org
Subject: Re: horrible nfs read performance
From: Ben Hutchings <ben@decadent.org.uk>
Date: Wed, 26 Aug 2009 20:00:54 +0100
Message-id: <1251313254.4429.65.camel@localhost>
In-reply-to: <1251274721.3964.10.camel@luik>
References: <1244305515.21215.73.camel@deadeye> <20090813211337.GA19817@galadriel.inutil.org> <1251274721.3964.10.camel@luik>

On Wed, 2009-08-26 at 10:18 +0200, Arthur de Jong wrote:
> On Thu, 2009-08-13 at 23:13 +0200, Moritz Muehlenhoff wrote:
> > On Sat, Jun 06, 2009 at 05:25:15PM +0100, Ben Hutchings wrote:
> > > Using TCP for NFS is the default and is generally recommended.
> > > 
> > > The problem you originally reported involved poorer performance for the
> > > clients using a gigabit link than those using a 100-megabit link
> > > <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=364607#5>.  Based on
> > > your latest results
> > > <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=364607#32>, this
> > > appears to have been fixed.
> > > 
> > > So, if you still think you are seeing "horrible read performance", what
> > > are you comparing with?
> > 
> > Arthur, what did you compare against?
> 
> The server in the first test was a Debian (lenny?) server running a
> 2.4.32 kernel with nfs-kernel-server 1:1.0.6-3.1. The latest test was
> done with Debian/etch as server with 2.6.18-6-686 and nfs-common
> 1:1.0.10-6+etch.1.
> 
> In the latest test I didn't put any of the machines back to 1Gbit but
> for host2 and host3 read performance over UDP is considerably slower
> than over TCP (both hosts have Gbit interfaces but are patched at
> 100Mbit).

Poorer NFS performance over UDP is expected and is not a bug.  I don't
believe there is any good reason to use NFS over UDP today, and the
Linux NFS client uses TCP by default.

Ben.

-- 
Ben Hutchings
If at first you don't succeed, you're doing about average.

--- End Message ---
