
Re: kirkwood mv643xx_eth wrong tcp checksums



On Wed, Nov 25, 2009 at 08:18:46PM +0100, David Fröhlich wrote:

> > The general idea here would be to do this in mv643xx_eth.c:
> > -               if (unlikely(tag_bytes & ~12)) {
> > +               if (unlikely(tag_bytes & ~12) || skb->len > MAGIC_VALUE) {
> > 
> > If you set MAGIC_VALUE to 1514, regular MTU should still work and
> > still use hw checksum offload, and jumbo MTU should start to work,
> > albeit with sw checksumming.
> 
> Instead of the latest 2.6.32-rc8 kernel, I have now compiled the 2.6.30
> kernel that ships with Debian squeeze, with your workaround, and it works!

Odd.  Let me figure out why it isn't working on 2.6.32-rc.
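
For anyone following along, that test sits in txq_submit_skb() in
drivers/net/mv643xx_eth.c.  Sketched from memory of the 2.6.3x driver
(the exact context may vary between versions, and MAGIC_VALUE is just a
placeholder for the 1514 byte cutoff), the workaround in context looks
roughly like this:

	if (skb->ip_summed == CHECKSUM_PARTIAL) {
		int tag_bytes;

		/*
		 * tag_bytes is the number of VLAN tag bytes between
		 * the Ethernet header and the IP header.  The tx
		 * descriptor can only describe 0/4/8/12 bytes of
		 * extra MAC header, so anything else already falls
		 * back to software checksumming -- the workaround
		 * simply makes frames longer than MAGIC_VALUE take
		 * the same fallback path.
		 */
		tag_bytes = skb_network_offset(skb) - ETH_HLEN;
		if (unlikely(tag_bytes & ~12) || skb->len > MAGIC_VALUE) {
			if (skb_checksum_help(skb) == 0)
				goto no_csum;
			kfree_skb(skb);
			return 1;
		}
		...
	}

The fallback itself is straightforward: skb_checksum_help() computes the
checksum in software and clears CHECKSUM_PARTIAL, after which the frame
is submitted without the hardware checksum descriptor bits.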


> It is now possible to set any MTU between 1500 and 9000 without any
> problems. I ran some TCP benchmarks (transmitting from the QNAP ARM
> device to an Ubuntu x64 server); here are the results:
> 
> nas:~# ifconfig eth0 mtu 1500
> nas:~# iperf -N -c server.lan.froehlich.ws -t 60
> ------------------------------------------------------------
> Client connecting to server.lan.froehlich.ws, TCP port 5001
> TCP window size:   148 KByte (default)
> ------------------------------------------------------------
> [  3] local 172.24.0.70 port 41403 connected with 172.24.0.71 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0-60.0 sec  3.26 GBytes    466 Mbits/sec
> nas:~# ifconfig eth0 mtu 9000
> nas:~# iperf -N -c server.lan.froehlich.ws -t 60
> ------------------------------------------------------------
> Client connecting to server.lan.froehlich.ws, TCP port 5001
> TCP window size: 73.9 KByte (default)
> ------------------------------------------------------------
> [  3] local 172.24.0.70 port 56614 connected with 172.24.0.71 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0-60.0 sec  5.78 GBytes    827 Mbits/sec
> 
> 
> As you can see, TCP throughput has almost doubled with jumbo frames,
> even though software instead of hardware now calculates all the TCP
> checksums.
> 
> 
> Output of top with 9000 MTU:
> 
> top - 20:11:14 up  1:05,  2 users,  load average: 1.13, 0.36, 0.17
> Tasks:  91 total,   1 running,  90 sleeping,   0 stopped,   0 zombie
> Cpu(s):  1.7%us, 38.6%sy,  0.0%ni,  0.7%id,  0.0%wa,  2.0%hi, 57.1%si,
> Mem:    515924k total,   209248k used,   306676k free,    57092k buffers
> Swap:  1646620k total,        0k used,  1646620k free,    82104k cached
> 
> and with 1500 MTU:
> 
> top - 20:12:36 up  1:07,  2 users,  load average: 0.55, 0.33, 0.17
> Tasks:  91 total,   1 running,  90 sleeping,   0 stopped,   0 zombie
> Cpu(s):  4.0%us, 24.8%sy,  0.0%ni,  0.0%id,  0.0%wa,  5.0%hi, 66.2%si,
> Mem:    515924k total,   223844k used,   292080k free,    57188k buffers
> Swap:  1646620k total,        0k used,  1646620k free,    98436k cached
> 
> 
> It seems as if the system is under full load even when using hardware
> checksumming, so the hardware checksum engine does not really have any
> advantage over sw checksumming at 9k MTU.

This might be because you're not using zero-copy: the data has to be
copied at least once (from userspace into kernel buffers) anyway, so by
the time we call skb_checksum_help() the data is still warm in the L2
cache, or something like that.
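
For what it's worth, the software fallback is just one extra sequential
pass over the payload doing 16-bit one's-complement addition.  An
illustrative userspace rendering of that arithmetic (the kernel uses
hand-optimized per-architecture code, not this loop):

	#include <stdint.h>
	#include <stddef.h>

	/*
	 * RFC 1071 Internet checksum -- the same arithmetic the
	 * kernel's software checksum path performs over the data.
	 */
	static uint16_t inet_checksum(const void *data, size_t len)
	{
		const uint8_t *p = data;
		uint32_t sum = 0;

		while (len > 1) {
			sum += ((uint32_t)p[0] << 8) | p[1];
			p += 2;
			len -= 2;
		}
		if (len)		/* odd trailing byte */
			sum += (uint32_t)p[0] << 8;

		while (sum >> 16)	/* fold carries back in */
			sum = (sum & 0xffff) + (sum >> 16);

		return (uint16_t)~sum;
	}

Every byte gets touched exactly once, so if the payload is still warm in
cache from the userspace copy, that extra pass costs relatively little.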

