Re: kirkwood mv643xx_eth wrong tcp checksums
On Wed, Nov 25, 2009 at 08:18:46PM +0100, David Fröhlich wrote:
> > The general idea here would be to do this in mv643xx_eth.c:
> > - if (unlikely(tag_bytes & ~12)) {
> > + if (unlikely(tag_bytes & ~12) || skb->len > MAGIC_VALUE) {
> >
> > If you set MAGIC_VALUE to 1514, regular MTU should still work with
> > hw checksum offload, and jumbo MTU should start to work, albeit
> > with sw checksumming.
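For illustration, here's a standalone userspace harness (not driver
code) that evaluates the patched predicate for a few made-up
(tag_bytes, frame length) pairs.  MAGIC_VALUE = 1514 is the 14-byte
Ethernet header plus the standard 1500-byte MTU, and my reading is
that tag_bytes is the IP header offset past the Ethernet header, which
the hardware can only encode when it is 0, 4, 8 or 12:

/* Userspace illustration only, not driver code.  MAGIC_VALUE = 1514
 * is ETH_HLEN (14) plus the standard 1500 byte MTU. */
#include <stdio.h>

#define MAGIC_VALUE 1514

/* Nonzero when the driver would have to fall back to software
 * checksumming (skb_checksum_help()) instead of hw offload. */
static int needs_sw_csum(int tag_bytes, unsigned int len)
{
        /* tag_bytes & ~12 is nonzero unless tag_bytes is 0, 4, 8 or
         * 12, i.e. unless the IP header offset is one the hardware
         * can encode. */
        return (tag_bytes & ~12) || len > MAGIC_VALUE;
}

int main(void)
{
        static const int cases[][2] = {
                { 0, 1514 },    /* untagged, standard MTU -> hw offload */
                { 0, 9014 },    /* untagged, 9000 MTU     -> sw csum    */
                { 4, 1518 },    /* VLAN tagged, std MTU   -> sw csum,
                                 * since 1518 > MAGIC_VALUE             */
        };
        unsigned int i;

        for (i = 0; i < sizeof(cases) / sizeof(cases[0]); i++)
                printf("tag_bytes=%d len=%d -> %s\n",
                       cases[i][0], cases[i][1],
                       needs_sw_csum(cases[i][0], cases[i][1]) ?
                       "sw checksum" : "hw offload");
        return 0;
}

Note that with MAGIC_VALUE = 1514, a full-size VLAN-tagged frame
(1518 bytes) also takes the sw path; bump the constant if that matters.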
>
> Instead of the latest 2.6.32-rc8 kernel, I have now compiled the 2.6.30
> kernel that ships with Debian squeeze, with your workaround, and it works!
Odd. Let me figure out why it isn't working on 2.6.32-rc.
> It is now possible to set any MTU between 1500 and 9000 without any
> problems. I ran some TCP benchmarks (transmitting from the QNAP ARM
> device to an Ubuntu x64 server); here are the results:
>
> nas:~# ifconfig eth0 mtu 1500
> nas:~# iperf -N -c server.lan.froehlich.ws -t 60
> ------------------------------------------------------------
> Client connecting to server.lan.froehlich.ws, TCP port 5001
> TCP window size: 148 KByte (default)
> ------------------------------------------------------------
> [ 3] local 172.24.0.70 port 41403 connected with 172.24.0.71 port 5001
> [ ID] Interval Transfer Bandwidth
> [ 3] 0.0-60.0 sec 3.26 GBytes 466 Mbits/sec
> nas:~# ifconfig eth0 mtu 9000
> nas:~# iperf -N -c server.lan.froehlich.ws -t 60
> ------------------------------------------------------------
> Client connecting to server.lan.froehlich.ws, TCP port 5001
> TCP window size: 73.9 KByte (default)
> ------------------------------------------------------------
> [ 3] local 172.24.0.70 port 56614 connected with 172.24.0.71 port 5001
> [ ID] Interval Transfer Bandwidth
> [ 3] 0.0-60.0 sec 5.78 GBytes 827 Mbits/sec
>
>
> As you can see, TCP throughput has almost doubled with jumbo frames,
> even though software instead of hardware now calculates all the TCP
> checksums.
>
>
> output of top, 9000 mtu
>
> top - 20:11:14 up 1:05, 2 users, load average: 1.13, 0.36, 0.17
> Tasks: 91 total, 1 running, 90 sleeping, 0 stopped, 0 zombie
> Cpu(s): 1.7%us, 38.6%sy, 0.0%ni, 0.7%id, 0.0%wa, 2.0%hi, 57.1%si,
> Mem: 515924k total, 209248k used, 306676k free, 57092k buffers
> Swap: 1646620k total, 0k used, 1646620k free, 82104k cached
>
> and with 1500 mtu
>
> top - 20:12:36 up 1:07, 2 users, load average: 0.55, 0.33, 0.17
> Tasks: 91 total, 1 running, 90 sleeping, 0 stopped, 0 zombie
> Cpu(s): 4.0%us, 24.8%sy, 0.0%ni, 0.0%id, 0.0%wa, 5.0%hi, 66.2%si,
> Mem: 515924k total, 223844k used, 292080k free, 57188k buffers
> Swap: 1646620k total, 0k used, 1646620k free, 98436k cached
>
>
> Seems as if the system is under full load even when using hardware
> checksumming. So the hardware checksum engine does not really have any
> advantage over sw checksumming with a 9k MTU.
This might be because, since you're not using zero-copy, the data has
to be copied at least once (from userspace to kernel buffers) anyway,
so by the time we call skb_checksum_help() the data is still warm in
the L2 cache, or something like that.
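To put a number on what "sw checksumming" costs: skb_checksum_help()
ends up doing one sequential pass of the standard Internet checksum
(RFC 1071) over the data.  A minimal userspace sketch of that
arithmetic (not the kernel's optimized version):

/* Minimal userspace sketch of the RFC 1071 Internet checksum; the
 * kernel's csum_partial() is per-architecture optimized, but the
 * arithmetic it performs is the same one-pass fold shown here. */
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

static uint16_t inet_csum(const uint8_t *data, size_t len)
{
        uint32_t sum = 0;

        /* Sum the buffer as 16-bit big-endian words. */
        while (len > 1) {
                sum += (uint32_t)data[0] << 8 | data[1];
                data += 2;
                len -= 2;
        }
        if (len)                        /* pad an odd trailing byte */
                sum += (uint32_t)data[0] << 8;

        /* Fold the carries back in, then take the one's complement. */
        while (sum >> 16)
                sum = (sum & 0xffff) + (sum >> 16);
        return (uint16_t)~sum;
}

int main(void)
{
        static uint8_t frame[9000];     /* dummy jumbo-sized payload */

        frame[0] = 0x45;                /* arbitrary test bytes */
        frame[1] = 0x23;
        printf("csum over %zu bytes: 0x%04x\n",
               sizeof(frame), inet_csum(frame, sizeof(frame)));
        return 0;
}

One linear pass per frame is cheap if the copy from userspace has just
pulled the data into cache, which would fit the numbers above.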