
Re: Network Bandwidth issue in VLAN-Router



Hi,

There is good news: on a freshly installed buster system, the issue seems to be
gone :-)

root@home-buster:~# uname -a
Linux home-buster 4.19.0-5-armmp #1 SMP Debian 4.19.37-5+deb10u1 (2019-07-19) armv7l GNU/Linux
root@home-buster:~# ethtool eth0
Settings for eth0:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  10baseT/Half 10baseT/Full 
                                             100baseT/Half 100baseT/Full 
                                             1000baseT/Full 
        Link partner advertised pause frame use: Symmetric Receive-only
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: d
        Wake-on: d
        Link detected: yes
root@home-buster:~# iperf3 -c scw -R
iperf3: error - unable to connect to server: Invalid argument
root@home-buster:~# iperf3 -c scw.bokomoko.de -R
Connecting to host scw.bokomoko.de, port 5201
Reverse mode, remote host scw.bokomoko.de is sending
[  5] local 2a02:8070:898f:e400:d263:b4ff:fe00:325c port 35822 connected to 2001:bc8:4700:2300::c:a11 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   973 KBytes  7.97 Mbits/sec                  
[  5]   1.00-2.00   sec  4.04 MBytes  33.9 Mbits/sec                  
[  5]   2.00-3.00   sec  15.1 MBytes   126 Mbits/sec                  
[  5]   3.00-4.00   sec  25.3 MBytes   212 Mbits/sec                  
[  5]   4.00-5.00   sec  24.1 MBytes   202 Mbits/sec                  
[  5]   5.00-6.00   sec  25.3 MBytes   212 Mbits/sec                  
[  5]   6.00-7.00   sec  25.3 MBytes   212 Mbits/sec                  
[  5]   7.00-8.00   sec  25.3 MBytes   212 Mbits/sec                  
[  5]   8.00-9.00   sec  25.2 MBytes   212 Mbits/sec                  
[  5]   9.00-10.00  sec  25.3 MBytes   212 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   199 MBytes   167 Mbits/sec    3             sender
[  5]   0.00-10.00  sec   196 MBytes   164 Mbits/sec                  receiver

iperf Done.
root@home-buster:~#

It is important that flow control is enabled on the switch (or whatever device
is connected to the cubox-i); otherwise I end up with much lower bandwidth.
Note that the actual bandwidth then depends on the latency of the connection:
the higher the latency, the lower the remaining bandwidth. For example:

root@home-buster:~# iperf3 -c scw.bokomoko.de -R
Connecting to host scw.bokomoko.de, port 5201
Reverse mode, remote host scw.bokomoko.de is sending
[  5] local 2a02:8070:898f:e400:d263:b4ff:fe00:325c port 35828 connected to 2001:bc8:4700:2300::c:a11 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   227 KBytes  1.86 Mbits/sec                  
[  5]   1.00-2.00   sec   254 KBytes  2.08 Mbits/sec                  
[  5]   2.00-3.00   sec   407 KBytes  3.34 Mbits/sec                  
[  5]   3.00-4.00   sec   356 KBytes  2.91 Mbits/sec                  
[  5]   4.00-5.00   sec   331 KBytes  2.71 Mbits/sec                  
[  5]   5.00-6.00   sec   339 KBytes  2.78 Mbits/sec                  
[  5]   6.00-7.00   sec   519 KBytes  4.25 Mbits/sec                  
[  5]   7.00-8.00   sec   381 KBytes  3.12 Mbits/sec                  
[  5]   8.00-9.00   sec   287 KBytes  2.35 Mbits/sec                  
[  5]   9.00-10.00  sec   406 KBytes  3.32 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  3.50 MBytes  2.93 Mbits/sec   48             sender
[  5]   0.00-10.00  sec  3.42 MBytes  2.87 Mbits/sec                  receiver

iperf Done.
root@home-buster:~# 
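One way to picture the latency dependence (an assumption on my side, not something I measured): if only a fixed amount of data can be in flight per round trip, the achievable rate is roughly window / RTT. A minimal sketch; the function name max_mbits and the window and RTT values are made up for illustration and are not taken from this setup:

```shell
# Upper bound on throughput for a fixed amount of data in flight:
#   rate [Mbit/s] = window [bytes] * 8 / RTT [s] / 1e6
# $1 = window in bytes, $2 = round-trip time in milliseconds.
# Both example values below are hypothetical.
max_mbits() {
    awk -v w="$1" -v rtt="$2" \
        'BEGIN { printf "%.2f\n", w * 8 / (rtt / 1000) / 1000000 }'
}

max_mbits 14600 10   # ten 1460-byte segments per 10 ms round trip
max_mbits 14600 40   # same window, 40 ms round trip
```

With the same window, a four times longer round trip gives a four times lower bound, which is the kind of behaviour I mean above.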



I am surprised that a stretch system with a backports kernel behaves so
differently. I thought the workaround for the hardware issue in the imx6
( https://boundarydevices.com/i-mx6-ethernet/ ) was handled inside the kernel,
but stretch looks quite different (no pause frame support, ~100x lower bandwidth):

root@home:~# uname -a
Linux home 4.19.0-0.bpo.5-armmp #1 SMP Debian 4.19.37-4~bpo9+1 (2019-06-19) armv7l GNU/Linux
root@home:~# ls -l /boot/dtb-4.19.0-0.bpo.5-armmp 
lrwxrwxrwx 1 root root 43 Jul 27 23:21 /boot/dtb-4.19.0-0.bpo.5-armmp -> dtbs/4.19.0-0.bpo.5-armmp/imx6q-cubox-i.dtb
root@home:~# ethtool eth0
Settings for eth0:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Link partner advertised link modes:  10baseT/Half 10baseT/Full 
                                             100baseT/Half 100baseT/Full 
                                             1000baseT/Full 
        Link partner advertised pause frame use: Symmetric Receive-only
        Link partner advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: d
        Wake-on: d
        Link detected: yes
root@home:~#
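
To compare the two ethtool dumps quickly, it may help to pull out only the pause-frame lines. A minimal sketch; the function name pause_fields is my own invention, and it only filters text in the format shown above:

```shell
# Hypothetical helper: print only the pause-frame lines of
# `ethtool <iface>` output read from stdin, without leading whitespace.
pause_fields() {
    grep -i 'pause frame use' | sed 's/^[[:space:]]*//'
}

# Example using lines copied from the stretch output above.
pause_fields <<'EOF'
        Supported pause frame use: No
        Advertised pause frame use: No
        Link partner advertised pause frame use: Symmetric Receive-only
        Speed: 1000Mb/s
EOF
```

On a live system this would be used as `ethtool eth0 | pause_fields`. In addition, `ethtool -a eth0` shows the RX/TX pause settings currently in effect.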

root@home:~# iperf3 -c scw -R
Connecting to host scw, port 5201
Reverse mode, remote host scw is sending
[  4] local 2a02:8070:898f:e400:d263:b4ff:fe00:325c port 49792 connected to 2001:bc8:4700:2300::c:a11 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   261 KBytes  2.14 Mbits/sec                  
[  4]   1.00-2.00   sec   219 KBytes  1.79 Mbits/sec                  
[  4]   2.00-3.00   sec   273 KBytes  2.24 Mbits/sec                  
[  4]   3.00-4.00   sec   202 KBytes  1.66 Mbits/sec                  
[  4]   4.00-5.00   sec   202 KBytes  1.66 Mbits/sec                  
[  4]   5.00-6.00   sec   172 KBytes  1.41 Mbits/sec                  
[  4]   6.00-7.00   sec   198 KBytes  1.62 Mbits/sec                  
[  4]   7.00-8.00   sec   370 KBytes  3.03 Mbits/sec                  
[  4]   8.00-9.00   sec   269 KBytes  2.20 Mbits/sec                  
[  4]   9.00-10.00  sec   325 KBytes  2.66 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  2.49 MBytes  2.09 Mbits/sec   55             sender
[  4]   0.00-10.00  sec  2.49 MBytes  2.09 Mbits/sec                  receiver

iperf Done.
root@home:~#
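
To put a number on the difference, the receiver summary lines of the buster run and the stretch run can be compared directly. A small sketch, assuming both summaries are reported in Mbits/sec as they are here; the helper name receiver_mbits is hypothetical:

```shell
# Hypothetical helper: print the receiver bitrate (the number in front of
# "Mbits/sec") from an iperf3 summary read from stdin.
receiver_mbits() {
    awk '$NF == "receiver" {
             for (i = 2; i <= NF; i++)
                 if ($i == "Mbits/sec") print $(i - 1)
         }'
}

# Receiver summary lines copied from the buster and stretch runs above.
buster=$(echo '[  5]   0.00-10.00  sec   196 MBytes   164 Mbits/sec                  receiver' | receiver_mbits)
stretch=$(echo '[  4]   0.00-10.00  sec  2.49 MBytes  2.09 Mbits/sec                  receiver' | receiver_mbits)

# Ratio of the two receiver bitrates, rounded to a whole number.
awk -v a="$buster" -v b="$stretch" 'BEGIN { printf "buster/stretch: %.0fx\n", a / b }'
```

Averaged over the whole 10 seconds this gives a factor of about 78; comparing against the 212 Mbits/sec steady state of the buster run gives roughly 100.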

The interesting question is whether an upgrade to buster will also fix the
problem (what I showed above is a fresh install).

Thanks for reading that far :-)
Rainer


> Hi,
> 
> In the meantime I found
> 
> https://boundarydevices.com/i-mx6-ethernet/
> 
> which most likely describes the issue I see; in particular, the transfer
> rate degradation to about 3 Mbits/s matches what I see:
> 
> root@linaro-nano:~# tsecs=2 incr=200 ./bwtest.sh
> ----------bandwidth 200
> [  4]  0.0- 2.0 sec  48.1 MBytes   203 Mbits/sec   0.061 ms  164/34479 (0.48%)
> [  3]  0.0- 2.0 sec  48.3 MBytes   203 Mbits/sec   0.034 ms    0/34483 (0%)
> ----------bandwidth 400
> [  4]  0.0- 2.0 sec  96.5 MBytes   405 Mbits/sec   0.040 ms   67/68911 (0.097%)
> [  3]  0.0- 1.9 sec  93.9 MBytes   406 Mbits/sec   0.035 ms 1990/68965 (2.9%)
> ----------bandwidth 600
> [  4]  0.0- 2.0 sec   110 MBytes   460 Mbits/sec   0.030 ms  234/78615 (0.3%)
> [  3]  0.0- 2.3 sec   110 MBytes   410 Mbits/sec  15.672 ms 26703/105262 (25%)
> ----------bandwidth 800
> [  4]  0.0- 2.0 sec   110 MBytes   461 Mbits/sec   0.033 ms    0/78511 (0%)
> [  3]  0.0- 2.2 sec  2.91 MBytes  11.1 Mbits/sec  101.865 ms 140266/142342 (99%)
> ----------bandwidth 1000
> [  4]  0.0- 2.0 sec   110 MBytes   461 Mbits/sec   0.033 ms    0/78383 (0%)
> [  3]  0.0- 0.2 sec  90.4 KBytes  3.18 Mbits/sec  110.420 ms 141295/141358 (1e+02%)
> 
> in addition, I see
> 
> root@home:~# ethtool eth0
> Settings for eth0:
>         Supported ports: [ TP MII ]
>         Supported link modes:   10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Full
>         Supported pause frame use: No
>         Supports auto-negotiation: Yes
>         Advertised link modes:  10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Full
>         Advertised pause frame use: No
>         Advertised auto-negotiation: Yes
>         Link partner advertised link modes:  10baseT/Half 10baseT/Full
>                                              100baseT/Half 100baseT/Full
>                                              1000baseT/Full
>         Link partner advertised pause frame use: Symmetric Receive-only
>         Link partner advertised auto-negotiation: Yes
>         Speed: 1000Mb/s
>         Duplex: Full
>         Port: MII
>         PHYAD: 0
>         Transceiver: internal
>         Auto-negotiation: on
>         Supports Wake-on: d
>         Wake-on: d
>         Link detected: yes
> root@home:~#
> 
> This is the same output as in their "before" scenario, the one that causes
> the bandwidth degradation.
> 
> Is anybody else seeing this with an imx6 device like the cubox-i?
> 
> Thanks
> Rainer
> 
> PS:
> What still puzzles me is that I see this issue only when I leave the subnet.
> Possibly there are other mechanisms that limit the traffic in case of
> overruns on a local network, but here I am guessing (?)
> 
> On Sunday, 28 July 2019, 05:46:36 CEST, Nicholas Geovanis wrote:
> > I can tell you that I have precisely this issue in Chicago. But the fact is
> > that for me it was a result of rate-limiting at the IP provider ATT. It is
> > not necessarily related directly, but senior citizens :-) may recall the
> > differential up/down bandwidth on ISDN.
> > At my last apartment I had fiber directly into my bedroom. Here it is over
> > copper to the building wiring. I took a 25% hit on bandwidth up and down. I
> > yelled at them for a rate reduction, but no dice.
> > 
> > On Sat, Jul 27, 2019, 8:24 AM Rainer Dorsch <ml@bokomoko.de> wrote:
> > > Hi,
> > > 
> > > I have a stretch box configured as a VLAN router (Cubox-i2ex). There is a
> > > drastic difference between the bandwidth of the uplink (VLAN 1) and the
> > > downlinks (VLAN 2 to 7):
> > > 
> > > On 192.168.7.1 (VLAN 7: eth0.7) I see around 9 MB/s in a simple test:
> > > rd@home:~$ wget -O /dev/null http://fs/debian-9.3.0-amd64-netinst.iso
> > > [...] (9.08 MB/s)
> > > rd@home:~$
> > > 
> > > On 192.168.0.30 (VLAN 1: eth0.1) I see less than 10% of that:
> > > rd@home:~$ wget -O /dev/null https://git.kernel.org/torvalds/t/linux-5.3-rc1.tar.gz
> > > --2019-07-27 14:46:38--  https://git.kernel.org/torvalds/t/linux-5.3-rc1.tar.gz
> > > [...] (339KB/s)
> > > 
> > > To prove that it has nothing to do with the uplink itself (a Fritzbox
> > > 6430), I connected another machine to the same VLAN 1 (192.168.0.203). So
> > > overall, the network looks like this:
> > > 
> > > 
> > > Internet
> > >    |
> > > Fritz-Box
> > >    |       192.168.0.203
> > >    |--------------------------------------- x86
> > >    |
> > >    | 192.168.0.30
> > > Cubox i
> > >    |
> > >    | 192.168.7.*
> > > 
> > > Note that the Cubox-i has only one physical interface; the diagram shows
> > > the virtual interfaces.
> > > 
> > > The x86 machine also reaches a much higher network bandwidth:
> > > 
> > > rd@h370:~/tmp.nobackup$ wget -O /dev/null https://git.kernel.org/torvalds/t/linux-5.3-rc1.tar.gz
> > > [...] (5,49 MB/s)
> > > rd@h370:~/tmp.nobackup$
> > > 
> > > I ran ifstat to confirm that no other traffic is consuming all the
> > > bandwidth:
> > > 
> > > rd@home:/etc/shorewall$ ifstat -i eth0.1 1
> > >       eth0.1
> > >  KB/s in  KB/s out
> > >   495.00     16.34
> > >   417.14     16.03
> > >   484.33     16.53
> > >   384.80     11.96
> > >   393.33     12.67
> > >   632.59     17.68
> > >   607.90     17.91
> > >   354.39     12.00
> > >   678.58     20.97
> > >  1119.24     26.88
> > >  1185.20     27.27
> > >   925.91     21.84
> > >  1245.82     27.88
> > >   940.69     26.06
> > >  1023.72     26.89
> > >  1114.13     26.33
> > >   997.74     24.56
> > >   876.72     19.49
> > >  1167.56     27.73
> > >   906.41     24.30
> > >  1127.62     27.36
> > >   919.79     20.01
> > >   915.31     20.95
> > >   990.86     23.97
> > >  1119.22     26.94
> > >   905.54     26.40
> > >  1143.21     28.44
> > >  1096.15     26.98
> > >   924.62     24.89
> > >  1076.53     24.87
> > >  1004.04     23.99
> > >   811.11     23.13
> > >   983.71     24.46
> > >   885.05     23.19
> > >  1052.26     43.33
> > >  1230.55     37.11
> > >  1517.61     33.67
> > >   818.60     24.37
> > >  1057.24     26.63
> > >  1131.38     26.47
> > >  1278.43     30.12
> > >  1123.24     24.31
> > >   788.14     21.74
> > >   757.56     23.86
> > >  1135.29     27.91
> > >  1161.76     25.15
> > >  1465.32     32.04
> > >  1175.41     26.16
> > >  1371.36     31.56
> > >   811.73     21.70
> > >   540.97     16.91
> > >   381.78     20.95
> > >   306.44     13.07
> > >   378.02     12.93
> > >   603.65     16.67
> > >   418.31     16.35
> > >   393.71     16.46
> > >   479.89     15.05
> > >   436.74     13.85
> > >   395.96     12.05
> > >   476.41     14.51
> > >   470.09     18.23
> > >   322.02     12.20
> > >   427.35     14.33
> > >   464.25     14.39
> > >   404.19     11.09
> > >   999.41     24.39
> > >   931.23     24.89
> > > 
> > > Also top does not show any obvious overload:
> > > 
> > > rd@home:~$ top
> > > top - 14:51:53 up 34 days,  1:18,  3 users,  load average: 1.10, 1.11, 1.24
> > > Tasks: 177 total,   2 running, 125 sleeping,   0 stopped,   0 zombie
> > > %Cpu(s):  5.2 us,  5.2 sy,  0.0 ni, 87.9 id,  0.2 wa,  0.0 hi,  1.5 si,  0.0 st
> > > KiB Mem :  2062764 total,    89308 free,   239480 used,  1733976 buff/cache
> > > KiB Swap:  4726172 total,  4726172 free,        0 used.  1740396 avail Mem
> > > 
> > >   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> > > 13341 rd        20   0   12072   7196   3716 S   3.6  0.3   0:01.33 wget
> > >  7221 root      20   0   62236   3508   2468 S   2.9  0.2 394:36.44 owserver
> > >  1257 rd        20   0    4036   2304   2052 S   2.0  0.1 945:39.61 reed-contact-mo
> > >  7687 rd        20   0    4712   2508   1596 S   2.0  0.1 571:54.03 monitor.sh
> > > 13597 rd        20   0    7212   2900   2276 R   1.6  0.1   0:00.53 top
> > >  1040 asterisk  20   0  221968  58292  30316 S   1.3  2.8 565:37.76 asterisk
> > >    17 root      20   0       0      0      0 S   0.7  0.0 434:16.89 ksoftirqd/1
> > >    10 root      20   0       0      0      0 R   0.3  0.0  94:31.09 rcu_sched
> > >   202 root      20   0   27144   6192   5760 S   0.3  0.3  89:57.07 systemd-journal
> > >  8678 root       0 -20       0      0      0 I   0.3  0.0   0:22.47 kworker/2:2H-kb
> > > 21622 root      20   0       0      0      0 I   0.3  0.0   0:01.57 kworker/u8:3-ev
> > > 32581 root       0 -20       0      0      0 I   0.3  0.0   0:18.03 kworker/0:1H-kb
> > >     1 root      20   0   26672   5236   3852 S   0.0  0.3   2:42.16 systemd
> > >     2 root      20   0       0      0      0 S   0.0  0.0   0:07.14 kthreadd
> > >     3 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
> > >     4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
> > >     8 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 mm_percpu_wq
> > >     9 root      20   0       0      0      0 S   0.0  0.0   4:43.82 ksoftirqd/0
> > >    11 root      20   0       0      0      0 I   0.0  0.0   0:00.00 rcu_bh
> > >    12 root      rt   0       0      0      0 S   0.0  0.0   2:43.11 migration/0
> > > 
> > > 
> > > So in summary:
> > > -> The uplink of the cubox to the internet is slow (<10% of the available
> > > bandwidth)
> > > -> The cubox can handle much higher traffic on the physical interface
> > > (eth0), as shown on VLAN 7
> > > -> Another x86 host can push much higher traffic to the Internet
> > > 
> > > Any idea what could restrict the bandwidth on the Cubox uplink is very
> > > welcome. Ideas for diagnosing the issue further would also be useful to
> > > me.
> > > 
> > > Many thanks
> > > Rainer
> > > 
> > > 
> > > --
> > > Rainer Dorsch
> > > http://bokomoko.de/


-- 
Rainer Dorsch
http://bokomoko.de/


