Re: Network Performance Degrading over random amount of time
On 2014-04-22 10:38, email@example.com wrote:
On 2014-04-20 23:49, Karl E. Jorgensen wrote:
On Sun, Apr 20, 2014 at 01:01:53PM +0200, firstname.lastname@example.org wrote:
maybe you have a clue about the issues I'm having since several
My home server is running Debian Jessie right now; the network issues
were there with Wheezy as well.
After a fresh boot my network behaves as it should, achieving near
Gbit speeds, which is nice. After a random amount of uptime, though, my
throughput degrades to below 100 Mbit network speeds (about 3.5 MB/s).
I measured using iperf.
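To compare the reported figure with the link speed, it helps to put both in the same unit. A minimal sketch, assuming decimal megabytes (the function name is made up for illustration):

```python
def mbytes_to_mbits(mbytes_per_second):
    """Convert a throughput in MB/s to Mbit/s (1 byte = 8 bits)."""
    return mbytes_per_second * 8


# The degraded rate reported above, expressed in Mbit/s.
degraded = mbytes_to_mbits(3.5)
print(f"{degraded} Mbit/s")
```

By this conversion 3.5 MB/s is well under a third of what even a 100 Mbit link could carry, so the drop is not simply a fallback from gigabit to fast Ethernet.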
You don't explicitly say... Does a reboot "cure" the problem?
Yep, that's exactly what a reboot does for me. I tend to reboot about
once every 2-3 days because of this issue, not something you would
expect from a Unix OS :D
If so, does a "ifdown eth0" + "ifup eth0" have the same effect?
necessary: Unplug and re-plug the cable between "ifdown" and
"ifup"...) [A full reboot is a bit like a sledge hammer... very
I have yet to try this; I will report back when I have the performance
problem again and try it.
Just got the chance to try, and yes, an ifdown eth0 -> cable replug ->
ifup eth0 also cures this problem.
From the point-of-view of the switch, this should be almost
indistinguishable from a full reboot...
Anything in the kernel message log? (e.g. output of "dmesg" or
/var/log/kern.log) It would be interesting if the kernel spat out some
messages around the time of the degradation... E.g. link-level
renegotiation or similar.
Also: Anything interesting in the output of "ifconfig eth0" ? I'm
particularly interested in the counters for errors, dropped, overruns,
frame/carrier counts: These counters may show interesting changes
around the time of the degradation...
I will write this down for the next performance degradation as well.
Output of dmesg looks a bit suspicious:
[40886.039833] irq 16: nobody cared (try booting with the "irqpoll"
[40886.039963] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13-1-amd64 #1
[40886.039965] Hardware name: System manufacturer System Product
Name/P8H67-M PRO, BIOS 3806 08/20/2012
[40886.039967] ffff88040bdca4bc ffffffff814a1327 ffff88040bdca400
[40886.039970] ffff88040bdca400 0000000000000010 0000000000000000
[40886.039973] 0000000000000000 0000000000000000 0000000000000010
[40886.039976] Call Trace:
[40886.039977] <IRQ> [<ffffffff814a1327>] ? dump_stack+0x41/0x51
[40886.039988] [<ffffffff810aa4a8>] ? __report_bad_irq+0x28/0xc0
[40886.039991] [<ffffffff810aa93a>] ? note_interrupt+0x1ba/0x210
[40886.039994] [<ffffffff810a8471>] ?
[40886.039997] [<ffffffff810a8593>] ? handle_irq_event+0x33/0x50
[40886.040000] [<ffffffff810ab358>] ? handle_fasteoi_irq+0x58/0x100
[40886.040004] [<ffffffff81014388>] ? handle_irq+0x18/0x30
[40886.040007] [<ffffffff81013f20>] ? do_IRQ+0x40/0xb0
[40886.040011] [<ffffffff814a6ead>] ? common_interrupt+0x6d/0x6d
[40886.040012] <EOI> [<ffffffff8107eb47>] ?
[40886.040019] [<ffffffff81388c9a>] ? cpuidle_enter_state+0x4a/0xc0
[40886.040022] [<ffffffff81388db9>] ? cpuidle_idle_call+0xa9/0x1d0
[40886.040025] [<ffffffff8101adb5>] ? arch_cpu_idle+0x5/0x30
[40886.040028] [<ffffffff810a777e>] ? cpu_startup_entry+0xbe/0x280
[40886.040032] [<ffffffff8103c484>] ? start_secondary+0x1d4/0x230
[40886.040173] [<ffffffffa010a7a0>] ata_bmdma_interrupt [libata]
[40886.040336] [<ffffffffa00a5450>] e1000_intr [e1000]
[40886.040506] Disabling IRQ #16
IRQ16 is related to eth0 according to /proc/interrupts:
16: 3164992 3462922 0 0 IO-APIC-fasteoi
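The excerpt above shows the kernel giving up on IRQ 16 and listing both ata_bmdma_interrupt and e1000_intr as handlers on that line, which would be consistent with the throughput drop. A minimal sketch for spotting such events in saved dmesg output follows; the function name and the regular expressions are assumptions based only on the two message formats visible in this thread:

```python
import re

# Patterns modelled on the two kernel messages quoted in this thread.
NOBODY_CARED = re.compile(r"\[\s*(\d+\.\d+)\]\s+irq (\d+): nobody cared")
DISABLED = re.compile(r"\[\s*(\d+\.\d+)\]\s+Disabling IRQ #(\d+)")


def find_irq_events(dmesg_text):
    """Return (seconds_since_boot, irq_number, kind) for each matching line."""
    events = []
    for line in dmesg_text.splitlines():
        for kind, pattern in (("nobody_cared", NOBODY_CARED),
                              ("disabled", DISABLED)):
            match = pattern.search(line)
            if match:
                events.append((float(match.group(1)),
                               int(match.group(2)), kind))
    return events


sample = (
    '[40886.039833] irq 16: nobody cared (try booting with the "irqpoll"\n'
    "[40886.039976] Call Trace:\n"
    "[40886.040506] Disabling IRQ #16\n"
)
for seconds, irq, kind in find_irq_events(sample):
    print(f"IRQ {irq}: {kind} at {seconds / 3600:.1f} h after boot")
```

Comparing the timestamp (roughly 11.4 hours after boot here) with the moment iperf first shows the slowdown would indicate whether the two events coincide.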
Output of ifconfig looks unsuspicious, a few dropped packets but nothing
eth0 Link encap:Ethernet HWaddr 00:0e:0c:b9:5e:1d
inet addr:192.168.1.20 Bcast:192.168.1.255
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:11881446 errors:0 dropped:882 overruns:0 frame:0
TX packets:29392900 errors:0 dropped:0 overruns:0 carrier:0
RX bytes:7149517599 (6.6 GiB) TX bytes:69090488843 (64.3 GiB)
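Since the interesting signal is how these counters change around the time of the degradation, one option is to save the ifconfig output before and after and diff the numbers. A minimal sketch, assuming the older net-tools layout shown above (newer ifconfig versions print a different format); both function names are made up for illustration:

```python
import re


def parse_counters(ifconfig_text):
    """Extract RX/TX packet counters from old-style ifconfig output."""
    counters = {}
    for line in ifconfig_text.splitlines():
        parts = line.split()
        # Only the "RX packets:..." and "TX packets:..." lines are parsed;
        # the combined "RX bytes ... TX bytes" line is skipped.
        if (len(parts) < 2 or parts[0] not in ("RX", "TX")
                or not parts[1].startswith("packets:")):
            continue
        for name, value in re.findall(r"(\w+):(\d+)", line):
            counters[f"{parts[0]}_{name}"] = int(value)
    return counters


def counter_deltas(before, after):
    """Return only the counters whose values changed between snapshots."""
    return {key: after[key] - before[key]
            for key in after
            if key in before and after[key] != before[key]}


snapshot = parse_counters(
    "RX packets:11881446 errors:0 dropped:882 overruns:0 frame:0\n"
    "TX packets:29392900 errors:0 dropped:0 overruns:0 carrier:0\n"
    "RX bytes:7149517599 (6.6 GiB) TX bytes:69090488843 (64.3 GiB)\n"
)
print(snapshot["RX_dropped"])
```

A jump in the error, dropped or overrun deltas between a "good" and a "bad" snapshot would point at the link or driver; unchanged counters would point elsewhere.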
- Asus P8H67-M PRO
- Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz
- 16 GB DDR3 Ram (2*8GB Kingston ram)
- Intel Corporation 82541PI Gigabit Ethernet Controller
- TP-Link 8-Port gbit switches (2 of em between home-server and
*two* switches between server and clients? Sounds a bit unusual - at
least for a home set-up...
Well, I'm a console collector and my Linux server is right behind my
home entertainment area in the living room. That's where the first
switch is located, to hook all the entertainment stuff up to the LAN.
Then there is one uplink line going to the other end of the room where
the second switch is located, connecting my desk PC, laptop, printer, WiFi
AP and internet gateway to the LAN as well.
I've tried different things so far:
- Switched from a switched cabling setup to Crosslink.
Hm... AFAIK modern network cards tend to adjust themselves to both
"normal" and cross-over cables (which I believe that "crosslink"
Yup, I wasn't clear enough here, I guess. I connected the client (one at
a time) to the server directly using a normal off-the-shelf patch
cable; I just call it crosslinking because no switch is in between.
- Swapped out the cheap ASRock motherboard for an Asus one
- Changed from the onboard Realtek network chip to a PCI Intel Gbit card
Hm.. That would likely rule out any network card issues.
- Reinstalled OS several times
.. which would most likely rule out any OS bugs. But not administrator
Yup, I guess (Wheezy and Jessie)
- Testing from different clients (Win 7, Linux Mint, Debian, Ubuntu)
... which would then most likely rule out administrator mistakes: Win7
is sufficiently different from anything else to make it difficult to
make the same mistake across platforms.
And hardware issues on my clients as well, since they all have
different network chipsets
- Downloading vendor drivers and using them instead of the kernel
Nothing so far has worked to get my Gbit speeds stable over a few
When you measure the speed, between which two points do you measure
client -> tp-link -> tp-link -> server
or with direct connection circumventing the switches
client -> server
I'm concerned about the TWO TP-Link switches: The diagnostics you have
done so far do not appear to rule them out.... Does your traffic
have to pass through both of them? If so, how are the switches
IMHO I ruled them out by directly connecting my client(s) one at a
time to the server using a patch cable, circumventing the switches
in question. The reported degradation in network speed happens there
Based on what you have written, my main suspects would be the two
switches - with a focus on the "nearest" switch...
I'm open to ANY suggestions here, even if they involve building a
custom kernel or other magical hackery ;D
Well - it looks like you have put a fair amount of effort into solving
this.... But until the problem is narrowed down, this would probably
be as likely to resolve the problem as a goat sacrifice ... You
haven't got a spare goat, have you? :-)
Mhm, I have a few goats (Long Goat, Feather Goat etc...)
Dunno if they count as spares?
Hope this helps
Yup, definitely. A few more ideas, and the things I should gather to
help debugging are pinpointed :)
 I'm assuming eth0 here....
 A live one would constitute a "hot spare", right? Yeah. Tangent.
Karl E. Jorgensen
Lukas Wingerberg ²