
Re: Spreading NIC interrupts across multiple CPUs



On 3/26/2014 10:30 PM, Stan Hoeppner wrote:
> This is an 8 core machine with HT enabled, 16 logical CPUs, so right off
> the bat it is dramatically different than the Compaq machine below as
> far as the kernel is concerned and how scheduling is performed.  The
> current mask may or may not be correct for this configuration.  I never
> use HT and I can't find any docs about HT and /proc/irq/xx/smp_affinity.

Agreed on finding the docs; it was nigh impossible. I found a way to offload the traffic for that server, made a few changes in the BIOS (C-states, HT, etc.), and booted it back up. It didn't seem to change much about the spreading, but that's fine.
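
For reference, decoding a mask into a CPU list is straightforward; here's a quick sketch of the sort of check I mean (the IRQ number is just a placeholder for whatever /proc/interrupts shows for the NIC on a given box):

    # Read one IRQ's smp_affinity mask and list the logical CPUs it covers.
    # IRQ 45 is a placeholder.
    IRQ = 45
    with open("/proc/irq/%d/smp_affinity" % IRQ) as f:
        mask = int(f.read().strip().replace(",", ""), 16)
    cpus = [c for c in range(mask.bit_length()) if (mask >> c) & 1]
    print("IRQ %d: mask 0x%x -> logical CPUs %s" % (IRQ, mask, cpus))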

> And to this point, it's not usually a good idea to spread interrupts
> round robin from any device evenly across all cores in a system.  This
> is inefficient as each core must load the ISR for every interrupt.  This
> decreases the effectiveness of L1/L2 caches on all cores, causing
> additional cache misses for other processes executing on those cores.
> This is precisely why irqbalance was created.

A couple of things on this. I did see what you're talking about WRT spreading the interrupts across the processors. However, I noticed one thing: irqbalance is configured to specifically exempt ethernet/network interfaces from its balancing. I'm not sure whether that's to prevent what I was seeing on the HP system from happening inadvertently, or to make sure the queues all stay on the same processor. That leads to my next question: in the case of a NIC with multiple queues, should all of the queues for a given interface be on a single CPU (an actual CPU, not an HT sibling)? (Answered in the next paragraph.)
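
In case it helps anyone following along, this is roughly how I enumerate the per-queue vectors for an interface; just a sketch, and "eth0" and the eth0-N queue names are placeholders for whatever the driver registers in the last column of /proc/interrupts:

    # List the per-queue IRQs for one interface from /proc/interrupts.
    # "eth0" is a placeholder; multi-queue drivers typically register
    # vectors with names like eth0-0, eth0-1, ...
    IFACE = "eth0"
    with open("/proc/interrupts") as f:
        for line in f:
            fields = line.split()
            if fields and fields[-1].startswith(IFACE):
                print("IRQ %s -> %s" % (fields[0].rstrip(":"), fields[-1]))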


>> However, the Dell is using
>> CPU0 exclusively for the ethernet device interrupts, while the HP
>> spreads them pretty evenly.
>
> This could be as simple as HT being enabled on the Dell.  If not, the
> contents of your /proc/interrupts files should help me narrow this down
> for you.

Unfortunately it didn't change anything on the Dell; I have no idea why. It could be as simple as the driver differences between the 5708 and the 5709.
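
For completeness, this is the sort of thing I've been looking at to see whether everything lands on CPU0 (the Dell) or gets spread around (the HP); same placeholder interface name as above:

    # Show how one interface's interrupts are distributed across CPUs by
    # reading the per-CPU columns of /proc/interrupts.  "eth0" is a placeholder.
    IFACE = "eth0"
    with open("/proc/interrupts") as f:
        ncpu = len(f.readline().split())          # header line: CPU0 CPU1 ...
        for line in f:
            fields = line.split()
            if fields and fields[-1].startswith(IFACE):
                counts = [int(x) for x in fields[1:1 + ncpu]]
                print("%-10s %s" % (fields[-1], counts))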

Looking at https://we.riseup.net/riseup+tech/balancing-hardware-interrupts and, more specifically, http://www.alexonlinux.com/msi-x-the-right-way-to-spread-interrupt-load, it looks like the queues enabled on the 5709 (which is in the Dell) would let me manually balance the queues across multiple cores without problems. I'd been under the impression that MSI-X was what was to blame for the HP spreading things around, but I see that's not the case.
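
The manual balancing, roughly, looks like this; a sketch only, where the IRQ numbers are placeholders for whatever the 5709's MSI-X vectors turn out to be, and you'd want to map queues onto real cores rather than HT siblings:

    # Pin each queue IRQ to its own CPU by writing a one-bit hex mask to
    # smp_affinity.  Needs root.  QUEUE_IRQS is a placeholder list; with HT
    # enabled, pick CPU numbers that correspond to distinct physical cores.
    QUEUE_IRQS = [45, 46, 47, 48]
    for i, irq in enumerate(QUEUE_IRQS):
        mask = 1 << i                  # CPU i only
        with open("/proc/irq/%d/smp_affinity" % irq, "w") as f:
            f.write("%x" % mask)
        print("IRQ %d -> CPU %d (mask %x)" % (irq, i, mask))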

So far, after just under a day that included a typical peak load, this looks rather successful: I hit normal traffic patterns without dropping any outbound packets.
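
(For the record, "without dropping" just means watching the TX drop counter; a minimal check along these lines, interface name again a placeholder:)

    # Print the TX "drop" counter for one interface from /proc/net/dev.
    # The layout is 8 RX fields followed by 8 TX fields, so TX drop is the
    # 12th value after the interface name.  "eth0" is a placeholder.
    IFACE = "eth0"
    with open("/proc/net/dev") as f:
        for line in f:
            if line.strip().startswith(IFACE + ":"):
                vals = line.split(":", 1)[1].split()
                print("%s tx_dropped = %s" % (IFACE, vals[11]))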


> For future reference, kernel scheduler problems such as this should be
> posted on LKML, not a distro list, no matter which distro you use.
> There are very few people on debian-user or any of the distro general
> help lists with significant knowledge of the kernel, let alone the
> scheduler.  You typically get help with this kind of thing much faster,
> and with more thorough knowledge transfer on LKML.

Will do. I'm sorry, but I thought this would be a pretty standard question for anyone operating in a production environment where 100k pps is typical (at least, that's the kind of load that set this off for me). Either way, I've definitely learned a lot more about this sort of thing, and I have a solution that seems to be working well without any real hocus pocus going on. Thank you for steering me in the right direction.

-Aaron

