
Re: Hyperthreading problem with IRQ handling and scheduling

Sven Groot put forth on 3/4/2011 12:42 AM:

> My question then is twofold. Firstly, why are all interrupts being handled
> by the first CPU? 

You should read this:

> I checked the various /proc/irq/#/smp_affinity entries and
> they are all 0000ffff so that's not the issue. By changing the value in
> those files to a specific CPU I can get the interrupts to be handled by a
> different CPU, but that just moves the problem. No matter what I do, I can't
> get them to be handled by more than one CPU. I've tried running irqbalance
> but that also didn't help. Is there a way to prevent this interrupt CPU
> affinity, and if so would it fix my problem?

You can only divide interrupt processing by assigning each IRQ to a
specific CPU.  You can't divide up the stream of interrupts from a
single IRQ in round-robin fashion.  So if one device IRQ# is generating
all the interrupts, there's not much you can do to fix this situation.
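
To make the "assign IRQs to specific CPUs" part concrete, here is a
minimal Python sketch of building the hex bitmask that
/proc/irq/<N>/smp_affinity expects and writing it.  The function names
are my own and the IRQ numbers in the usage note are hypothetical.
Writing the file needs root, and some IRQs refuse affinity changes.

```python
def cpu_mask(cpus):
    """Return the smp_affinity hex mask for an iterable of CPU numbers.

    Bit N of the mask stands for CPU N. Masks wider than 32 bits are
    written as comma-separated 32-bit groups, most significant first.
    """
    value = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        value |= 1 << cpu
    digits = f"{value:x}"
    groups = []
    while digits:
        groups.append(digits[-8:])   # peel off 8 hex digits (32 bits)
        digits = digits[:-8]
    return ",".join(reversed(groups))


def set_irq_affinity(irq, cpus):
    """Pin one IRQ to the given CPUs (assumes root and a Linux /proc)."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpu_mask(cpus))


if __name__ == "__main__":
    # Only prints masks; nothing is written to /proc here.
    print(cpu_mask([0]))        # CPU0 only
    print(cpu_mask([3]))        # CPU3 only
    print(cpu_mask(range(16)))  # all 16 logical CPUs, as in 0000ffff
```

With two hypothetical IRQs 40 and 41, set_irq_affinity(40, [2]) and
set_irq_affinity(41, [4]) would put each on its own CPU.  This still
moves whole IRQs around; it does not split one IRQ across CPUs.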

What device is generating this massive interrupt load?  Network card or
disk controller?  Note that PCIe NICs often have two interrupts, one for
transmit and one for receive.  I'm not sure about disk/RAID controllers;
it would likely depend on the model.  In the NIC case you can stick
each IRQ# on a different CPU.
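
If you're not sure which IRQ is the busy one, /proc/interrupts has the
per-CPU counts.  A rough Python sketch for ranking the lines by total
count follows.  The helper names are mine, and the parsing assumes the
usual layout (a header row of CPU names, then "label: counts...
description"), so treat it as a starting point rather than a robust
tool.

```python
import os


def parse_interrupts(text):
    """Parse /proc/interrupts text into (label, counts, description) rows."""
    lines = text.strip("\n").splitlines()
    if not lines:
        return []
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    rows = []
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:       # leading numeric fields are counts
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        rows.append((label.strip(), counts, description))
    return rows


def busiest(rows, top=5):
    """Return the rows with the highest total interrupt count."""
    return sorted(rows, key=lambda row: sum(row[1]), reverse=True)[:top]


if __name__ == "__main__":
    path = "/proc/interrupts"
    if os.path.exists(path):
        with open(path) as f:
            for label, counts, description in busiest(parse_interrupts(f.read())):
                print(label, sum(counts), counts, description)
```

The per-CPU list in each row also shows at a glance whether one CPU is
taking everything for that IRQ.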

Some motherboards route IRQ signals from given sets of slots to a given
CPU socket.  Read the documentation for your system board and find out
which slot IRQs are routed to which CPU sockets.  Simply moving a card
to another slot may help significantly if your high IRQ load is due to
multiple cards and not just one.

> Secondly, why does the scheduler not realize that satisfying natural
> affinity is not a good idea if the CPUs involved are logical siblings of
> each other on the same physical CPU? I thought that the Linux kernel was
> hyperthreading-aware and would take these kinds of things into
> consideration. Is this a true shortcoming of the scheduler, or is my system
> misconfigured somehow?

See my previous reply.  And do post about this on lkml.  You'll get more
thorough answers there.
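
In the meantime, it may help to confirm which logical CPUs the kernel
considers hyperthread siblings, so that any IRQs or tasks you pin by
hand land on different physical cores.  A small Python sketch, assuming
the sysfs topology files under /sys/devices/system/cpu are present on
your kernel (the function names are mine):

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0,8" or "0-3,8" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_sibling_lists(base="/sys/devices/system/cpu"):
    """Read every cpuN/topology/thread_siblings_list under base."""
    pattern = os.path.join(base, "cpu[0-9]*", "topology",
                           "thread_siblings_list")
    lists = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lists.append(parse_cpu_list(f.read()))
    return lists


def one_cpu_per_core(sibling_lists):
    """Pick the lowest-numbered logical CPU from each sibling group."""
    return sorted({min(group) for group in sibling_lists if group})


if __name__ == "__main__":
    # Prints one logical CPU per physical core (empty list without sysfs).
    print(one_cpu_per_core(read_sibling_lists()))
```

The CPUs it prints are on distinct cores, so spreading work across them
avoids putting two busy things on siblings of the same core.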

