
Handling irqbalance in virtual environments


It turns out we have problems with irqbalance again.

It was added to the Recommends of the main image in 3.16, because it was
reported that older kernels move all interrupts to CPU 0 without help.[1]

In the meantime the kernel can do the balancing on its own.  In 4.9, I've
seen it working with aacraid: each queue gets hard pinned to its own
CPU from 0 to $NRCPUS.  In 4.19 I've seen the same working properly with

With 4.19, even on real hardware, where interrupts have an affinity for
all CPUs, each interrupt is actually delivered to a different CPU.
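
To compare the allowed affinity of an interrupt with the CPU it is
actually routed to, something like the following should do.  This is only
a rough sketch: the helper names are made up for illustration, and it
assumes the kernel exposes effective_affinity_list under /proc/irq/<n>/
(not every kernel or architecture does, in which case None is returned).

```python
from pathlib import Path


def parse_cpu_list(text):
    """Turn a kernel CPU list such as "0-3,8" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A single CPU ("4") has no upper bound, so reuse the lower one.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def irq_affinity(irq, proc=Path("/proc/irq")):
    """Return (allowed CPUs, effective CPUs or None) for one IRQ number."""
    base = proc / str(irq)
    allowed = parse_cpu_list((base / "smp_affinity_list").read_text())
    eff_file = base / "effective_affinity_list"
    # Assumption: the file is only present on kernels that track the
    # effective mask; fall back to None instead of guessing.
    effective = parse_cpu_list(eff_file.read_text()) if eff_file.exists() else None
    return allowed, effective
```

Calling irq_affinity(26) on a machine like the one described here would be
expected to return all CPUs as allowed and a single CPU as effective.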

A random example of this; it even selects only one thread of each core:

|  26:    0    0    0    0   92    0    0    0  IR-PCI-MSI 3670017-edge      eno1-TxRx-0
|  27:    0    0    0    0    0  167    0    0  IR-PCI-MSI 3670018-edge      eno1-TxRx-1
|  28:    0    0    0    0    0    0  467    0  IR-PCI-MSI 3670019-edge      eno1-TxRx-2
|  29:    0    0    0    0    0    0    0  454  IR-PCI-MSI 3670020-edge      eno1-TxRx-3
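
If you want to check the same thing on other machines, the per-CPU
counters can be pulled out of /proc/interrupts with a few lines of Python.
A minimal sketch, assuming the usual layout (a header line with one CPUn
column per CPU, then one row per interrupt); the function names are mine
and not part of any tool:

```python
def parse_interrupts(text):
    """Parse /proc/interrupts content into {irq: {"counts": [...], "name": str}}."""
    lines = text.strip("\n").splitlines()
    ncpus = len(lines[0].split())  # header line: CPU0 CPU1 ...
    rows = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = fields[:ncpus]
        # Rows such as ERR or MIS carry a single counter; skip them.
        if len(counts) < ncpus or not all(c.isdigit() for c in counts):
            continue
        rows[label.strip()] = {
            "counts": [int(c) for c in counts],
            "name": " ".join(fields[ncpus:]),
        }
    return rows


def delivery_cpus(counts):
    """Return the CPUs that have handled this interrupt at least once."""
    return [cpu for cpu, n in enumerate(counts) if n > 0]
```

Feeding it the content of /proc/interrupts and calling delivery_cpus() on
each row gives exactly one CPU per queue for output like the above; once
something has moved the interrupts around, several CPUs show up per row.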

Now irqbalance comes along and redoes the existing pinning, and the result
is no longer correct but $RANDOM for the hard queue-to-CPU case of virtio.

At least Google considers the work irqbalance does to "correct" the existing
balancing a large problem.

I'm not sure how to go forward.  I have a workaround pending for our
cloud images that hard-excludes the installation of irqbalance.[2]


[1]: https://bugs.debian.org/577788
[2]: https://salsa.debian.org/cloud-team/debian-cloud-images/merge_requests/81
Youth doesn't excuse everything.
		-- Dr. Janice Lester (in Kirk's body), "Turnabout Intruder",
		   stardate 5928.5.