Re: Although CONFIG_IRQBALANCE is enabled IRQ's don't seem to be balanced very well



Nauman Tahir wrote:
On 1/10/06, Jesper Juhl <jesper.juhl@xxxxxxxxx> wrote:
On 1/10/06, Martin Bligh <mbligh@xxxxxxxxxx> wrote:
Jesper Juhl wrote:
On 1/10/06, Martin Bligh <mbligh@xxxxxxxxxx> wrote:
Josef Sipek wrote:
On Tue, Jan 10, 2006 at 12:14:42PM +0100, Jesper Juhl wrote:

Do I need any userspace tools in addition to CONFIG_IRQBALANCE?

Last I checked, yes, you do need "irqbalance" (at least that's what the package is called in Debian).

Nope - you need the kernel option turned on OR the userspace daemon, not both.

Ok, good to know.

If you're not generating interrupts at a high enough rate, it won't
rotate. That's deliberate.

From what I have read, the first CPU is preferred for interrupts in order to maximize cache locality. The kernel is probably still optimizing for this even with the CONFIG option enabled.


Hmm, and what would count as "a high enough rate"?

This is what I tested a few months ago:

Test system: 2 dual Pentium3 systems
- with 2.6.11 kernel and kernel IRQ balancing;
- each with an Intel dual port E1000 NIC (e1000 driver 6.0.54);
- both systems connected back-to-back to each other with 2 links.

Test 1:
- I started 1 UDP flow (< 23 Mbps) on the first link with the Iperf network performance measurement tool. For a UDP bandwidth lower than 23 Mbps, the interrupt rate at the receiver interface was lower than 2000 interrupts per second. In this case all interrupts were distributed to CPU 0. 2000 interrupts per second seemed to be the threshold below which all interrupts stayed on 1 CPU.


Test 2:
- Then I started 1 UDP flow of 600 Mbps on the first link. 8000 interrupts per second were generated by the receiver interface. Approximately half of the interrupts were distributed to CPU 0, the other half to CPU 1.


Test 3:
- Then I did a test with 2 UDP flows of 600 Mbps, each over their own link. 8000 interrupts per second were generated by both receiver interfaces. All interrupts generated by the 1st interface were distributed to CPU 0, all interrupts generated by the 2nd interface were distributed to CPU 1.
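For anyone who wants to repeat the measurements above, here is a rough sketch of how the per-IRQ interrupt rate could be computed from two snapshots of /proc/interrupts. It assumes the usual layout (a header row of CPU columns, then one row per IRQ with one counter per CPU). The function names and the sample snapshots are made up for illustration; they are not taken from the tests in this thread.

```python
def irq_totals(text):
    """Map each IRQ label to its count summed over all CPUs."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())              # header row: CPU0 CPU1 ...
    totals = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()[:ncpu]
        # Skip rows that do not carry one numeric counter per CPU.
        if len(fields) == ncpu and all(f.isdigit() for f in fields):
            totals[label.strip()] = sum(int(f) for f in fields)
    return totals


def irq_rates(before, after, seconds):
    """Interrupts per second for each IRQ between two snapshots."""
    a, b = irq_totals(before), irq_totals(after)
    return {irq: (b[irq] - a[irq]) / seconds for irq in b if irq in a}


# Made-up snapshots, assumed to be taken 10 seconds apart.
BEFORE = """\
           CPU0       CPU1
 16:     100000          0   IO-APIC-level  eth0
 17:      50000      50000   IO-APIC-level  eth1
"""
AFTER = """\
           CPU0       CPU1
 16:     110000      10000   IO-APIC-level  eth0
 17:      90000      90000   IO-APIC-level  eth1
"""

if __name__ == "__main__":
    for irq, rate in irq_rates(BEFORE, AFTER, 10.0).items():
        print("IRQ %s: %.0f interrupts/s" % (irq, rate))
```

On a live box the two strings would come from reading /proc/interrupts twice with a sleep in between, instead of the sample text used here.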




I just did a small test with thousands of ping -f's through my NIC while at the same time giving the disk a good workout with tons of find's, sync's & updatedb's - that sure did drive up the number of interrupts and my load average went sky high (amazingly the box was still fairly responsive):

root@dragon:/home/juhl# uptime
22:59:58 up 12:43,  1 user,  load average: 1015.48, 715.93, 429.07
But not a single interrupt was handled by CPU1; they all went to CPU0.
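A quick way to check a claim like this is to look at how each IRQ's counters in /proc/interrupts split across the CPUs. The sketch below parses text in that layout and returns the fraction handled by each CPU. The function name and the sample numbers are hypothetical, and the sample is not output from the machine discussed here.

```python
def per_cpu_share(text):
    """Return {irq_label: [fraction handled by each CPU]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())              # header row: CPU0 CPU1 ...
    shares = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()[:ncpu]
        # Rows such as ERR carry a single counter, so skip them.
        if len(fields) < ncpu or not all(f.isdigit() for f in fields):
            continue
        counts = [int(f) for f in fields]
        total = sum(counts)
        if total:
            shares[label.strip()] = [c / total for c in counts]
    return shares


# Made-up sample in the /proc/interrupts layout.
SAMPLE = """\
           CPU0       CPU1
  0:    1000000          0    IO-APIC-edge  timer
 16:     400000     400000   IO-APIC-level  eth0
 17:          0     800000   IO-APIC-level  eth1
ERR:          0
"""

if __name__ == "__main__":
    for irq, share in per_cpu_share(SAMPLE).items():
        print(irq, ["%.0f%%" % (100 * s) for s in share])
```

A row whose share is 100% on CPU0 for every device would match what is described above; a roughly even split would suggest the balancing has kicked in.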

Do you have a good way to drive up the number of interrupts above the
threshold for balancing?

Is it HT? ISTR it was intelligent enough to ignore that. But you'd have to look at the code to be sure.


Dual Core Athlon 64 X2 4400+



