[PATCH] sched.c: Be a bit more conservative in SMP



I've often seen the following behaviour on the few Linux SMP boxes
I have access to: one process eats one CPU because it has a big
computation to do, all other CPUs are idle, and the process keeps hopping
from one CPU to another.
This patch is a quick attempt to make this behaviour disappear without
having to bind every process manually with taskset.
I don't know if there is any practical performance increase (although I
believe there is one, locally).

The principle of the patch is simple:
When calculating the load of the "source" CPU (the one the process is on),
subtract one from the number of running processes so that we don't count
the process to be balanced.
As I have only known sched.c for 5 minutes, I added a max(..., 0) to make
sure the load can't be negative if the function happens to be called on a
CPU with only idle tasks. No idea if it can actually happen.

I tested its effectiveness this way:
Before:
- Start a command eating one full CPU on an idle SMP machine.
I used dd if=/dev/urandom of=/dev/null.
- Wait for ~30 seconds, and see that it has switched to another CPU.
After:
- Repeat the same test and see that it does not switch to another CPU (the
patch does what it's meant to do).
- Start a second dd, and bind both to the same CPU with taskset, then free
one of them (allow it to use 2 CPUs, including the one it can already
access) and see that the task gets moved to the second CPU (load balancing
still works).

Disclaimer:
This patch is just the result of a 5-minute hacking rush. Although I think
it technically works, I'm no SMP expert.

--- linux-2.6-2.6.17/kernel/sched.c 2006-06-18 03:49:35.000000000 +0200
+++ linux-2.6-2.6.17-conservative/kernel/sched.c 2006-09-03 13:18:11.000000000 +0200
@@ -952,7 +952,7 @@ void kick_process(task_t *p)
 static inline unsigned long source_load(int cpu, int type)
 {
 	runqueue_t *rq = cpu_rq(cpu);
-	unsigned long load_now = rq->nr_running * SCHED_LOAD_SCALE;
+	unsigned long load_now = (max(rq->nr_running - 1, 0)) * SCHED_LOAD_SCALE;
 	if (type == 0)
 		return load_now;

--
Vincent Pelletier
