[tip:sched/core] sched: Improve latencies and throughput



Commit-ID: 0ec9fab3d186d9cbb00c0f694d4a260d07c198d9
Gitweb: http://git.kernel.org/tip/0ec9fab3d186d9cbb00c0f694d4a260d07c198d9
Author: Mike Galbraith <efault@xxxxxx>
AuthorDate: Tue, 15 Sep 2009 15:07:03 +0200
Committer: Ingo Molnar <mingo@xxxxxxx>
CommitDate: Tue, 15 Sep 2009 16:51:16 +0200

sched: Improve latencies and throughput

Make the idle balancer more agressive, to improve a
x264 encoding workload provided by Jason Garrett-Glaser:

NEXT_BUDDY NO_LB_BIAS
encoded 600 frames, 252.82 fps, 22096.60 kb/s
encoded 600 frames, 250.69 fps, 22096.60 kb/s
encoded 600 frames, 245.76 fps, 22096.60 kb/s

NO_NEXT_BUDDY LB_BIAS
encoded 600 frames, 344.44 fps, 22096.60 kb/s
encoded 600 frames, 346.66 fps, 22096.60 kb/s
encoded 600 frames, 352.59 fps, 22096.60 kb/s

NO_NEXT_BUDDY NO_LB_BIAS
encoded 600 frames, 425.75 fps, 22096.60 kb/s
encoded 600 frames, 425.45 fps, 22096.60 kb/s
encoded 600 frames, 422.49 fps, 22096.60 kb/s

Peter pointed out that this is better done via newidle_idx,
not via LB_BIAS, newidle balancing should look for where
there is load _now_, not where there was load 2 ticks ago.

Worst-case latencies are improved as well as no buddies
means less vruntime spread. (as per prior lkml discussions)

This change improves kbuild-peak parallelism as well.

Reported-by: Jason Garrett-Glaser <darkshikari@xxxxxxxxx>
Signed-off-by: Mike Galbraith <efault@xxxxxx>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
LKML-Reference: <1253011667.9128.16.camel@xxxxxxxxxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>


---
arch/ia64/include/asm/topology.h | 5 +++--
arch/powerpc/include/asm/topology.h | 2 +-
arch/sh/include/asm/topology.h | 3 ++-
arch/x86/include/asm/topology.h | 4 +---
include/linux/topology.h | 2 +-
kernel/sched_features.h | 2 +-
6 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/ia64/include/asm/topology.h b/arch/ia64/include/asm/topology.h
index 47f3c51..42f1673 100644
--- a/arch/ia64/include/asm/topology.h
+++ b/arch/ia64/include/asm/topology.h
@@ -61,7 +61,7 @@ void build_cpu_to_node_map(void);
.cache_nice_tries = 2, \
.busy_idx = 2, \
.idle_idx = 1, \
- .newidle_idx = 2, \
+ .newidle_idx = 0, \
.wake_idx = 0, \
.forkexec_idx = 1, \
.flags = SD_LOAD_BALANCE \
@@ -87,10 +87,11 @@ void build_cpu_to_node_map(void);
.cache_nice_tries = 2, \
.busy_idx = 3, \
.idle_idx = 2, \
- .newidle_idx = 2, \
+ .newidle_idx = 0, \
.wake_idx = 0, \
.forkexec_idx = 1, \
.flags = SD_LOAD_BALANCE \
+ | SD_BALANCE_NEWIDLE \
| SD_BALANCE_EXEC \
| SD_BALANCE_FORK \
| SD_BALANCE_WAKE \
diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index a6b220a..1a2c9eb 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -57,7 +57,7 @@ static inline int pcibus_to_node(struct pci_bus *bus)
.cache_nice_tries = 1, \
.busy_idx = 3, \
.idle_idx = 1, \
- .newidle_idx = 2, \
+ .newidle_idx = 0, \
.wake_idx = 0, \
.flags = SD_LOAD_BALANCE \
| SD_BALANCE_EXEC \
diff --git a/arch/sh/include/asm/topology.h b/arch/sh/include/asm/topology.h
index 9054e5c..c843677 100644
--- a/arch/sh/include/asm/topology.h
+++ b/arch/sh/include/asm/topology.h
@@ -15,13 +15,14 @@
.cache_nice_tries = 2, \
.busy_idx = 3, \
.idle_idx = 2, \
- .newidle_idx = 2, \
+ .newidle_idx = 0, \
.wake_idx = 0, \
.forkexec_idx = 1, \
.flags = SD_LOAD_BALANCE \
| SD_BALANCE_FORK \
| SD_BALANCE_EXEC \
| SD_BALANCE_WAKE \
+ | SD_BALANCE_NEWIDLE \
| SD_SERIALIZE, \
.last_balance = jiffies, \
.balance_interval = 1, \
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 4b1b335..7fafd1b 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -116,14 +116,12 @@ extern unsigned long node_remap_size[];

# define SD_CACHE_NICE_TRIES 1
# define SD_IDLE_IDX 1
-# define SD_NEWIDLE_IDX 2
# define SD_FORKEXEC_IDX 0

#else

# define SD_CACHE_NICE_TRIES 2
# define SD_IDLE_IDX 2
-# define SD_NEWIDLE_IDX 2
# define SD_FORKEXEC_IDX 1

#endif
@@ -137,7 +135,7 @@ extern unsigned long node_remap_size[];
.cache_nice_tries = SD_CACHE_NICE_TRIES, \
.busy_idx = 3, \
.idle_idx = SD_IDLE_IDX, \
- .newidle_idx = SD_NEWIDLE_IDX, \
+ .newidle_idx = 0, \
.wake_idx = 0, \
.forkexec_idx = SD_FORKEXEC_IDX, \
\
diff --git a/include/linux/topology.h b/include/linux/topology.h
index c87edcd..4298745 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -151,7 +151,7 @@ int arch_update_cpu_topology(void);
.cache_nice_tries = 1, \
.busy_idx = 2, \
.idle_idx = 1, \
- .newidle_idx = 2, \
+ .newidle_idx = 0, \
.wake_idx = 0, \
.forkexec_idx = 1, \
\
diff --git a/kernel/sched_features.h b/kernel/sched_features.h
index 891ea0f..e98c2e8 100644
--- a/kernel/sched_features.h
+++ b/kernel/sched_features.h
@@ -67,7 +67,7 @@ SCHED_FEAT(AFFINE_WAKEUPS, 1)
* wakeup-preemption), since its likely going to consume data we
* touched, increases cache locality.
*/
-SCHED_FEAT(NEXT_BUDDY, 1)
+SCHED_FEAT(NEXT_BUDDY, 0)

/*
* Prefer to schedule the task that ran last (when we did
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • Re: newidle balancing in NUMA domain?
    ... I would say please keep domain balancing behaviour at least ... somewhat close to how it was with Oscheduler at least until CFS is ... encoded 600 frames, 252.82 fps, 22096.60 kb/s ...
    (Linux-Kernel)
  • Re: newidle balancing in NUMA domain?
    ... I would say please keep domain balancing behaviour at least ... somewhat close to how it was with Oscheduler at least until CFS is ... encoded 600 frames, 252.82 fps, 22096.60 kb/s ...
    (Linux-Kernel)
  • Re: 1080p30 vs 1080p60
    ... significantdifference between 30 or 60 frames per second? ... we assume that the source was recorded at 60 fps, ... prevalent video standard. ... Movies have long been shot at 24 fps, ...
    (alt.tv.tech.hdtv)
  • Re: apple TV deemed an iFlop
    ... case it certainly does support 1080i through the DVI. ... it might "support" it but the refresh rate of shared video is painful ... I thought that the "p" automatically means 60 fps and the ... frames per second and 1080i is virtually always 60 fields per second. ...
    (comp.sys.mac.advocacy)

Loading