[PATCH 0/3] sched: Extend sched_mc/smt_power_savings framework.



Hi,

The following patch series extends the existing
sched_smt/mc_power_savings framework to work on platforms that have
an on-chip memory controller, which makes each CPU package a 'node'.

On machines with an on-chip memory controller, each physical CPU
package forms a NUMA node, and the CPU-level sched_domain has only one
group. This prevents any form of power-savings balancing across these
nodes. Making the sched_mc/smt_power_savings tunables work as designed
on these single-CPU-package NUMA node machines will help task
consolidation and save power, as it already does on other multi-core,
multi-socket platforms.
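
To illustrate why a domain with a single group leaves nothing to
balance, here is a minimal toy sketch. It is not kernel code: the
toy_group/toy_domain types and toy_find_group_to_vacate() are
hypothetical names made up for this example, and it assumes each CPU
can absorb exactly one task.

```c
/* Hypothetical toy types for illustration; not the kernel's structures. */
struct toy_group {
    int nr_cpus;     /* CPUs in this group */
    int nr_running;  /* runnable tasks currently on this group */
};

struct toy_domain {
    const struct toy_group *groups;
    int nr_groups;
};

/*
 * Return the index of a group whose tasks could all be moved into
 * another group of the same domain, or -1 if no such group exists.
 */
int toy_find_group_to_vacate(const struct toy_domain *sd)
{
    int src = -1;

    /* With fewer than two groups there is nowhere to consolidate to. */
    if (sd->nr_groups < 2)
        return -1;

    /* Pick the least loaded group that still has tasks running. */
    for (int i = 0; i < sd->nr_groups; i++) {
        int n = sd->groups[i].nr_running;

        if (n > 0 && (src < 0 || n < sd->groups[src].nr_running))
            src = i;
    }
    if (src < 0)
        return -1;

    /* Look for another group with enough idle CPUs to absorb them. */
    for (int i = 0; i < sd->nr_groups; i++) {
        const struct toy_group *g = &sd->groups[i];

        if (i != src &&
            g->nr_cpus - g->nr_running >= sd->groups[src].nr_running)
            return src;
    }
    return -1;
}
```

In this model, a domain whose only group is the whole package always
yields -1, whatever the load, while a domain spanning two packages can
nominate one of them to be emptied.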

Consolidation across nodes has implications for cross-node memory
access and other NUMA locality issues. Even under such constraints
there could be scope for trading performance for power savings, and
hence making sched_mc/smt_power_savings work as expected on these
platforms is justified.

sched_mc/smt_power_savings is still a tunable, and the power savings
and performance impact will vary depending on the workload, the system
topology and the hardware features.

The patch series has been tested on a 2-socket, quad-core, dual-threaded
box with kernbench as the workload, varying the number of threads.

The following results show the average of 3 iterations of each run.

|-------------------------------------------------------------------------|
| testname                          | avg-time elapsed | % power consumed |
|-------------------------------------------------------------------------|
| pm_kernbench.smt-0-mc-0_threads=4 |           95.28s |           100.00 |
| pm_kernbench.smt-0-mc-1_threads=4 |           98.06s |           100.98 |
| pm_kernbench.smt-0-mc-2_threads=4 |           99.14s |           101.98 |
| pm_kernbench.smt-1-mc-1_threads=4 |          137.62s |            92.68 |
| pm_kernbench.smt-1-mc-2_threads=4 |          142.75s |            91.89 |
| pm_kernbench.smt-2-mc-2_threads=4 |          142.63s |            92.30 |
|-------------------------------------------------------------------------|
| pm_kernbench.smt-0-mc-0_threads=6 |           66.25s |           100.00 |
| pm_kernbench.smt-0-mc-1_threads=6 |           71.18s |            99.25 |
| pm_kernbench.smt-0-mc-2_threads=6 |           69.43s |           100.12 |
| pm_kernbench.smt-1-mc-1_threads=6 |           96.46s |            91.40 |
| pm_kernbench.smt-1-mc-2_threads=6 |           99.51s |            90.49 |
| pm_kernbench.smt-2-mc-2_threads=6 |           99.35s |            89.94 |
|-------------------------------------------------------------------------|
| pm_kernbench.smt-0-mc-0_threads=7 |           58.20s |           100.00 |
| pm_kernbench.smt-0-mc-1_threads=7 |           62.59s |            98.12 |
| pm_kernbench.smt-0-mc-2_threads=7 |           60.73s |            99.17 |
| pm_kernbench.smt-1-mc-1_threads=7 |           83.70s |            90.47 |
| pm_kernbench.smt-1-mc-2_threads=7 |           83.31s |            88.98 |
| pm_kernbench.smt-2-mc-2_threads=7 |           83.69s |            89.51 |
|-------------------------------------------------------------------------|
| pm_kernbench.smt-0-mc-0_threads=8 |           54.08s |           100.00 |
| pm_kernbench.smt-0-mc-1_threads=8 |           57.98s |            97.65 |
| pm_kernbench.smt-0-mc-2_threads=8 |           55.79s |            99.28 |
| pm_kernbench.smt-1-mc-1_threads=8 |           74.31s |            90.39 |
| pm_kernbench.smt-1-mc-2_threads=8 |           76.03s |            89.88 |
| pm_kernbench.smt-2-mc-2_threads=8 |           76.59s |            90.14 |
|-------------------------------------------------------------------------|
| pm_kernbench.smt-0-mc-0_threads=9 |           51.67s |           100.00 |
| pm_kernbench.smt-0-mc-1_threads=9 |           54.64s |            97.38 |
| pm_kernbench.smt-0-mc-2_threads=9 |           52.78s |            98.81 |
| pm_kernbench.smt-1-mc-1_threads=9 |           65.91s |            91.33 |
| pm_kernbench.smt-1-mc-2_threads=9 |           66.93s |            91.36 |
| pm_kernbench.smt-2-mc-2_threads=9 |           67.18s |            90.99 |
|-------------------------------------------------------------------------|
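
To weigh the two result columns against each other, a small sketch
like the following may help. It is not part of the patch series. The
helper names are hypothetical, and relative_energy() ASSUMES that
"% power consumed" is average power relative to the smt-0-mc-0 run of
the same thread count, which the numbers above do not state. If that
column is already total energy, only slowdown_pct() applies.

```c
/* Percentage increase in elapsed time relative to the baseline run. */
double slowdown_pct(double base_seconds, double seconds)
{
    return (seconds / base_seconds - 1.0) * 100.0;
}

/*
 * Energy relative to the baseline run (1.0 means the same energy),
 * ASSUMING power_pct is average power as a percentage of the baseline.
 */
double relative_energy(double base_seconds, double seconds, double power_pct)
{
    return (seconds / base_seconds) * (power_pct / 100.0);
}
```

For the threads=4 rows, slowdown_pct(95.28, 137.62) comes to roughly
44%, so under the stated assumption the smt-1-mc-1 run draws less
power but takes long enough that it would use more energy overall.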


Thoughts on this approach?

---

Gautham R Shenoy (3):
  sched: Fix sd_parent_degenerate for SD_POWERSAVINGS_BALANCE.
  sched: Fix the wakeup nomination for sched_mc/smt_power_savings.
  sched: code cleanup - sd_power_saving_flags(), sd_balance_for_mc/package_power()


include/linux/sched.h | 47 ++++++++++++++--------------------------
include/linux/topology.h | 6 ++---
kernel/sched.c | 54 +++++++++++++++++++++++++++++++++++++++++++---
kernel/sched_fair.c | 2 +-
4 files changed, 71 insertions(+), 38 deletions(-)

--
Thanks and Regards
gautham.


