Re: Analysis of sched_mc_power_savings
- From: Vaidyanathan Srinivasan <svaidy@xxxxxxxxxxxxxxxxxx>
- Date: Wed, 9 Jan 2008 16:43:02 +0530
* Siddha, Suresh B <suresh.b.siddha@xxxxxxxxx> [2008-01-08 13:24:00]:
On Tue, Jan 08, 2008 at 11:08:15PM +0530, Vaidyanathan Srinivasan wrote:
Hi,
The following experiments were conducted on a two-socket, dual-core
Intel processor-based machine in order to understand the impact of
the sched_mc_power_savings scheduler heuristics.
Thanks for these experiments. sched_mc_power_savings is mainly for
reasonably long-running light loads: long enough to be caught by the
periodic load balancer, with fewer tasks than logical CPUs. This tunable
will minimize the number of packages carrying load, enabling the other,
completely idle packages to go to the deepest available P and C states.
Hi Suresh,
Thanks for the explanation. I tried a few more experiments based on
your comments and found that for a long-running, CPU-intensive job the
heuristics did help to get the load onto the same package.
The scheduler heuristics for multi-core systems,
/sys/devices/system/cpu/sched_mc_power_savings, should ideally extend
the CPU tickless idle time on at least a few CPUs in an SMP machine.
Not really. Ideally sched_mc_power_savings redistributes the load (in
the scenarios mentioned above) to different logical CPUs in the system, so
as to minimize the number of busy packages in the system.
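For anyone repeating the experiments, the tunable can be inspected and
toggled with something like the sketch below. The path is the one quoted
in this thread; the helper names read_sched_mc and write_sched_mc are
made up for illustration, and writing to the real file needs root. On a
kernel that does not expose the tunable the read simply returns None.

```python
from pathlib import Path

# Path as quoted in this thread; only present on kernels that expose the tunable.
SCHED_MC = Path("/sys/devices/system/cpu/sched_mc_power_savings")


def read_sched_mc(path=SCHED_MC):
    """Return the current tunable value as an int, or None if the file is absent."""
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        return None


def write_sched_mc(value, path=SCHED_MC):
    """Write a new value to the tunable (needs root on a real system)."""
    Path(path).write_text(f"{value}\n")
```

Calling read_sched_mc() before and after each run makes it easy to log
which setting a given set of idle-time numbers belongs to.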
But the problem is that we will not be able to get longer idle times and
go to lower sleep states on the other packages if we are not able to move
the short bursts of daemon jobs to the busy packages.
We are looking at various means to increase uninterrupted CPU idle
times on at least some of the CPUs in an under-utilised SMP machine.
Experiment 1:...
-------------
Observations with sched_mc_power_savings=1:
* No major impact of sched_mc_power_savings on CPU0 and CPU1
* Significant idle time improvement on CPU2
* However, significant idle time reduction on CPU3
In your setup, are CPU0 and CPU3 the core siblings? If so, this is probably
expected. Previously there was some load distribution happening between
packages. With sched_mc_power_savings, load is now getting distributed
between cores in the same package. But on my systems, typically CPU0 and
CPU2 are core siblings. I will let you confirm your system topology before
we come to conclusions.
No, contrary to your observation, on my machine CPU0 and CPU1 are
siblings on the first package while CPU2 and CPU3 are in the second
package.
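One way to confirm the topology is to group the logical CPUs by the
package id that sysfs reports. This is only a sketch: it assumes the
kernel exports topology/physical_package_id under each cpuN directory,
and the function name cpus_by_package is made up for illustration.

```python
from collections import defaultdict
from pathlib import Path


def cpus_by_package(sysfs_cpu_dir="/sys/devices/system/cpu"):
    """Return {package_id: [cpu numbers]} built from the sysfs topology files."""
    packages = defaultdict(list)
    for cpu_dir in Path(sysfs_cpu_dir).glob("cpu[0-9]*"):
        pkg_file = cpu_dir / "topology" / "physical_package_id"
        if not pkg_file.exists():
            # Skip directories that carry no topology information.
            continue
        cpu = int(cpu_dir.name[3:])
        packages[int(pkg_file.read_text().strip())].append(cpu)
    return {pkg: sorted(cpus) for pkg, cpus in sorted(packages.items())}
```

On the machine described above this would be expected to give
{0: [0, 1], 1: [2, 3]}, while a system where CPU0 and CPU2 are core
siblings would give {0: [0, 2], 1: [1, 3]}.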
Experiment 2:...
-------------
Observations with sched_mc_power_savings=1:
* No major impact of sched_mc_power_savings on CPU0 and CPU1
* Good idle time improvement on CPU2 and CPU3
I would have expected results similar to experiment 1. The CPU3 data
seems almost the same with and without sched_mc_power_savings. At least
the data is not as significantly different as in other cases, such as CPU2.
There is a general improvement in idle time in this experiment
compared to the first experiment. So if sched_mc does not affect
short-running tasks (daemons and other event managers in an idle system),
then the observation in experiment 1 may be just a random
redistribution of tasks over time.
Please review the experiment and comment on how the effectiveness of
sched_mc_power_savings can be analysed.
While we should see either no change or a simple redistribution (to
minimize busy packages) in your experiments, for evaluating
sched_mc_power_savings we can also use a lightly loaded system (for
example, kernel compilation or specjbb with 2 threads on a DP system with
dual-core processors) and see how the load is distributed with and
without MC power savings.
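A rough way to compare the distribution with and without MC power
savings is to take two /proc/stat snapshots around a run and compute the
idle fraction per CPU. The sketch below assumes the usual per-CPU line
layout (user, nice, system, idle, ...); the function names are made up
for illustration, and this is not the tooling used for the numbers in
this thread.

```python
def parse_proc_stat(text):
    """Map 'cpuN' -> (idle_ticks, total_ticks) from /proc/stat contents."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep per-CPU lines ("cpu0", "cpu1", ...); skip the aggregate "cpu" line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user .. steal (at most eight columns); guest columns are left out.
        ticks = [int(v) for v in fields[1:9]]
        result[fields[0]] = (ticks[3], sum(ticks))
    return result


def idle_fraction(before, after):
    """Per-CPU share of ticks spent idle between two parsed snapshots."""
    out = {}
    for cpu, (idle1, total1) in after.items():
        if cpu not in before:
            continue
        idle0, total0 = before[cpu]
        delta = total1 - total0
        out[cpu] = (idle1 - idle0) / delta if delta > 0 else None
    return out
```

Reading /proc/stat once before and once after the workload, with
sched_mc_power_savings at 0 and then at 1, gives two per-CPU idle
profiles that can be compared side by side.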
I have tried running two CPU intensive threads. With sched_mc turned
ON, they were scheduled on CPU0-1 or CPU2-3. With sched_mc turned
OFF, I can see that they were scheduled arbitrarily.
Later I added random usleep() calls to make them consume less CPU; in
that case the scheduling was very random. The two threads with random
sleeps were scheduled on any of the 4 CPUs and the sched_mc parameter
had no effect in consolidating the threads onto one package. Adding
sleeps makes the threads very short-running and not CPU intensive.
This completely validates your explanation of consolidating only 'long
running' jobs.
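A comparable bursty load can be approximated as below. This is a Python
stand-in and not the test program used here: it alternates a short busy
loop with a random sleep and records which CPU the process last ran on,
taken from field 39 ("processor") of /proc/self/stat. The function names
and parameters are made up for illustration, and it assumes a
single-threaded process, so two copies would be run as separate
processes.

```python
import random
import time


def last_cpu_from_stat(stat_text):
    """Extract the 'processor' field (field 39) from /proc/<pid>/stat contents."""
    # The command name (field 2) may contain spaces or parentheses, so split
    # after the last ')'; the remaining fields then start at field 3.
    rest = stat_text.rsplit(")", 1)[1].split()
    return int(rest[39 - 3])


def bursty_worker(iterations, busy_s, max_sleep_s, stat_path="/proc/self/stat"):
    """Alternate short busy loops with random sleeps; log the CPU after each burst."""
    seen = []
    for _ in range(iterations):
        deadline = time.monotonic() + busy_s
        while time.monotonic() < deadline:
            pass  # burn CPU for busy_s seconds
        with open(stat_path) as f:
            seen.append(last_cpu_from_stat(f.read()))
        time.sleep(random.uniform(0, max_sleep_s))
    return seen
```

Setting max_sleep_s to 0 turns this back into a CPU-intensive job, so the
same script can cover both the long-running and the short-burst case.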
How do we take this technique to the next step, where we can
consolidate short-running jobs as well? Did you face any difficulty
biasing the CPU selection for short-running jobs?
Similarly it will be interesting to see how this data varies with and without
tickless.
I am sure tickless has a very significant impact on the idle time.
I will get you more data on this comparison.
For some loads which can't be caught by the periodic load balancer, we may
not see any difference. But at least we should not see any scheduling
anomalies.
True, we did not see any scheduling anomalies yet.
Thanks,
Vaidy