RE: [PATCH RFC] smt nice introduces significant lock contention



Con Kolivas wrote on Thursday, June 01, 2006 6:59 PM
On Friday 02 June 2006 09:57, Chen, Kenneth W wrote:
Chris Mason wrote on Thursday, June 01, 2006 3:56 PM

Hello everyone,

Recent benchmarks showed some performance regressions between 2.6.16 and
2.6.5. We tracked down one of the regressions to lock contention in
schedule heavy workloads (~70,000 context switches per second)

kernel/sched.c:dependent_sleeper() was responsible for most of the lock
contention, hammering on the run queue locks. The patch below is more of
a discussion point than a suggested fix (although it does reduce lock
contention significantly). The dependent_sleeper code looks very
expensive to me, especially for using a spinlock to bounce control
between two different siblings in the same cpu.

Yeah, this also sort of echo a recent discovery on one of our benchmarks
that schedule() is red hot in the kernel. I was just scratching my head
not sure what's going on. This dependent_sleeper could be the culprit
considering it is called almost at every context switch. I don't think
we are hitting on lock contention, but by the large amount of code it
executes. It really needs to be cut down ....

Thanks for the suggestion. How about something like this which takes your
idea and further expands on it. Compiled, boot and runtime tests ok.

/*
+ * Try to lock all the siblings of a cpu that is already locked. As we're
+ * only doing trylock the order is not important.
+ */
+static int trylock_smt_siblings(cpumask_t sibling_map)
+{
+ cpumask_t locked_siblings;
+ int i;
+
+ cpus_clear(locked_siblings);
+ for_each_cpu_mask(i, sibling_map) {
+ if (!spin_trylock(&cpu_rq(i)->lock))
+ break;
+ cpu_set(i, locked_siblings);
+
+ }
+
+ /* Did we lock all the siblings? */
+ if (cpus_equal(sibling_map, locked_siblings))
+ return 1;
+
+ for_each_cpu_mask(i, locked_siblings)
+ spin_unlock(&cpu_rq(i)->lock);
+ return 0;
+}


I like Chris's version of trylock_smt_siblings(). Why create all that
local variables? sibling_map is passed by value, so the whole thing is
duplicated on the stack (I think it should be pass by pointer), then
there is another locked_siblings mask declared followed by a bitmask
compare. The big iron people who set CONFIG_NR_CPUS=1024 won't be
amused because of all that bitmask copying.

Let me hack up something too, so we can compare notes etc.

- Ken
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • RE: [PATCH RFC] smt nice introduces significant lock contention
    ... Recent benchmarks showed some performance regressions between 2.6.16 and ... We tracked down one of the regressions to lock contention in schedule ... this also sort of echo a recent discovery on one of our benchmarks ...
    (Linux-Kernel)
  • Re: [PATCH RFC] smt nice introduces significant lock contention
    ... Recent benchmarks showed some performance regressions between 2.6.16 ... We tracked down one of the regressions to lock contention ... this also sort of echo a recent discovery on one of our benchmarks ... that schedule() is red hot in the kernel. ...
    (Linux-Kernel)
  • Re: [PATCH RFC] smt nice introduces significant lock contention
    ... Recent benchmarks showed some performance regressions between 2.6.16 ... We tracked down one of the regressions to lock contention ... you have the hack lock. ...
    (Linux-Kernel)
  • Re: [PATCH RFC] smt nice introduces significant lock contention
    ... kernel/sched.c:dependent_sleeperwas responsible for most of the lock ... we are hitting on lock contention, but by the large amount of code it ... dependent_sleeper by only doing trylock on the smt sibling runqueues. ... * Try to lock all the siblings of a cpu that is already locked. ...
    (Linux-Kernel)
  • [ckpatch][2/29] sched-revise_smt_nice_locking
    ... Initial report and lock contention fix from Chris Mason: ... convert wake_sleeping_dependent to use trylock ... * Called with interrupt disabled and this_rq's runqueue locked. ...
    (Linux-Kernel)

Loading