Re: lmbench lat_mmap slowdown with CONFIG_PARAVIRT



On Tue, Jan 20, 2009 at 01:45:00PM +0100, Ingo Molnar wrote:

* Nick Piggin <npiggin@xxxxxxx> wrote:

On Tue, Jan 20, 2009 at 12:26:34PM +0100, Ingo Molnar wrote:

* Nick Piggin <npiggin@xxxxxxx> wrote:

Hi,

I'm looking at regressions since 2.6.16, and one is lat_mmap has slowed
down. On further investigation, a large part of this is not due to a
_regression_ as such, but the introduction of CONFIG_PARAVIRT=y.

Now, it is true that lat_mmap is basically a microbenchmark, however it
is exercising the memory mapping and page fault handler paths, so we're
talking about pretty important paths here. So I think it should be of
interest.

I've run the tests on a 2s8c AMD Barcelona system, binding the test to
CPU0, and running 100 times (stddev is a bit hard to bring down, and my
scripts needed 100 runs in order to pick up much smaller changes in the
results -- for CONFIG_PARAVIRT, just a couple of runs should show up the
problem).

Times I believe are in nanoseconds for lmbench, anyway lower is better.

non pv AVG=464.22 STD=5.56
paravirt AVG=502.87 STD=7.36

Nearly 10% performance drop here, which is quite a bit... hopefully
people are testing the speed of their PV implementations against non-PV
bare metal :)

Ouch, that looks unacceptably expensive. All the major distros turn
CONFIG_PARAVIRT on. paravirt_ops was introduced in x86 with the express
promise to have no measurable runtime overhead.

( And i suspect the real life mmap cost is probably even more expensive,
as on a Barcelona all of lmbench fits into the cache hence we dont see
any real $cache overhead. )

The PV kernel has over 100K larger text size, nearly 40K alone in mm/ and
kernel/. Definitely we don't see the worst of the icache or branch buffer
overhead on this microbenchmark. (wow, that's a nasty amount of bloat :( )


Jeremy, any ideas where this slowdown comes from and how it could be
fixed?

I had a bit of a poke around the profiles, but nothing stood out.
However oprofile counted 50% more cycles in the kernel with PV than with
non-PV. I'll have to take a look at the user/system times, because 50%
seems ludicrous.... hopefully it's just oprofile noise.

kbuild costs go up a bit (average of 30 builds)
elapsed
non-pv: AVG=53.31s STD=0.99
pv: AVG=53.54s STD=0.94

user
non-pv: AVG=318.63s STD=0.19
pv: AVG=319.33s STD=0.23

system
non-pv: AVG=30.56s STD=0.15
pv: AVG=31.80s STD=0.15

kernel side of the kbuild workload slows down by 4.1%. User time also
increases a bit (probably more cache and branch misses).


If you have a Core2 test-system could you please try tip/master, which
also has your do_page_fault-de-bloating patch applied?

Will try to get one to do some runs on.

Thanks,
Nick
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • Re: lmbench lat_mmap slowdown with CONFIG_PARAVIRT
    ... promise to have no measurable runtime overhead. ... any real $cache overhead. ... The PV kernel has over 100K larger text size, nearly 40K alone in mm/ and ... Performance counter stats for 'mmap-test': ...
    (Linux-Kernel)
  • [benchmark] 1% performance overhead of paravirt_ops on native kernels
    ... Are there plans to analyze and fix this overhead too, ... I cannot cite a single other kernel feature that has so much ... # Core Netfilter Configuration ... # Device Drivers ...
    (Linux-Kernel)
  • Re: Cached memory never gets released
    ... Stock linux 2.4.26 kernel. ... Due to flash bug 3M of memory gets lost due to font memory getting lost ... The output of "free" cache number steadily grows. ... longer to exhaust all of system memory with the cache. ...
    (Linux-Kernel)
  • Re: FreeBSD 5.3 Bridge performance take II
    ... If you do believe that the slab allocator has severe performance ... Until the slab allocator is fixed the system-wide overhead will skew ... memory cache vs a per-thread memory cache. ... performance-critical subsystems and when we do, ...
    (freebsd-current)
  • Re: silent semantic changes with reiser4
    ... |>fixed ammount of space for disk cache is bad, ... kernel interface that then turns around and calls a userland daemon? ... one cache manager, the way there is exactly one VM manager for Linux. ...
    (Linux-Kernel)