Re: The performance and behaviour of the anti-fragmentation related patches



On Fri, 2 Mar 2007 17:40:04 -0800 William Lee Irwin III <wli@xxxxxxxxxxxxxx> wrote:
>> My gut feeling is to agree, but I get nagging doubts when I try to
>> think of how to boil things like [major benchmarks whose names are
>> trademarked/copyrighted/etc. censored] down to simple testcases. Some
>> other things are obvious but require vast resources, like zillions of
>> disks fooling throttling/etc. heuristics of ancient downrev kernels.

On Fri, Mar 02, 2007 at 05:58:56PM -0800, Andrew Morton wrote:
> noooooooooo. You're approaching it from the wrong direction.
> Step 1 is to understand what is happening on the affected production
> system. Completely. Once that is fully understood then it is a relatively
> simple matter to concoct a test case which triggers the same failure mode.
> It is very hard to go the other way: to poke around with various stress
> tests which you think are doing something similar to what you think the
> application does in the hope that similar symptoms will trigger so you can
> then work out what the kernel is doing. yuk.

Yeah, it's really great when it's possible to get debug info out of
people e.g. they're willing to boot into a kernel instrumented with
the appropriate printk's/etc. Most of the time it's all guesswork.
People who post to lkml are much better about all this on average.

I never truly understood the point of kprobes/jprobes/dprobes (or
whatever the probing letter is), crash dumps, and so on until I ran
into this, not that I personally use them (though I may yet start).
Most of the time I just read the code instead and smoke out what
could be going on by something like the process of devising
counterexamples. For instance, I told that colouroff patch guy about
the possibility of getting the wrong page for the start of the buffer
from virt_to_page() on a cache colored buffer pointer (clearly
cache->gfporder >= 4 in such a case). Deriving the head page without
__GFP_COMP might be considered to be ugly-looking, though.
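
A rough userspace sketch of the arithmetic behind that virt_to_page()
concern, not kernel code. It assumes 4 KiB pages, a slab block that is
naturally aligned to its own size (PAGE_SIZE << gfporder), and a colour
offset larger than one page. The SKETCH_* names and both helper
functions are made up for illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants only; real values depend on arch and config. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Page frame index that an address falls in. This models the page that
 * a virt_to_page()-style lookup would resolve the pointer to.
 */
static uintptr_t sketch_pfn_of(uintptr_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/*
 * Recover the first page frame of a multi-page block by masking the
 * address down to the block size. Only valid under the assumption that
 * the block is naturally aligned to (PAGE_SIZE << order).
 */
static uintptr_t sketch_head_pfn(uintptr_t addr, unsigned int order)
{
	uintptr_t block_bytes;

	assert(order < 20);	/* keep the shift well inside uintptr_t */
	block_bytes = SKETCH_PAGE_SIZE << order;
	return (addr & ~(block_bytes - 1)) >> SKETCH_PAGE_SHIFT;
}
```

For example, with a block at 0x100000, gfporder 4 and a hypothetical
colour offset of 0x1400, the buffer pointer 0x101400 lands in page
frame 0x101, while the masking helper gives frame 0x100, the start of
the block. The masking is the "ugly-looking" part: it leans on the
alignment assumption instead of on compound-page metadata.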


-- wli


