Re: Default cache_hot_time value back to 10ms

From: Andrew Theurer (habanero_at_us.ibm.com)
Date: 10/07/04

    To: linux-kernel@vger.kernel.org
    Date:	Thu, 7 Oct 2004 10:58:53 -0500
    
    
    

    > OK... Well Andrew as I said I'd be happy for this to go in. I'd be *extra*
    > happy if Judith ran a few of those dbt thingy tests which had been
    > sensitive to idle time. Can you ask her about that? or should I?
    >
    > > As a side note, I'd like to get involved on future scheduler tuning
    > > experiments, we have fair amount of benchmark environments where we can
    > > validate things across various kind of workload, i.e., db, java, cpu,
    > > etc. Thanks.
    >
    > That would be very welcome indeed. We have a big backlog of scheduler
    > things to go in after 2.6.9 is released (although not many of them change
    > the runtime behaviour IIRC). After that, I have some experimental
    > performance work that could use wider testing. After *that*, the
    > multiprocessor scheduler will be in a state where 2.6 shouldn't need much more
    > work, so we can concentrate on just tuning the dials.

    I'd like to add some comments as well:

    1) We are seeing similar problems with that "well known" DB transaction
    benchmark, as well as another well known benchmark measuring multi-tier J2EE
    server performance. Both problems are with load balancing. It's not quite
    the same situation. We have too much idle time and not enough throughput.
    Making idle balancing more aggressive has helped there. The 3 areas we have
    changed are:

    wake_idle() - find the first idle cpu, starting with cpu->sd and moving up the
    sd's as needed. Modify SD_NODE_INIT.flags and SD_CPU_INIT.flags to include
    SD_WAKE_IDLE. Now, if there is an idle cpu (and task->cpu is busy), we move
    the task to the closest idle cpu.
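A minimal user-space sketch of the wake_idle() search described above, not the kernel code. It assumes a toy domain hierarchy in which each level carries a cpu bitmask and a parent pointer; `toy_domain`, `toy_wake_idle()` and `TOY_NR_CPUS` are hypothetical names invented for illustration, and the `wake_idle` field only models the effect of SD_WAKE_IDLE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 8

/* Hypothetical stand-in for a sched domain: a cpu span plus a parent link. */
struct toy_domain {
    unsigned int span;                /* bitmask of cpus covered by this level */
    bool wake_idle;                   /* models the SD_WAKE_IDLE flag */
    const struct toy_domain *parent;  /* next wider level, NULL at the top */
};

/*
 * Pick the cpu a waking task should run on: its own cpu if that is idle,
 * otherwise the first idle cpu in the narrowest domain that has one,
 * otherwise its own cpu.
 */
int toy_wake_idle(int cpu, const struct toy_domain *sd, unsigned int idle_mask)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);

    if (idle_mask & (1u << cpu))
        return cpu;

    for (; sd != NULL; sd = sd->parent) {
        unsigned int candidates;

        if (!sd->wake_idle)
            continue;               /* this level does not allow wake-idle moves */

        candidates = sd->span & idle_mask;
        for (int i = 0; i < TOY_NR_CPUS; i++)
            if (candidates & (1u << i))
                return i;           /* closest level with an idle cpu wins */
    }
    return cpu;                     /* nothing idle anywhere: stay put */
}
```

With a two-level hierarchy (cpus 0-3 under one physical domain, cpus 0-7 under the node), a task waking on busy cpu 1 goes to cpu 2 if that is idle, and only crosses to cpu 7 when nothing in 0-3 is idle.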

    can_migrate() - put back (again) the aggressive idle condition in
    can_migrate(). Do not look at task_hot when we have an idle cpu.
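A rough sketch of what such an "aggressive idle" check could look like, written as a stand-alone model rather than the actual kernel function. The struct layout, the `toy_` names and the nanosecond parameters are assumptions made for illustration; the hotness test is simply "ran less than cache_hot_time ago", as described later in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified task state. */
struct toy_task {
    bool running;                    /* currently executing on its cpu */
    unsigned int cpus_allowed;       /* affinity bitmask */
    unsigned long long last_ran_ns;  /* when the task last ran */
};

/* Model of the hotness test: the task ran less than cache_hot_time ago. */
bool toy_task_hot(const struct toy_task *t, unsigned long long now_ns,
                  unsigned long long cache_hot_time_ns)
{
    return now_ns - t->last_ran_ns < cache_hot_time_ns;
}

/* Decide whether the task may be pulled to dst_cpu. */
bool toy_can_migrate(const struct toy_task *t, int dst_cpu, bool dst_idle,
                     unsigned long long now_ns,
                     unsigned long long cache_hot_time_ns)
{
    assert(dst_cpu >= 0 && dst_cpu < 32);

    if (t->running)
        return false;                /* never move a running task */
    if (!(t->cpus_allowed & (1u << dst_cpu)))
        return false;                /* affinity forbids the move */
    if (dst_idle)
        return true;                 /* aggressive idle: skip the hotness test */
    return !toy_task_hot(t, now_ns, cache_hot_time_ns);
}
```

The only difference between the idle and busy paths is the early return on `dst_idle`; a busy destination still has to wait for the task to go cache cold.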

    idle_balance() / SD_NODE_INIT - add SD_BALANCE_NEWIDLE to SD_NODE_INIT.flags
    so a newly idle cpu can try to balance from an appropriate cpu, first a
    cpu close to it, then farther out.
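To illustrate the "close first, then farther out" order, here is a small model of how a newly idle cpu might choose a cpu to pull from. It is a sketch under stated assumptions, not the kernel's idle_balance(): `toy_sd`, `toy_idle_balance_source()` and the per-cpu `nr_running` array are hypothetical, and the `balance_newidle` field only models the effect of SD_BALANCE_NEWIDLE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 8

/* Hypothetical stand-in for one level of the domain hierarchy. */
struct toy_sd {
    unsigned int span;            /* bitmask of cpus covered by this level */
    bool balance_newidle;         /* models the SD_BALANCE_NEWIDLE flag */
    const struct toy_sd *parent;  /* next wider level, NULL at the top */
};

/*
 * For a cpu that has just gone idle, return the cpu to pull work from,
 * or -1 if no permitted level has a cpu with more than one runnable task.
 * Narrow levels are tried before wide ones.
 */
int toy_idle_balance_source(int this_cpu, const struct toy_sd *sd,
                            const unsigned int nr_running[TOY_NR_CPUS])
{
    assert(this_cpu >= 0 && this_cpu < TOY_NR_CPUS);

    for (; sd != NULL; sd = sd->parent) {
        int busiest = -1;
        unsigned int max_load = 1;  /* need at least 2 tasks to spare one */

        if (!sd->balance_newidle)
            continue;               /* newidle balancing disabled at this level */

        for (int i = 0; i < TOY_NR_CPUS; i++) {
            if (i == this_cpu || !(sd->span & (1u << i)))
                continue;
            if (nr_running[i] > max_load) {
                max_load = nr_running[i];
                busiest = i;
            }
        }
        if (busiest >= 0)
            return busiest;
    }
    return -1;
}
```

In this model, leaving the flag off at the node level means an idle cpu never looks beyond its own physical domain, which is the idle time the change is meant to remove.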

    (the above changes IMO could also pave the way for removing timer based -idle-
    balances)

    IMO, idle cpus should not play by the exact same rules as busy ones
    when load balancing. I am not saying the only answer is not looking at cache
    warmth at all, but maybe a much more relaxed policy.

    Also, finding (at boot time) the best cache_hot_time is a step in the right
    direction, but I have to wonder if cache_hot() is really doing the right
    thing. It looks like all cache_hot() does is decide this task is cache hot
    because it ran recently. Who's to say the task got cache warm in the first
    place? Shouldn't we be looking at both how long ago it ran and the length of
    time it ran? Some of these workloads have very high transaction rates, and
    in turn have very high context switch rates. I would be surprised if many of
    the tasks got enough continuous run time to get good cache warmth
    anyway. I am all for testing cache warmth, but I think we should start
    looking at more than just how long ago the task ran.
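One possible reading of "how long ago it ran and the length of time it ran", sketched as a stand-alone check. This is only an illustration of the idea, not an existing kernel interface: the struct, `toy_cache_warm()` and the `min_warmup_ns` threshold are hypothetical, and a real heuristic might weigh the two values rather than apply hard cutoffs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task bookkeeping for the last time slice. */
struct toy_run_info {
    unsigned long long last_ran_ns;      /* when the last run ended */
    unsigned long long last_run_len_ns;  /* how long that run lasted */
};

/*
 * Treat a task as cache warm only if it ran long enough to warm the cache
 * in the first place AND it ran recently enough for that state to survive.
 */
bool toy_cache_warm(const struct toy_run_info *r, unsigned long long now_ns,
                    unsigned long long cache_hot_time_ns,
                    unsigned long long min_warmup_ns)
{
    assert(now_ns >= r->last_ran_ns);

    if (r->last_run_len_ns < min_warmup_ns)
        return false;  /* ran too briefly to have built up a footprint */

    return now_ns - r->last_ran_ns < cache_hot_time_ns;
}
```

Under this rule a task that keeps getting switched out after very short runs is never considered warm, so it stays eligible for migration even if it ran a moment ago.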

    -Andrew Theurer

    
    
    
    
