Re: [PATCH] sky2: Use deferrable timer for watchdog



On Dec 20, 2007 3:04 PM, Arjan van de Ven <arjan@xxxxxxxxxxxxxxx> wrote:
> > I think it is reasonable for network driver watchdogs to use a
> > deferrable timer - if the machine is 100% IDLE there is no one needing
> > the network to be up. If there is something running even on the other
> > CPU - that is going to cause an IPI, reschedule, TLB invalidation etc.
> > which will make it very likely in practice that each CPU will be
> > interrupted in a reasonable amount of time.
>
> This is not correct; many machines are idle waiting for network data.
> Think of webservers...

Yes, I forgot the receive case. So if a server is 100% IDLE with a web
server listening for network data, and we reach 0 wakeups per second
on the CPU where the deferred network watchdog timer is scheduled to
run, _and_ the network link goes down, the watchdog would not run and
re-establish the link until someone else wakes up that CPU later.
So as long as we make sure we don't convert every timer to deferrable
we should be OK. Maybe this can be resolved easily by having a
non-deferrable "dont-allow-deferring-for-too-long" timer on each CPU
that just causes at least one wakeup within some reasonable time delta
of the previous wakeup (whoever caused that one). It is still
beneficial in that all deferrable timers would run at once without
needing a separate wakeup for each.
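
A minimal userspace sketch of that per-CPU cap, to make the intended
semantics concrete. It is only a model: struct model_timer,
next_idle_wakeup(), NO_WAKEUP and the max_idle parameter are
hypothetical names, not kernel APIs, and tick wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a pending timer; not the kernel's struct timer_list. */
struct model_timer {
	unsigned long expires;	/* absolute expiry, in ticks */
	bool deferrable;	/* true: may be delayed while the CPU is idle */
};

/* Returned when nothing forces the idle CPU to wake up. */
#define NO_WAKEUP (~0UL)

/*
 * Compute the tick at which an idle CPU has to wake up.
 *
 * Non-deferrable timers force a wakeup at their expiry. Deferrable timers
 * never force one by themselves, but if any are pending the sleep is capped
 * at last_wakeup + max_idle so they cannot be starved forever.
 * Assumes last_wakeup + max_idle does not overflow.
 */
unsigned long next_idle_wakeup(const struct model_timer *timers, size_t n,
			       unsigned long last_wakeup,
			       unsigned long max_idle)
{
	unsigned long wake = NO_WAKEUP;
	bool have_deferrable = false;

	assert(timers != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (timers[i].deferrable)
			have_deferrable = true;
		else if (timers[i].expires < wake)
			wake = timers[i].expires;
	}

	/* The "dont-allow-deferring-for-too-long" cap. */
	if (have_deferrable && last_wakeup + max_idle < wake)
		wake = last_wakeup + max_idle;

	return wake;
}
```

With only a deferrable watchdog pending, the CPU still wakes once per
max_idle ticks; when a non-deferrable timer is due sooner, the cap adds
no extra wakeup and the deferrable timers simply ride along with it.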



> > Of course there are theoretical cases where we could land in a
> > situation where a CPU in a multiprocessor machine is IDLE indefinitely
> > and that causes the watchdog that happens to be bound to run on the
> > same CPU to not run. To take care of these unlikely cases I think the
> > timer mechanism should have a reasonable limit on how long a CPU can
> > go IDLE if there are deferrable timers.
>
> How about something else instead: a timer mechanism that takes a range.
> That at least has defined semantics; the deferrable semantics really
> are "indefinite". Let's at least keep the semantics clear and clean.


Would not the simpler solution work: installing a non-deferrable timer
per CPU which will not allow the CPU to go IDLE for more than x units
of time at once (or something to that effect)? A range would
complicate things, and I am not sure how many callers will know a
reasonably correct range for their normal operation. In this instance
of the e1000 watchdog, what range could it give and still succeed at
what it wants to do - bring up the link in a reasonable amount of time,
while also realizing the power savings?
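
For comparison, a rough userspace sketch of what range-based semantics
could look like: every timer states an [earliest, latest] window and
the idle code picks as few wakeups as possible that still hit every
window. The names struct range_timer and plan_wakeups() are
hypothetical, the greedy choice is just one possible policy, and
nothing here is an existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical range timer: may fire at any tick in [earliest, latest]. */
struct range_timer {
	unsigned long earliest;
	unsigned long latest;
};

/* qsort comparator: ascending by deadline (latest). */
static int by_latest(const void *a, const void *b)
{
	const struct range_timer *x = a, *y = b;

	return (x->latest > y->latest) - (x->latest < y->latest);
}

/*
 * Greedily choose wakeup ticks so that every timer's window contains one.
 * Sorts timers in place by deadline, stores the chosen ticks in wakeups
 * (which must have room for n entries) and returns how many were chosen.
 */
size_t plan_wakeups(struct range_timer *timers, size_t n,
		    unsigned long *wakeups)
{
	size_t count = 0;

	if (n == 0)
		return 0;
	assert(timers != NULL && wakeups != NULL);

	qsort(timers, n, sizeof(*timers), by_latest);

	for (size_t i = 0; i < n; i++) {
		assert(timers[i].earliest <= timers[i].latest);

		/* The most recent wakeup already falls inside this window. */
		if (count > 0 && timers[i].earliest <= wakeups[count - 1])
			continue;

		/* Otherwise sleep as long as this timer allows. */
		wakeups[count++] = timers[i].latest;
	}

	return count;
}
```

A timer with earliest == latest behaves like a normal timer, and a wide
window behaves like a deferrable timer with a bounded delay. The open
question above remains which window a given watchdog should ask for.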

Perhaps, depending on whether the machine is a server, laptop or
desktop (maybe based on preemption), we could have normal or
deferrable timers, but that would exclude servers from power savings
and I am not sure data center folks will like that :)

Parag


