Re: [patch] voluntary-preempt-2.6.9-rc1-bk4-Q7

From: Ingo Molnar (mingo_at_elte.hu)
Date: 09/02/04

  • Next message: Alan Cox: "Proposal: Changing enable_irq/disable_irq to be two argument."
    Date:	Thu, 2 Sep 2004 15:37:43 +0200
    To: Mark_H_Johnson@raytheon.com
    
    

    * Mark_H_Johnson@raytheon.com <Mark_H_Johnson@raytheon.com> wrote:

    > >In any case, please enable nmi_watchdog=1 so that we can see (in -Q7)
    > >what happens on the other CPUs during such long delays.
    >
    > Booted with nmi_watchdog=1, saw the kernel message indicating that
    > NMI was checked OK.
    >
    > The first trace looks something like this...
    >
    > latency 518 us, entries: 79
    > ...
    > started at schedule+0x51/0x740
    > ended at schedule+0x337/0x740
    >
    > 00000001 0.000ms (+0.000ms): schedule (io_schedule)
    > 00000001 0.000ms (+0.000ms): sched_clock (schedule)
    > 00010001 0.478ms (+0.478ms): do_nmi (sched_clock)
    > 00010001 0.478ms (+0.000ms): do_nmi (<08049b21>)
    > 00010001 0.482ms (+0.003ms): profile_tick (nmi_watchdog_tick)
    > ...
    > and a few entries later ends up at do_IRQ (sched_clock).
    >
    > The second trace goes from dequeue_task to __switch_to with a
    > similar pattern - the line with do_nmi has +0.282ms duration and
    > the line notifier_call_chain (profile_hook) as +0.135ms duration.
    >
    > I don't see how this provides any additional information but will
    > provide several additional traces when the test gets done in a few
    > minutes.

    thanks. The NMI gives us two kinds of information, both useful:

     - if the ratio of do_nmi()'s within such a section roughly matches the
       number of NMIs we'd expect during the sum of these sections then it
       means that the delay is most likely wall-clock time and not some
       measurement artifact (RDTSC artifact or tracing bug). The NMI's are
       triggered (indirectly) by the PIT and the PIT is an independent clock
       that has a frequency that is independent of the rest of the system
       (independent of the CPU clock, DMA activities, IRQ load, etc.)

       since most of the codepaths in question (the scheduler's
       dequeue_task(), etc.) run with interrupts disabled the normal timer
       interrupts (smp_apic_timer_interrupt() and do_IRQ(00000000)) cannot
       'sample' this codepath. Only the NMI can interrupt these codepaths.

     - the NMIs also sample what happens on the other CPU. In your above
       trace this gives:

    > 00010001 0.478ms (+0.478ms): do_nmi (sched_clock)
    > 00010001 0.478ms (+0.000ms): do_nmi (<08049b21>)

       the other CPU was executing userspace code during the last NMI tick -
       i.e. nothing that could be suspect. 'suspect' code would be some sort
       kernel code that could in theory interact with this CPU's scheduler
       code.

       this too is statistical sampling so we'll need as much of these
       traces as possible.

    some wacky guess based on the above single sampling point: it seems the
    delays are real wall-clock delays, and the only thing matching the
    theory so far is that DMA traffic on the memory bus somehow stalls this
    CPU's memory traffic for up to 500 usecs. How could userspace running on
    CPU#0 impact the kernel's scheduler code on CPU#1?

            Ingo
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Alan Cox: "Proposal: Changing enable_irq/disable_irq to be two argument."

    Relevant Pages

    • Re: [patch] Real-Time Preemption, -RT-2.6.10-rc1-mm3-V0.7.25-1
      ... delays in the CPU loop of latencytest. ... is zero during several periods of the trace so I'm surprised that we did ...
      (Linux-Kernel)
    • Re: Trace length for SDRAM lines?
      ... I have never heard anyone say that it was important to balance trace ... If the clock line is significantly longer than the other signal ... all of the signals flow from the ... equal delays to both devices then you only need to consider the delay ...
      (comp.arch.embedded)
    • Re: 2.6.19-rc2-mm1
      ... The x86_64 nmi watchdog handling looks rather complex. ... I don't think trace all cpus ... stacks of other CPUs - but CPU2 itself took an NMI and ran into ...
      (Linux-Kernel)
    • Re: [BUG] Strange 1-second pauses during Resume-from-RAM
      ... well what does the trace say, where do the delays come from? ... quick overview you can make tracing lighter weight by doing: ...
      (Linux-Kernel)
    • Re: Timing problem in ModelSim, Post-Route Simulation.
      ... But in general I'd expect to see ROUTE delays of the TRACE ... > report that match the INTERCONNECT delays of the SDF. ... the route delays of the Trace report matches the interconnect delays ...
      (comp.arch.fpga)