Re: PATCH/RFC: [kdump] fix APIC shutdown sequence



On Wed, Aug 08, 2007 at 10:06:13AM -0400, Chip Coldwell wrote:
> On Wed, 8 Aug 2007, Vivek Goyal wrote:
>
> > On Tue, Aug 07, 2007 at 07:41:30PM +0200, Martin Wilck wrote:
> > >
> > > Can you explain how, on the front side bus, the IO-APIC knows whether
> > > a CPU has accepted the INT message? There is no response
> > > to the INT message on the bus, except for the EOI which comes much later.
> > > I'm not saying that you're wrong, I just really don't understand this
> > > point.
> > >
> >
> > I don't know exactly what the hardware protocol is. I am just going by
> > the Intel documentation.
>
> I think it's important to distinguish between the LAPIC receiving an
> interrupt and the CPU receiving an interrupt. The former could happen
> without the latter if the CPU has set the TPR above the priority of
> the interrupt received by the LAPIC. In that case, the interrupt is
> kept pending in the LAPIC and recorded in the IRR if I understand the
> Intel documentation correctly.
>
> So I think the scenario which leaves IRR set when the kdump kernel
> starts is possible.
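
To make the TPR scenario described above concrete, here is a toy model in C. It is only a sketch: the struct and function names (toy_lapic, toy_lapic_receive, toy_lapic_dispatch) are made up for illustration, and it ignores everything a real LAPIC does beyond comparing the priority class of the vector against the TPR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: names are hypothetical, real hardware has more rules. */
struct toy_lapic {
    uint8_t tpr;        /* task priority register, priority class in bits 7:4 */
    bool    irr[256];   /* accepted by the LAPIC, not yet taken by the CPU */
    bool    isr[256];   /* currently being serviced by the CPU */
};

/* The LAPIC accepts the message from the bus: this alone sets IRR. */
static void toy_lapic_receive(struct toy_lapic *l, uint8_t vector)
{
    l->irr[vector] = true;
}

/*
 * Try to hand the vector to the CPU. It is delivered only if its priority
 * class is above the class held in TPR; otherwise it stays pending in IRR.
 */
static bool toy_lapic_dispatch(struct toy_lapic *l, uint8_t vector)
{
    if (!l->irr[vector])
        return false;
    if ((vector >> 4) <= (l->tpr >> 4))
        return false;   /* blocked by TPR, IRR remains set */
    l->irr[vector] = false;
    l->isr[vector] = true;
    return true;
}
```

With tpr set to 0x40, a received vector 0x31 is not dispatched and its IRR bit stays set, which is the state a crash could leave behind for the kdump kernel.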

Hi Chip,

That's true. I agree that once the kdump kernel starts we can very well be
in a situation where the ISR/IRR bits of the LAPIC are set and the IRR bit of
the IOAPIC is set. These are pending interrupts which will be delivered in the
second kernel, and that kernel will reject them as spurious interrupts and
send an EOI. This will clear the LAPIC state.

But the issue here seems to be that the LAPIC state got cleared while the IRR
bit at the IOAPIC did not, because the IOAPIC vector information was deleted
in the first kernel and now, upon receiving the EOI, the IOAPIC does not know
which vector this EOI belongs to.
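
A rough way to picture this failure is the toy model below. It is a sketch of the explanation given in this thread, not a description of what any particular chipset does: all names (toy_rte, toy_ioapic_eoi, and so on) are hypothetical, and the assumption is that an EOI is matched against the vector stored in each redirection entry.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PINS 24

/* Hypothetical, simplified redirection table entry. */
struct toy_rte {
    uint8_t vector;
    bool    masked;
    bool    remote_irr;   /* set on delivery, cleared only by a matching EOI */
};

struct toy_ioapic {
    struct toy_rte rte[TOY_PINS];
};

/* A level-triggered interrupt is sent to a LAPIC: the IRR bit goes up. */
static void toy_ioapic_deliver(struct toy_ioapic *io, int pin)
{
    io->rte[pin].remote_irr = true;
}

/* Assumed first-kernel shutdown step: mask the pin and wipe the vector. */
static void toy_ioapic_wipe_entry(struct toy_ioapic *io, int pin)
{
    io->rte[pin].vector = 0;
    io->rte[pin].masked = true;
}

/* An EOI carries a vector; only entries with that vector are cleared. */
static int toy_ioapic_eoi(struct toy_ioapic *io, uint8_t vector)
{
    int cleared = 0;

    for (int pin = 0; pin < TOY_PINS; pin++) {
        struct toy_rte *e = &io->rte[pin];

        if (e->remote_irr && e->vector == vector) {
            e->remote_irr = false;
            cleared++;
        }
    }
    return cleared;
}
```

In this model, delivering on a pin, wiping its entry, and then sending the EOI for the old vector clears nothing, so remote_irr stays set even though the LAPIC side is clean.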

Given the fact that "irqpoll" makes things work, I think we should not
complicate the logic and should leave it as it is. Anyway, our goal is to
capture the dump, not to make sure that interrupts from all the devices are
arriving in the second kernel.

Otherwise we would have to resort to techniques like first masking all
the interrupts at the IOAPIC level, then issuing an EOI on all the CPUs in the
first kernel itself. That makes the logic a little twisted.
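
Just to illustrate the ordering that alternative would need, here is a small sketch using the same kind of toy model. Nothing here is real kernel code: the q_* names are invented, and the list of in-service vectors is simply passed in as an assumption instead of being read from each CPU's LAPIC.

```c
#include <stdbool.h>
#include <stdint.h>

#define Q_PINS 24

/* Hypothetical, simplified redirection table entry. */
struct q_rte {
    uint8_t vector;
    bool    masked;
    bool    remote_irr;
};

/* Step 1: mask every pin so nothing new is sent; vectors are kept. */
static void q_mask_all(struct q_rte *rte)
{
    for (int pin = 0; pin < Q_PINS; pin++)
        rte[pin].masked = true;
}

/* Step 2: an EOI for one vector clears the entries that still carry it. */
static void q_eoi(struct q_rte *rte, uint8_t vector)
{
    for (int pin = 0; pin < Q_PINS; pin++)
        if (rte[pin].remote_irr && rte[pin].vector == vector)
            rte[pin].remote_irr = false;
}

/*
 * Mask first, then EOI the outstanding vectors, and only then wipe the
 * entries. Returns how many pins are still stuck with remote_irr set.
 */
static int q_shutdown(struct q_rte *rte, const uint8_t *in_service, int n)
{
    int stuck = 0;

    q_mask_all(rte);
    for (int i = 0; i < n; i++)
        q_eoi(rte, in_service[i]);
    for (int pin = 0; pin < Q_PINS; pin++) {
        rte[pin].vector = 0;            /* step 3: safe to forget the vector */
        if (rte[pin].remote_irr)
            stuck++;
    }
    return stuck;
}
```

The point of the ordering is that the EOI is issued while the entry still knows its vector; wiping the entry first is what leaves the bit stuck.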

Thanks
Vivek