Re: Question regarding ERR in /proc/interrupts.

From: linux-os (linux-os_at_chaos.analogic.com)
Date: 01/12/05

  • Next message: Andrea Arcangeli: "Re: thoughts on kernel security issues"
    Date:	Wed, 12 Jan 2005 15:16:48 -0500 (EST)
    To: Justin Piszcz <jpiszcz@lucidpixels.com>
    
    

    On Wed, 12 Jan 2005, Justin Piszcz wrote:

    > Is there any way to log each ERR to a file, or a way to find out what
    > caused each ERR?
    >
    > For example, I know this is the cause of a few of them:
    > spurious 8259A interrupt: IRQ7.
    >
    > But not all 20. Is there any available option to do this?
    >
    > $ cat /proc/interrupts
    >            CPU0
    >   0:  887759057   XT-PIC  timer
    >   1:       3138   XT-PIC  i8042
    >   2:          0   XT-PIC  cascade
    >   5:       5811   XT-PIC  Crystal audio controller
    >   9:  265081861   XT-PIC  ide4, eth1, eth2
    >  10:    9087912   XT-PIC  ide6, ide7
    >  11:     837707   XT-PIC  ide2, ide3
    >  12:      13854   XT-PIC  i8042
    >  14:   63373075   XT-PIC  eth0
    > NMI:          0
    > ERR:         20
    >

    I'm not sure you really want to do that! The ERR value is the
    running total of spurious interrupts. You will never learn where
    one comes from because it comes from nowhere, which is
    why it is called "spurious". Spurious interrupts are
    really caused by the CPU, not a particular interrupt
    controller. When the INT line is raised, the hardware
    is supposed to put an address on the bus so the CPU can
    branch to the handler (via some indirection). The
    INT pin to the CPU is supposed to be manipulated
    by a controller, either the PIC or IO-APIC.

    Suppose the controller didn't raise an interrupt, but
    the CPU thought it did. In that case, when the CPU signals
    the controller to output the vector, the controller says:
    "Dohhh... WTF. It's not me...". But the CPU needs some
    address to complete the cycle, so the controller puts its
    last, lowest-priority vector on the bus to complete the
    cycle. The CPU branches to the code and the code checks
    for a possible printer interrupt (IRQ7). If the printer
    didn't signal, it used to write a nasty-gram to the log
    before acknowledging the interrupt. Recent kernels
    write that message only once. However, the number of such
    instances is totaled for your review. If you have a lot of them,
    it generally means you have:

    (1) Too much crosstalk on the motherboard.
    (2) Power supplies out of specification.
    (3) Too hot so timing gets skewed.
    (4) Etc.

    It's NEVER the interrupt controller! NEVER. The spurious
    interrupt proves that the controller did its job by completing
    the hardware handshake with the CPU. Don't kill the messenger.
    It's just doing its job!
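
    The handshake described above can be sketched as a toy model. This
    is an illustration only, not kernel code: the names ToyPic and
    ToyKernel are hypothetical, the base vector 0x20 is an assumed
    value, and the model relies on the 8259A behaviour that a spurious
    IRQ7 vector is delivered without the in-service bit for IRQ7 being
    set, which is how a handler can tell it apart from a real printer
    interrupt.

```python
class ToyPic:
    """Toy model of one 8259A-style controller (IRQ0 = highest priority)."""

    BASE_VECTOR = 0x20  # assumed vector base, for illustration only

    def __init__(self):
        self.irr = 0  # interrupt request register: pending lines
        self.isr = 0  # in-service register: lines being handled

    def raise_irq(self, irq):
        self.irr |= 1 << irq

    def acknowledge(self):
        """The CPU asks for a vector; the controller must always answer."""
        for irq in range(8):
            if self.irr & (1 << irq):
                self.irr &= ~(1 << irq)
                self.isr |= 1 << irq
                return self.BASE_VECTOR + irq
        # Nothing was pending: answer with the lowest-priority vector
        # anyway, but leave the in-service register untouched.
        return self.BASE_VECTOR + 7

    def end_of_interrupt(self, irq):
        self.isr &= ~(1 << irq)


class ToyKernel:
    """Hypothetical handler side: counts spurious IRQ7s like the ERR line."""

    def __init__(self, pic):
        self.pic = pic
        self.err_count = 0
        self.handled = []

    def interrupt(self):
        irq = self.pic.acknowledge() - self.pic.BASE_VECTOR
        if irq == 7 and not (self.pic.isr & (1 << 7)):
            # IRQ7 vector without its in-service bit: spurious.
            # The toy model just counts it and returns.
            self.err_count += 1
            return
        self.handled.append(irq)
        self.pic.end_of_interrupt(irq)


if __name__ == "__main__":
    pic = ToyPic()
    kernel = ToyKernel(pic)
    pic.raise_irq(7)    # a real printer interrupt
    kernel.interrupt()
    kernel.interrupt()  # the CPU "thought" INT was raised; nothing pending
    print("handled:", kernel.handled, "ERR:", kernel.err_count)
```

    In this sketch the controller has no record of who caused the
    second interrupt, because from its side nothing ever asked for one.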

    FYI 20 spurious interrupts out of the bazillion shown isn't
    too bad. It shows that your hardware isn't perfect.
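
    To put the ERR figure in proportion, a small script can total the
    per-IRQ counts and compare them with ERR. What follows is a minimal
    sketch, assuming the single-CPU layout quoted above; the function
    name parse_interrupts and the SAMPLE string are made up for the
    illustration, and the numbers are the ones from the quoted output.

```python
SAMPLE = """\
CPU0
0: 887759057 XT-PIC timer
1: 3138 XT-PIC i8042
2: 0 XT-PIC cascade
5: 5811 XT-PIC Crystal audio controller
9: 265081861 XT-PIC ide4, eth1, eth2
10: 9087912 XT-PIC ide6, ide7
11: 837707 XT-PIC ide2, ide3
12: 13854 XT-PIC i8042
14: 63373075 XT-PIC eth0
NMI: 0
ERR: 20
"""


def parse_interrupts(text):
    """Return (per_irq_totals, err_count) from /proc/interrupts-style text."""
    totals = {}
    err = None
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header line such as "CPU0"
        label = label.strip()
        fields = rest.split()
        if label == "ERR" and fields:
            err = int(fields[0])
        elif label.isdigit():
            # Sum the leading numeric fields (one per CPU column).
            count = 0
            for field in fields:
                if not field.isdigit():
                    break
                count += int(field)
            totals[int(label)] = count
    return totals, err


if __name__ == "__main__":
    totals, err = parse_interrupts(SAMPLE)
    total = sum(totals.values())
    print(f"{err} spurious out of {total} interrupts ({err / total:.2e})")
```

    On a live system the text would come from reading /proc/interrupts
    instead of SAMPLE. Running this periodically and appending a
    timestamped line whenever ERR grows gives a log of when the count
    changed, though not of what caused each increment.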

    Cheers,
    *** Johnson
    Penguin : Linux version 2.6.10 on an i686 machine (5537.79 BogoMips).
      Notice : All mail here is now cached for review by Dictator Bush.
                      98.36% of all statistics are fiction.

