Re: [PATCH] Fix crash with irqpoll due to the IRQF_IRQPOLL flag





On Tue, 22 May 2007, Bernhard Walle wrote:

o System crashes if booted with irqpoll command line option.

o Problem happens because Inside note_interrupt() we are accessing
desc->action->flag without taking the desc->lock. While accessing it
somebody goes ahead and unregisters the irq handler hence desc->action
is NULL. By the time note_interrupt() checks it, it crashes.

I absolutely _detest_ patches that make already complex and unreadable
code even more so. Especially conditionals. Please don't do that.

If you need a variable for a conditional, make it be an implicit one from
an inline function, and aim for making it readable.

So how about instead writing it out as a nice self-explanatory inline
function? I can almost guarantee that this generates no worse code, and it
also makes it easy to explain things like "we don't bother with the lock,
because we don't care enough".

Untested, but I think the point of the patch is obvious. Anybody want to
test it, send it back to me, and fix the bug while making the code more
readable?

I think we should instate a hard new rule:

- if a bugfix doesn't make the code more readable, it's not a bugfix at
all.

There was a reason for the bug in the first place, and that reason is
likely that the code was hard to think about. So making it even _harder_
to think about isn't really fixing the deeper problem!

Linus

---
kernel/irq/spurious.c | 46 +++++++++++++++++++++++++++++++++++++---------
1 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/kernel/irq/spurious.c b/kernel/irq/spurious.c
index b0d81aa..bd9e272 100644
--- a/kernel/irq/spurious.c
+++ b/kernel/irq/spurious.c
@@ -135,6 +135,39 @@ report_bad_irq(unsigned int irq, struct irq_desc *desc, irqreturn_t action_ret)
}
}

+static inline int try_misrouted_irq(unsigned int irq, struct irq_desc *desc, irqreturn_t action_ret)
+{
+ struct irqaction *action;
+
+ if (!irqfixup)
+ return 0;
+
+ /* We didn't actually handle the IRQ - see if it was misrouted? */
+ if (action_ret == IRQ_NONE)
+ return 1;
+
+ /*
+ * But for 'irqfixup == 2' we also do it for handled interrupts if
+ * they are marked as IRQF_IRQPOLL (or for irq zero, which is the
+ * traditional PC timer interrupt.. Legacy)
+ */
+ if (irqfixup < 2)
+ return 0;
+
+ if (!irq)
+ return 1;
+
+ /*
+ * Since we don't get the descriptor lock, "action" can
+ * change under us. We don't really care, but we don't
+ * want to follow a NULL pointer. So tell the compiler to
+ * just load it once by using a barrier.
+ */
+ action = desc->action;
+ barrier();
+ return action && (action->flags & IRQF_IRQPOLL);
+}
+
void note_interrupt(unsigned int irq, struct irq_desc *desc,
irqreturn_t action_ret)
{
@@ -144,15 +177,10 @@ void note_interrupt(unsigned int irq, struct irq_desc *desc,
report_bad_irq(irq, desc, action_ret);
}

- if (unlikely(irqfixup)) {
- /* Don't punish working computers */
- if ((irqfixup == 2 && ((irq == 0) ||
- (desc->action->flags & IRQF_IRQPOLL))) ||
- action_ret == IRQ_NONE) {
- int ok = misrouted_irq(irq);
- if (action_ret == IRQ_NONE)
- desc->irqs_unhandled -= ok;
- }
+ if (unlikely(try_misrouted_irq(irq, desc, action_ret))) {
+ int ok = misrouted_irq(irq);
+ if (action_ret == IRQ_NONE)
+ desc->irqs_unhandled -= ok;
}

desc->irq_count++;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • Re: Anonymous and local class names changed in 1.5
    ... >>see a reason for this other than to annoy me (and break the tools I use to ... I don't see any bug. ... bugfix, IIRC, because the previous names could be ambiguous, but the 1.4.2. ...
    (comp.lang.java.programmer)
  • Re: Damn you, FEDEX! or Nikon D40 lost in Springfield, MO blackhole.
    ... the 2 mp Mavica he had been using with a Nikon D40. ... After shopping around, he got me to order one for him. ... The shipper had it insured, but from what I have read it could take weeks to sort this crap out. ... You may get your insurance from FedEx and a couple weeks later they find it and deliver it. ...
    (alt.photography)
  • Re: Shutdown or Restart gives Blue Screen "SPECIAL_POOL_DETECTED_MEMORY_CORRUPTION"
    ... Anytime I try to do Shutdown or Restart, ... the system crashes with a Blue Screen of Death message: ... Could have caught a bug,.. ...
    (microsoft.public.windows.vista.general)
  • System crashes at random
    ... - the system crashes without any reason ... graphics card ... Power Management ... Does anyone has a clue what the reason could be or what I still should ...
    (microsoft.public.win2000.hardware)
  • Re: python 3 constant
    ... is require Python to change to accommodate your need. ... Can you see the slight difference? ... Best regards, ...
    (comp.lang.python)