Re: next-20081119: general protection fault: get_next_timer_interrupt()



Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
Well, not sure. Most likely candidate is the new block timer code.
What seems to be happening is that the queue is being released with
either an outstanding request (refcounting problem) or ticking timer
with no work (block timer problem). The way scanning works is that we
create a request queue for each device we probe and then delete it again
if nothing appears after the bus settle time. The argument against
this is that it should show up on every scanned bus. However, these are
getting rarer; I was just about to write that I hadn't seen it when I
remembered that all my SCSI testing systems are currently running
hotplug reporting busses (i.e. don't do scanning). However,
fortunately, I've also booted voyager recently which does use parallel
SCSI and doesn't see this either, so it could also be megaraid_sas
specific.

Yeah, it could be block as well. Jens, Mike?

I added a comment to bug 12020 on Thursday about a few other systems that
were seeing the signature shown in bug 12020. It appeared from debugging
that a few paths were adding timers for requests that were not expected
to have one.

http://bugzilla.kernel.org/show_bug.cgi?id=12020

It would be good to know if the debug patch below affects your problem as well.

If it does, we need to investigate a solution that avoids adding a
timer for these requests.

-andmike
--
Michael Anderson
andmike@xxxxxxxxxxxxxxxxxx



blk: blk_add_timer debug patch

[DEBUG] Debug only patch.

Debug patch to blk_add_timer to not start the timer for requests that do
not have the REQ_STARTED flag set.

Signed-off-by: Mike Anderson <andmike@xxxxxxxxxxxxxxxxxx>
---
block/blk-timeout.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 69185ea..4389391 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -177,6 +177,9 @@ void blk_add_timer(struct request *req)
 	BUG_ON(!list_empty(&req->timeout_list));
 	BUG_ON(test_bit(REQ_ATOM_COMPLETE, &req->atomic_flags));
 
+	if (!(req->cmd_flags & REQ_STARTED))
+		return;
+
 	if (req->timeout)
 		req->deadline = jiffies + req->timeout;
 	else {
--
1.5.6.5
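
For anyone who wants to reason about the guard outside the kernel, here is a
rough userspace sketch of the behaviour the debug patch is meant to have. It
is only a model: struct model_request, MODEL_REQ_STARTED,
MODEL_DEFAULT_TIMEOUT and model_add_timer() are made-up names, the
on_timeout_list flag stands in for the real timeout list, and the fallback
timeout is a placeholder because the else branch is not shown in the hunk
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel definitions; not the real ones. */
#define MODEL_REQ_STARTED     (1u << 0)
#define MODEL_DEFAULT_TIMEOUT 30UL   /* assumed fallback, in ticks */

struct model_request {
	unsigned int  cmd_flags;       /* models req->cmd_flags */
	unsigned long timeout;         /* per-request timeout, 0 = use fallback */
	unsigned long deadline;        /* absolute tick at which the request expires */
	bool          on_timeout_list; /* models !list_empty(&req->timeout_list) */
};

/*
 * Model of blk_add_timer() with the debug guard applied.
 * Returns true if a timer was armed, false if the guard skipped it.
 */
static bool model_add_timer(struct model_request *req, unsigned long now)
{
	/* Plays the role of the BUG_ON(): a request must not be queued twice. */
	assert(!req->on_timeout_list);

	/* The debug guard: never arm a timer for a request that was not started. */
	if (!(req->cmd_flags & MODEL_REQ_STARTED))
		return false;

	if (req->timeout)
		req->deadline = now + req->timeout;
	else
		req->deadline = now + MODEL_DEFAULT_TIMEOUT;

	req->on_timeout_list = true;
	return true;
}
```

In this model a zero-initialized request is skipped and never lands on the
timeout list, so nothing is left ticking if its queue goes away. Once
MODEL_REQ_STARTED is set, the same call arms the timer and sets the deadline.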
