Re: [RFC][PATCH] Fix hang in posix_locks_deadlock()
- From: "J. Bruce Fields" <bfields@xxxxxxxxxxxx>
- Date: Fri, 26 Oct 2007 13:07:50 -0400
On Thu, Oct 18, 2007 at 02:57:59PM -0400, George G. Davis wrote:
On Wed, Oct 17, 2007 at 02:51:57PM -0400, George G. Davis wrote:
---
Not sure if this is the correct fix but it does resolve the hangs we're
observing in posix_locks_deadlock().
Please disregard the previous patch, it's not quite right - causes occasional
segfaults and clearly did not retain the posix_same_owner() checks implemented
in the original code. Here's a new version which I believe retains the
intent of the original code:
diff --git a/fs/locks.c b/fs/locks.c
index 7f9a3ea..e012b27 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -702,14 +702,12 @@ static int posix_locks_deadlock(struct file_lock *caller_fl,
{
struct file_lock *fl;
-next_task:
if (posix_same_owner(caller_fl, block_fl))
return 1;
list_for_each_entry(fl, &blocked_list, fl_link) {
if (posix_same_owner(fl, block_fl)) {
- fl = fl->fl_next;
- block_fl = fl;
- goto next_task;
+ if (posix_same_owner(caller_fl, fl))
+ return 1;
}
}
return 0;
It may take multiple steps to identify a deadlock. With the above
you'll miss deadlocks like
process 1 is requesting a lock held by process 2
process 2 is blocking on a lock held by process 3
process 3 is blocking on a lock held by process 1.
Could you give more details about how you're causing
posix_locks_deadlock to hang? Is there a simple test-case you can post?
--b.
-
I'm not sure about those "fl = fl->fl_next; block_fl = fl;" statements,
first, the order of those statements seems reversed to me. Otherwise,
I think the intent was to advance the "fl" for loop variable to the next
entry in the list but it doesn't work out that way at all - the for
loop restarts from the beginning - this is where we get into an
infinite loop condition. Whether the test case I posted before is
valid or not, I reckon it shouldn't be possible for non-root Joe user
to contrive a test case which can hang the system as we're observing
with that test case. The above patch fixes the hang.
Comments greatly appreciated...
--
Regards,
George
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
- Follow-Ups:
- Re: [RFC][PATCH] Fix hang in posix_locks_deadlock()
- From: J. Bruce Fields
- Re: [RFC][PATCH] Fix hang in posix_locks_deadlock()
- References:
- [RFC][PATCH] Fix hang in posix_locks_deadlock()
- From: George G. Davis
- Re: [RFC][PATCH] Fix hang in posix_locks_deadlock()
- From: George G. Davis
- [RFC][PATCH] Fix hang in posix_locks_deadlock()
- Prev by Date: Re: Is gcc thread-unsafe?
- Next by Date: Re: Regression: commit "ide: constify struct ide_port_info" causes breakage
- Previous by thread: Re: [RFC][PATCH] Fix hang in posix_locks_deadlock()
- Next by thread: Re: [RFC][PATCH] Fix hang in posix_locks_deadlock()
- Index(es):
Relevant Pages
|