Re: nfs: infinite loop in fcntl(F_SETLKW)
- From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
- Date: Thu, 10 Apr 2008 17:02:19 -0400
On Thu, 2008-04-10 at 21:51 +0200, Miklos Szeredi wrote:
Another infinite loop, this one involving both client and server.
Basically what happens is that on the server nlm_fopen() calls
nfsd_open() which returns -EACCES, to which nlm_fopen() returns
NLM_LCK_DENIED.
On the client this will turn into a -EAGAIN (nlm_stat_to_errno()),
which in turn will cause fcntl_setlk() to retry forever.
I _think_ the solution is to turn NLM_LCK_DENIED into ENOLCK for
blocking locks, as NLM_LCK_BLOCKED is for the contended case. For
testing the lock leave NLM_LCK_DENIED as EAGAIN. That still could be
misleading, but at least there's no infinite loop in that case.
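As a rough illustration of the loop described above, here is a minimal user-space sketch. It is not the kernel code: fake_lock_request(), patched_lock_request() and blocking_lock_attempts() are hypothetical names, and the retry cap (max_tries) exists only so the model terminates. The real fcntl_setlk() sleeps between attempts and has no such cap.

```c
#include <errno.h>

/* Hypothetical stand-in for the client lock call. It always reports
 * -EAGAIN, modelling NLM_LCK_DENIED being turned into -EAGAIN for a
 * failure that will never clear (the server-side -EACCES case). */
static int fake_lock_request(void)
{
	return -EAGAIN;
}

/* Same request with the proposed remapping applied: -EAGAIN from a
 * blocking lock request becomes -ENOLCK. */
static int patched_lock_request(void)
{
	int status = fake_lock_request();

	if (status == -EAGAIN)
		status = -ENOLCK;
	return status;
}

/* Simplified model of a blocking-lock retry loop: keep retrying while
 * the lock call reports -EAGAIN. Returns the number of attempts made
 * and stores the last status in *error. */
static int blocking_lock_attempts(int (*lock_fn)(void), int max_tries,
				  int *error)
{
	int tries = 0;

	while (tries < max_tries) {
		tries++;
		*error = lock_fn();
		if (*error != -EAGAIN)
			break;
	}
	return tries;
}
```

With fake_lock_request() the model uses up every allowed attempt, which corresponds to the endless retry. With patched_lock_request() it stops after the first attempt and reports -ENOLCK.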
I've minimally tested this patch to verify that it cures the lockup,
and that simple blocking locks keep working.
Signed-off-by: Miklos Szeredi <mszeredi@xxxxxxx>
---
 fs/lockd/clntproc.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux/fs/lockd/clntproc.c
===================================================================
--- linux.orig/fs/lockd/clntproc.c	2008-04-02 13:34:57.000000000 +0200
+++ linux/fs/lockd/clntproc.c	2008-04-10 21:23:46.000000000 +0200
@@ -536,6 +536,9 @@ again:
 		up_read(&host->h_rwsem);
 	}
 	status = nlm_stat_to_errno(resp->status);
+	/* Don't return EAGAIN, as that would make fcntl_setlk() loop */
+	if (status == -EAGAIN)
+		status = -ENOLCK;
 out_unblock:
 	nlmclnt_finish_block(block);
 	/* Cancel the blocked request if it is still pending */

Wait. There is something really weird going on here.
According to the spec, LCK_DENIED means 'the request failed' (i.e.
ENOLCK is definitely correct).
OTOH, LCK_DENIED_NOLOCKS and LCK_DENIED_GRACE_PERIOD are both temporary
failures, the first because the server had a resource problem, and the
second because the server rebooted and is in the grace period (i.e.
EAGAIN would appear to be more appropriate). See
http://www.opengroup.org/onlinepubs/9629799/chap10.htm#tagcjh_11_02_02_02
AFAICS, the correct thing to do is to fix nlm_stat_to_errno() by
swapping the return values for NLM_LCK_DENIED and
NLM_LCK_DENIED_NOLOCKS/NLM_LCK_DENIED_GRACE_PERIOD.
The problem is that there appears to be a similar confusion on the Linux
server side in nlmsvc_lock(). :-(
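
The swapped mapping suggested above might look roughly like the following sketch. It is an illustration only, not the actual nlm_stat_to_errno(): the function and enum names are hypothetical, the status values are plain host-order integers numbered as in the NLM protocol's nlm_stats enum, and the handling of every other status code (including LCK_BLOCKED) is an arbitrary placeholder, since it is not discussed in this message.

```c
#include <errno.h>

/* Hypothetical host-order copies of the NLM status codes, numbered as
 * in the protocol's nlm_stats enum. */
enum nlm_status_sketch {
	SK_LCK_GRANTED = 0,
	SK_LCK_DENIED = 1,
	SK_LCK_DENIED_NOLOCKS = 2,
	SK_LCK_BLOCKED = 3,
	SK_LCK_DENIED_GRACE_PERIOD = 4,
};

/* Sketch of the proposed mapping: a plain denial is a hard failure,
 * while the resource-shortage and grace-period codes are temporary. */
static int nlm_stat_to_errno_sketch(int status)
{
	switch (status) {
	case SK_LCK_GRANTED:
		return 0;
	case SK_LCK_DENIED:
		/* 'the request failed': do not invite a retry */
		return -ENOLCK;
	case SK_LCK_DENIED_NOLOCKS:
	case SK_LCK_DENIED_GRACE_PERIOD:
		/* temporary failures: the caller may try again */
		return -EAGAIN;
	default:
		/* placeholder for codes not covered in this message */
		return -ENOLCK;
	}
}
```

Under this mapping, the -EACCES case reported above would surface as -ENOLCK and would no longer drive the retry loop, whereas a grace-period reply would still be retried.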
--
Trond Myklebust
Linux NFS client maintainer
NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com