Re: PROBLEM: Oops, nfsd, networking

From: Neil Brown (neilb_at_cse.unsw.edu.au)
Date: 06/30/04

  • Next message: Norbert Preining: "Re: 2.6.7-mm2, mmaps rework, buggy apps, setarch"
    To: <bvaughan@mindspring.com>, <linux-kernel@vger.kernel.org>
    Date:	Wed, 30 Jun 2004 17:04:33 +1000
    
    

    On Tuesday June 29, neilb@cse.unsw.edu.au wrote:
    > On Monday June 28, bvaughan@mindspring.com wrote:
    > > NFSD crash at large MTU and odd wsize/rsize.
    > >
    >
    ....
    >
    > Would you be able to change that code to:
    > while (len > args->vec[v].iov_len) {
    > len -= args->vec[v].iov_len;
    > v++;
    > if (rqstp->rq_argpages[v] == NULL) {
    > printk("NULL page at %d, c=%d al=%d l=%d pl=%d\n",
    > v, args->count, args->len, len, rqstp->rq_arg.len);
    > BUG();
    > }
    > args->vec[v].iov_base = page_address(rqstp->rq_argpages[v]);
    > args->vec[v].iov_len = PAGE_SIZE;
    > }
    >
    > (i.e. add the if() { printk..} bit).
    > and tell me what gets printed out?

    To which Bill Vaughan replied:

      Jun 29 12:20:09 localhost kernel: NULL page at 11, c=0 al=-802973147 l=-803029103 pl=15084

    Ok, that makes it easy, thanks. 'len' should be unsigned, and when we
    do
            if (len > NFSSVC_MAXBLKSIZE)
                    len = NFSSVC_MAXBLKSIZE;
    a very large, and hence negative, 'len' doesn't get corrected :-(

    So the following patch should fix the BUG, but it doesn't explain why
    'len' is such a large number. The request must be getting corrupted
    somewhere between the NFS client and the NFS server, possibly
    somewhere in the network stack of one or the other.

    Is there any chance of capturing a network trace (tcpdump -s 1000)
    when this problem occurs so we can see if there are any bad packets
    coming from the client, or if the packet is getting corrupted in the
    server?

    Thanks,
    NeilBrown

    ### Diffstat output
     ./fs/nfsd/nfs3xdr.c | 16 +++++++++++-----
     ./fs/nfsd/nfsxdr.c | 4 ++--
     2 files changed, 13 insertions(+), 7 deletions(-)

    diff ./fs/nfsd/nfs3xdr.c~current~ ./fs/nfsd/nfs3xdr.c
    --- ./fs/nfsd/nfs3xdr.c~current~ 2004-06-30 11:01:59.000000000 +1000
    +++ ./fs/nfsd/nfs3xdr.c 2004-06-30 11:14:01.000000000 +1000
    @@ -74,7 +74,7 @@ decode_fh(u32 *p, struct svc_fh *fhp)
     static inline u32 *
     encode_fh(u32 *p, struct svc_fh *fhp)
     {
    - int size = fhp->fh_handle.fh_size;
    + unsigned int size = fhp->fh_handle.fh_size;
             *p++ = htonl(size);
             if (size) p[XDR_QUADLEN(size)-1]=0;
             memcpy(p, &fhp->fh_handle.fh_base, size);
    @@ -328,7 +328,7 @@ int
     nfs3svc_decode_readargs(struct svc_rqst *rqstp, u32 *p,
                                             struct nfsd3_readargs *args)
     {
    - int len;
    + unsigned int len;
             int v,pn;
     
             if (!(p = decode_fh(p, &args->fh))
    @@ -358,7 +358,7 @@ int
     nfs3svc_decode_writeargs(struct svc_rqst *rqstp, u32 *p,
                                             struct nfsd3_writeargs *args)
     {
    - int len, v;
    + unsigned int len, v;
     
             if (!(p = decode_fh(p, &args->fh))
              || !(p = xdr_decode_hyper(p, &args->offset)))
    @@ -368,6 +368,12 @@ nfs3svc_decode_writeargs(struct svc_rqst
             args->stable = ntohl(*p++);
             len = args->len = ntohl(*p++);
     
    + if (rqstp->rq_arg.len <
    + len + ( (char*)p -
    + (char*)rqstp->rq_arg.head[0].iov_base)) {
    + return 0;
    + }
    +
             args->vec[0].iov_base = (void*)p;
             args->vec[0].iov_len = rqstp->rq_arg.head[0].iov_len -
                     (((void*)p) - rqstp->rq_arg.head[0].iov_base);
    @@ -427,7 +433,7 @@ int
     nfs3svc_decode_symlinkargs(struct svc_rqst *rqstp, u32 *p,
                                             struct nfsd3_symlinkargs *args)
     {
    - int len;
    + unsigned int len;
             int avail;
             char *old, *new;
             struct iovec *vec;
    @@ -444,7 +450,7 @@ nfs3svc_decode_symlinkargs(struct svc_rq
              */
             svc_take_page(rqstp);
             len = ntohl(*p++);
    - if (len <= 0 || len > NFS3_MAXPATHLEN || len >= PAGE_SIZE)
    + if (len == 0 || len > NFS3_MAXPATHLEN || len >= PAGE_SIZE)
                     return 0;
             args->tname = new = page_address(rqstp->rq_respages[rqstp->rq_resused-1]);
             args->tlen = len;

    diff ./fs/nfsd/nfsxdr.c~current~ ./fs/nfsd/nfsxdr.c
    --- ./fs/nfsd/nfsxdr.c~current~ 2004-06-30 11:14:16.000000000 +1000
    +++ ./fs/nfsd/nfsxdr.c 2004-06-30 11:14:30.000000000 +1000
    @@ -234,7 +234,7 @@ int
     nfssvc_decode_readargs(struct svc_rqst *rqstp, u32 *p,
                                             struct nfsd_readargs *args)
     {
    - int len;
    + unsigned int len;
             int v,pn;
             if (!(p = decode_fh(p, &args->fh)))
                     return 0;
    @@ -266,7 +266,7 @@ int
     nfssvc_decode_writeargs(struct svc_rqst *rqstp, u32 *p,
                                             struct nfsd_writeargs *args)
     {
    - int len;
    + unsigned int len;
             int v;
             if (!(p = decode_fh(p, &args->fh)))
                     return 0;

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Norbert Preining: "Re: 2.6.7-mm2, mmaps rework, buggy apps, setarch"

    Relevant Pages

    • Multiple TCP connections serving the same NFS mount?
      ... Has anyone seen or tried an NFS client which uses multiple parallel TCP connections for a single ... to an NFS server? ... Do you Yahoo!? ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: bdflush/rpciod high CPU utilization, profile does not make sense
      ... > the behaviour identical same on both the PIII and the Opteron systems? ... The dual opteron is the nfs server ... The dual athlon is the 2.4 nfs client ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: sent an invalid ICMP type 11, code 0 error to a broadcast: 0.0.0.0 on lo?
      ... > packets, ... route add -net 127.0.0.0 netmask 255.0.0.0 dev lo ... There should be no LAN routes going through lo. ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • slab corruption in test9 (NFS related?)
      ... I'm not quite sure whom to send this to, so my appologizes if you ... Two linux boxes, both running Debian Testing, both x86 with 2.6.0-test9. ... The nfs client was running the nhfsstone stress test against an ext3 NFS ... To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ ...
      (Linux-Kernel)
    • Re: [PATCH] Deadlock during heavy write activity to userspace NFS server on local NFS mount
      ... > userspace tasks, so you could probably get a patch accepted. ... that kswapd depends on) depends on kswapd itself. ... > But why is your NFS server needed to reclaim memory? ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)