PROBLEM: Oops, nfsd, networking

From: Bill Vaughan (bvaughan_at_mindspring.com)
Date: 06/28/04

    To: <linux-kernel@vger.kernel.org>
    Date:	Mon, 28 Jun 2004 08:18:05 -0400
    
    
    

    NFSD crash at large MTU and odd wsize/rsize.
     
    Hi,
     
    I have found a reproducible NFSD crash. My setup is unusual, so you may not be
    able to reproduce it. I am writing an NFSv3 client for an embedded project (our
    own OS). We write about 250 Mbps of media traffic over NFS to a Linux NFS
    server while reading about 70 Mbps sustained. Our product is essentially a
    high-end media server that takes in up to 2 Gbps of MPEG-2 or MJPEG and
    load-balances the traffic across several directly attached NFS servers.
     
    At this traffic level, the driver/NIC on the Linux box was dropping a
    significant amount of traffic (RX overruns in ifconfig eth1, probably because
    interrupts were not being serviced in time). This was with an MTU of 9000 and
    rsize/wsize of 8192. To reduce the interrupt rate, I increased the MTU to 16000
    (we are directly attached, so this is not an issue) and changed rsize and wsize
    to 15500 (so each request fits in the MTU without fragmentation). This cut the
    interrupts almost in half and greatly reduced the number of overruns.
    Everything was looking great. However, after about 10 minutes of sustained
    traffic (250 Mbps write, 70 Mbps read), the NFS server crashes (see dmesg): all
    128 nfsd threads go away. One other note: I retransmit read requests that get
    no response within 25 ms. I do this because if a packet is dropped due to an
    overrun, I cannot wait for a long timeout to retransmit, or my read rate will
    drop too low.
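
The claim that the larger transfer size cut interrupts almost in half can be checked with rough arithmetic. The sketch below assumes one UDP datagram per NFS request, roughly one receive interrupt per frame (no interrupt coalescing), and a guessed 150 bytes of RPC/NFSv3 header overhead. The constant RPC_NFS_OVERHEAD and the function names are illustrative and do not come from the post.

```python
IP_HEADER = 20          # IPv4 header without options
UDP_HEADER = 8
RPC_NFS_OVERHEAD = 150  # ASSUMPTION: rough RPC + NFSv3 WRITE header size, not measured


def datagram_size(payload):
    """Size of the IP datagram carrying one NFS request with `payload` data bytes."""
    return payload + RPC_NFS_OVERHEAD + UDP_HEADER + IP_HEADER


def fits_in_mtu(payload, mtu):
    """True if the request travels in a single frame, i.e. without IP fragmentation."""
    return datagram_size(payload) <= mtu


def requests_per_second(rate_mbps, payload):
    """Requests needed per second to sustain `rate_mbps` (10^6 bits/s) of data."""
    return rate_mbps * 1_000_000 / 8 / payload


old = requests_per_second(250, 8192)    # wsize 8192 at MTU 9000
new = requests_per_second(250, 15500)   # wsize 15500 at MTU 16000

print(f"wsize  8192: {old:.0f} requests/s, single frame: {fits_in_mtu(8192, 9000)}")
print(f"wsize 15500: {new:.0f} requests/s, single frame: {fits_in_mtu(15500, 16000)}")
print(f"ratio new/old: {new / old:.2f}")
```

Under these assumptions, 250 Mbps of writes needs about 3,815 requests per second at wsize 8192 and about 2,016 at wsize 15500, a ratio of roughly 0.53.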
     
    If I can't get this resolved, I'll have to go back to an MTU of 9000 and
    rsize/wsize of 8192 and put more Linux servers behind us.
     
    General:
    Linux: Fedora Core 2 with no updates (kernel 2.6.5)
    XFS file system.
    750 GB software RAID0 over 3 drives.
    Hardware: Dual Xeon 1U rackmount with four 250 GB SATA drives.
    Changed rmem_max and rmem_default to 262140 (see sysctl).
    Running on eth1 (Intel Gig interface). MTU set at 16000.
    Running 128 nfsd threads with NFSv3; wsize=15500, rsize=15500.
     
     
    dmesg:
    Code: 8b 00 f6 c4 01 75 19 2b 1d 0c d9 3f 02 c1 fb 05 c1 e3 0c 8d
     <1>Unable to handle kernel NULL pointer dereference at virtual address 00000000
     
     printing eip:
    0213f951
    *pde = 00003001
    Oops: 0000 [#128]
    SMP
    CPU: 1
    EIP: 0060:[<0213f951>] Not tainted
    EFLAGS: 00010202 (2.6.5-1.358smp)
    EIP is at page_address+0x6/0x77
    eax: 00000000 ebx: 00000000 ecx: 00000000 edx: 764dfc00
    esi: 0000000b edi: 94171691 ebp: 764de000 esp: 76355f3c
    ds: 007b es: 007b ss: 0068
    Process nfsd (pid: 2375, threadinfo=76355000 task=813ad410)
    Stack: 003e06e8 0000000b 94171691 764de000 82b56b28 764dfc00 764dfc00 82b56a30
           82b6b138 82b6af1c 82b4c5a9 0e606014 764dfc64 764dfc00 82b6b138 0e606014
           82b0e5f1 764dfc00 0000003d 000000f4 00000190 000186a3 764dfc40 82b6af1c
    Call Trace:
     [<82b56b28>] nfs3svc_decode_writeargs+0xf8/0x149 [nfsd]
     [<82b56a30>] nfs3svc_decode_writeargs+0x0/0x149 [nfsd]
     [<82b4c5a9>] nfsd_dispatch+0x6f/0x162 [nfsd]
     [<82b0e5f1>] svc_process+0x323/0x55f [sunrpc]
     [<82b4c3d5>] nfsd+0x1c6/0x32b [nfsd]
     [<82b4c20f>] nfsd+0x0/0x32b [nfsd]
     [<021041f1>] kernel_thread_helper+0x5/0xb
     
    Code: 8b 00 f6 c4 01 75 19 2b 1d 0c d9 3f 02 c1 fb 05 c1 e3 0c 8d
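
The call trace above can be read mechanically. A minimal sketch, assuming the frame format shown in this oops (address in `[<...>]`, then `symbol+0xoffset/0xsize`, then an optional module name in brackets); the names `FRAME_RE` and `parse_call_trace` are illustrative.

```python
import re

# Matches frames such as " [<82b56b28>] nfs3svc_decode_writeargs+0xf8/0x149 [nfsd]"
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{8})>\]\s+"
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

OOPS_TRACE = """\
 [<82b56b28>] nfs3svc_decode_writeargs+0xf8/0x149 [nfsd]
 [<82b56a30>] nfs3svc_decode_writeargs+0x0/0x149 [nfsd]
 [<82b4c5a9>] nfsd_dispatch+0x6f/0x162 [nfsd]
 [<82b0e5f1>] svc_process+0x323/0x55f [sunrpc]
 [<82b4c3d5>] nfsd+0x1c6/0x32b [nfsd]
 [<82b4c20f>] nfsd+0x0/0x32b [nfsd]
 [<021041f1>] kernel_thread_helper+0x5/0xb
"""


def parse_call_trace(text):
    """Return one dict per stack frame found in `text`, innermost frame first."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append({
                "addr": int(m.group("addr"), 16),
                "symbol": m.group("symbol"),
                "offset": int(m.group("offset"), 16),
                "size": int(m.group("size"), 16),
                "module": m.group("module"),  # None for built-in kernel code
            })
    return frames


frames = parse_call_trace(OOPS_TRACE)
for f in frames:
    print(f"{f['symbol']}+{f['offset']:#x} ({f['module'] or 'kernel'})")
```

The innermost frame is nfs3svc_decode_writeargs, and the oops reports EIP in page_address with eax = 0. The first code bytes, 8b 00, decode as a load through eax, so one plausible reading (not confirmed here) is that the WRITE argument decoder passed a NULL page pointer to page_address().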
     
    [root@localhost root]#
     
    eth0 Link encap:Ethernet HWaddr 00:0C:F1:DC:A0:A6
              inet addr:192.168.1.61 Bcast:192.168.1.255 Mask:255.255.255.0
              inet6 addr: fe80::20c:f1ff:fedc:a0a6/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
              RX packets:3344 errors:0 dropped:0 overruns:0 frame:0
              TX packets:1591 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:322870 (315.3 Kb) TX bytes:331102 (323.3 Kb)
     
    eth1 Link encap:Ethernet HWaddr 00:0C:F1:DC:A0:A3
              inet addr:13.0.0.26 Bcast:13.0.0.255 Mask:255.255.255.0
              inet6 addr: fe80::20c:f1ff:fedc:a0a3/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST MTU:16000 Metric:1
              RX packets:5259815 errors:514953 dropped:514953 overruns:508817 frame:0
              TX packets:5172108 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:4047084198 (3859.6 Mb) TX bytes:2817892640 (2687.3 Mb)
              Base address:0x9c00 Memory:fc9e0000-fca00000
     
    lo Link encap:Local Loopback
              inet addr:127.0.0.1 Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING MTU:16436 Metric:1
              RX packets:1122 errors:0 dropped:0 overruns:0 frame:0
              TX packets:1122 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:1088676 (1.0 Mb) TX bytes:1088676 (1.0 Mb)
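
The eth1 receive counters above give a rough loss rate. A small sketch, assuming the `packets` counter excludes errored frames (this depends on the driver and is not stated in the post); `parse_rx_counters` is an illustrative helper.

```python
import re


def parse_rx_counters(line):
    """Return the name:value counters from an ifconfig 'RX packets:...' line."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", line)}


rx = parse_rx_counters(
    "RX packets:5259815 errors:514953 dropped:514953 overruns:508817 frame:0"
)

# ASSUMPTION: frames offered to the NIC = good packets + errored frames.
loss = rx["errors"] / (rx["packets"] + rx["errors"])
overrun_share = rx["overruns"] / rx["errors"]

print(f"RX loss: {loss:.1%}, overruns as share of errors: {overrun_share:.1%}")
```

With these numbers, roughly 8.9% of received frames were lost, and about 98.8% of the errors were overruns.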

    [root@localhost root]# /usr/sbin/nfsstat -a
    Server packet stats:
    packets    udp        tcp        tcpconn
    5172374    5172369    0          0
    Server rpc stats:
    calls      badcalls   badauth    badclnt    xdrcall
    5172362    0          0          0          0
    Server reply cache:
    hits       misses     nocache
    2          3941601    1230770
    Server file handle cache:
    lookup     anon       ncachedir  ncachedir  stale
    0          0          0          0          0
    Server nfs v2:
    null         getattr      setattr      root         lookup       readlink
    0         0% 0         0% 0         0% 0         0% 0         0% 0         0%
    read         wrcache      write        create       remove       rename
    0         0% 0         0% 0         0% 0         0% 0         0% 0         0%
    link         symlink      mkdir        rmdir        readdir      fsstat
    0         0% 0         0% 0         0% 0         0% 0         0% 0         0%

    Server nfs v3:
    null         getattr      setattr      lookup       access       readlink
    0         0% 0         0% 0         0% 49        0% 0         0% 0         0%
    read         write        create       mkdir        symlink      mknod
    1230484  23% 3940984  76% 301       0% 0         0% 0         0% 0         0%
    remove       rmdir        rename       link         readdir      readdirplus
    314       0% 0         0% 0         0% 0         0% 0         0% 97        0%
    fsstat       fsinfo       pathconf     commit
    137       0% 3         0% 0         0% 0         0%

    [root@localhost root]# cat /proc/net/rpc/nfsd
    rc 2 3941601 1230770
    fh 0 0 0 0 0
    io 1889770924 3191644084
    th 128 174137 159.717 65.060 22.263 14.312 14.197 13.116 14.113 14.803 12.698 223.272
    ra 256 1230468 0 0 0 0 0 0 0 0 0 19
    net 5172374 5172369 0 0
    rpc 5172362 0 0 0 0
    proc2 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    proc3 22 0 0 0 49 0 0 1230484 3940984 301 0 0 0 314 0 0 0 0 97 137 3 0 0
    proc4 2 0 0
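
The `proc3` line above carries the same per-procedure counters that nfsstat prints. A minimal sketch that maps it onto the NFSv3 procedure names in protocol order (RFC 1813); `parse_proc3` and `percentages` are illustrative helpers, not part of any tool.

```python
# NFSv3 procedures in protocol order (procedure numbers 0..21, RFC 1813).
NFS3_PROCS = [
    "null", "getattr", "setattr", "lookup", "access", "readlink",
    "read", "write", "create", "mkdir", "symlink", "mknod",
    "remove", "rmdir", "rename", "link", "readdir", "readdirplus",
    "fsstat", "fsinfo", "pathconf", "commit",
]

PROC3_LINE = "proc3 22 0 0 0 49 0 0 1230484 3940984 301 0 0 0 314 0 0 0 0 97 137 3 0 0"


def parse_proc3(line):
    """Map a 'proc3 <n> <count>...' line to {procedure name: call count}."""
    fields = line.split()
    if fields[0] != "proc3":
        raise ValueError("not a proc3 line")
    count = int(fields[1])
    values = [int(v) for v in fields[2:2 + count]]
    if count != len(NFS3_PROCS) or len(values) != count:
        raise ValueError("unexpected number of NFSv3 counters")
    return dict(zip(NFS3_PROCS, values))


def percentages(counts):
    """Whole-number share of each procedure, using integer division."""
    total = sum(counts.values())
    return {name: 100 * n // total for name, n in counts.items()}


counts = parse_proc3(PROC3_LINE)
pct = percentages(counts)
print(f"read: {counts['read']} ({pct['read']}%), write: {counts['write']} ({pct['write']}%)")
```

Integer division reproduces the 23% read and 76% write figures in the nfsstat output, and the counters sum to 5172369, the UDP packet count on the `net` line.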
     
    [root@localhost root]# cat /proc/cpuinfo
    processor : 0
    vendor_id : GenuineIntel
    cpu family : 15
    model : 2
    model name : Intel(R) Pentium(R) 4 CPU 3.20GHz
    stepping : 9
    cpu MHz : 3192.790
    cache size : 512 KB
    physical id : 0
    siblings : 2
    fdiv_bug : no
    hlt_bug : no
    f00f_bug : no
    coma_bug : no
    fpu : yes
    fpu_exception : yes
    cpuid level : 2
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat
    pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid
    bogomips : 6324.22
     
    processor : 1
    vendor_id : GenuineIntel
    cpu family : 15
    model : 2
    model name : Intel(R) Pentium(R) 4 CPU 3.20GHz
    stepping : 9
    cpu MHz : 3192.790
    cache size : 512 KB
    physical id : 0
    siblings : 2
    fdiv_bug : no
    hlt_bug : no
    f00f_bug : no
    coma_bug : no
    fpu : yes
    fpu_exception : yes
    cpuid level : 2
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat
    pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid
    bogomips : 6373.37

    [root@localhost root]# cat /proc/ioports
    0000-001f : dma1
    0020-0021 : pic1
    0040-005f : timer
    0060-006f : keyboard
    0070-0077 : rtc
    0080-008f : dma page reg
    00a0-00a1 : pic2
    00c0-00df : dma2
    00f0-00ff : fpu
    01f0-01f7 : ide0
    02f8-02ff : serial
    03c0-03df : vga+
    03f6-03f6 : ide0
    03f8-03ff : serial
    0cf8-0cff : PCI conf1
    9000-9fff : PCI Bus #02
      9c00-9c1f : 0000:02:01.0
        9c00-9c1f : e1000
    ac00-ac3f : 0000:03:08.0
      ac00-ac3f : e100
    b000-b07f : 0000:03:07.0
      b000-b07f : sata_promise
    b400-b40f : 0000:03:07.0
      b400-b40f : sata_promise
    b800-b8ff : 0000:03:06.0
    bc00-bc3f : 0000:03:07.0
      bc00-bc3f : sata_promise
    c800-c81f : 0000:00:1f.3
    cc00-cc1f : 0000:00:1d.0
      cc00-cc1f : uhci_hcd
    d000-d01f : 0000:00:1d.1
      d000-d01f : uhci_hcd
    d400-d41f : 0000:00:1d.2
      d400-d41f : uhci_hcd
    d800-d81f : 0000:00:1d.3
      d800-d81f : uhci_hcd
    dc00-dc0f : 0000:00:1f.2
      dc00-dc0f : libata
    e000-e003 : 0000:00:1f.2
      e000-e003 : libata
    e400-e407 : 0000:00:1f.2
      e400-e407 : libata
    e800-e803 : 0000:00:1f.2
      e800-e803 : libata
    ec00-ec07 : 0000:00:1f.2
      ec00-ec07 : libata
    ffa0-ffaf : 0000:00:1f.1
      ffa0-ffa7 : ide0
      ffa8-ffaf : ide1
     
    [root@localhost root]# cat /proc/iomem
    00000000-0009fbff : System RAM
    0009fc00-0009ffff : reserved
    000a0000-000bffff : Video RAM area
    000c0000-000c7fff : Video ROM
    000f0000-000fffff : System ROM
    00100000-7fe2ffff : System RAM
      00100000-002a2fff : Kernel code
      002a3000-003542ff : Kernel data
    7fe30000-7fe414a9 : ACPI Non-volatile Storage
    7fe414aa-7ff2ffff : System RAM
    7ff30000-7ff3ffff : ACPI Tables
    7ff40000-7ffeffff : ACPI Non-volatile Storage
    7fff0000-7fffffff : reserved
    80000000-800003ff : 0000:00:1f.1
    f8000000-fbffffff : 0000:00:00.0
    fc900000-fc9fffff : PCI Bus #02
      fc9e0000-fc9fffff : 0000:02:01.0
        fc9e0000-fc9fffff : e1000
    fd000000-fdffffff : 0000:03:06.0
    feaa0000-feabffff : 0000:03:07.0
      feaa0000-feabffff : sata_promise
    feafd000-feafdfff : 0000:03:08.0
      feafd000-feafdfff : e100
    feafe000-feafefff : 0000:03:07.0
      feafe000-feafefff : sata_promise
    feaff000-feafffff : 0000:03:06.0
    febffc00-febfffff : 0000:00:1d.7
      febffc00-febfffff : ehci_hcd
    fecf0000-fecf0fff : reserved
    fed20000-fed9ffff : reserved
     
    [root@localhost root]# cat /proc/scsi/scsi
    Attached devices:
    Host: scsi0 Channel: 00 Id: 00 Lun: 00
      Vendor: ATA Model: WDC WD2500JD-32H Rev: 1.02
      Type: Direct-Access ANSI SCSI revision: 05
    Host: scsi1 Channel: 00 Id: 00 Lun: 00
      Vendor: ATA Model: WDC WD2500JD-32H Rev: 1.02
      Type: Direct-Access ANSI SCSI revision: 05
    Host: scsi2 Channel: 00 Id: 00 Lun: 00
      Vendor: ATA Model: WDC WD2500JD-32H Rev: 1.02
      Type: Direct-Access ANSI SCSI revision: 05
    Host: scsi3 Channel: 00 Id: 00 Lun: 00
      Vendor: ATA Model: WDC WD2500JD-32H Rev: 1.02
      Type: Direct-Access ANSI SCSI revision: 05
     
     
    [root@localhost root]# cat /etc/fstab
    LABEL=/ / xfs defaults 1 1
    LABEL=/boot /boot ext3 defaults 1 2
    /dev/md0 /Raid0 xfs noatime 1 2
    none /dev/pts devpts gid=5,mode=620 0 0
    none /dev/shm tmpfs defaults 0 0
    none /proc proc defaults 0 0
    none /sys sysfs defaults 0 0
    /dev/sda3 swap swap defaults 0 0
    /dev/cdrom /mnt/cdrom udf,iso9660 noauto,owner,kudzu,ro 0 0
    /dev/fd0 /mnt/floppy auto noauto,owner,kudzu 0 0
     
    [root@localhost root]# cat /etc/exports
    /Raid0 *(no_wdelay,insecure,rw,async)
     
    -Bill Vaughan
    bvaughan@mindspring.com

    
    
    
    
    




    • application/octet-stream attachment: nfs


