Re: NFS has a problem



Mark Hobley wrote:
Dances With Crows <danceswithcrows@xxxxxxx> wrote:

If machine N has an NFS filesystem mounted, and the NFS server providing
that filesystem barfs, all the processes on N that were using that filesystem will get stuck in state D the next time they try to access that filesystem.
(Unless machine N mounted the filesystem with "soft", of course.)
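
A minimal sketch for spotting such stuck processes on a Linux client, assuming /proc is mounted and that the third field of /proc/<pid>/stat holds the process state. The function names parse_stat and processes_in_state_d are made up for illustration; they are not part of any tool mentioned in this thread.

```python
import os


def parse_stat(stat_text):
    """Return (pid, command, state) from the text of a /proc/<pid>/stat file."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split on the first "(" and the last ")".
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    command = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, command, state


def processes_in_state_d(proc_root="/proc"):
    """List (pid, command) for processes in uninterruptible sleep (state D)."""
    if not os.path.isdir(proc_root):
        return []  # no procfs here (assumption: Linux-style /proc layout)
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, command, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were looking, or is unreadable
        if state == "D":
            stuck.append((pid, command))
    return stuck


if __name__ == "__main__":
    for pid, command in processes_in_state_d():
        print(pid, command)
```

Run on the client while the server is unreachable, this should list whatever is blocked on the mount. It only reports the state; it cannot unstick anything.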

Hmmm, the servers are running Debian stable, and have been providing NFS services for several years now, and have been working fine with Debian stable clients.

The Gentoo machine is a new development on the network, with the NFS client becoming operational at the beginning of this year.

Following a kernel upgrade last week, NFS now has a problem.

I can navigate round the mounted filesystem, changing directories, etc, but if I try to cat a file, sometimes I get no output, and other times the process gets stuck in state D.

When I try to unmount the NFS filesystem, I get an error:

umount: /volumes/vol3a: device is busy

I stop and restart the nfs services on the server side, but client processes remain in state D on the Gentoo machine.

I examine the NFS server side dmesg:

nfsd: last server has exited
nfsd: unexporting all filesystems
RPC: failed to contact portmap (errno -5).
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period

I don't know what the "RPC: failed to contact portmap (errno -5)" message means; portmap is running, and my other NFS clients are working fine.

On the Gentoo client side, dmesg reveals:

RPC: Registered udp transport module.
RPC: Registered tcp transport module.
nfs: server neptune not responding, still trying
nfs: server neptune not responding, still trying
nfs: server neptune OK
nfs: server neptune OK

ps reveals that portmap is running on the client side.

I am running the nfs-kernel-server on the server side.

I suspect that the new kernel version has introduced some incompatibility with existing NFS servers, but I don't know how to go about troubleshooting or diagnosing this.


This sounds hauntingly familiar from long-ago experiences with NFS: buffer overruns, UDP packet sizes, something like that. Did NFS go TCP as well? If so, there have been issues with larger window sizes on TCP.

I am willing to bet this is at that sort of level.
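
One way to test that guess is to look at which transport and transfer sizes the client actually ended up with after the kernel upgrade. The sketch below is only that, a sketch: it assumes the client lists its NFS mounts in /proc/mounts with comma-separated options such as proto=, rsize= and wsize=, and the helper names and the "neptune:/export" sample line are invented for illustration (only the server name and mount point come from the post).

```python
import os


def nfs_mount_options(mounts_text):
    """Map each NFS mount point to a dict of its mount options."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected layout: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = {}
        for item in fields[3].split(","):
            key, _, value = item.partition("=")
            options[key] = value  # bare flags such as "hard" get an empty value
        result[fields[1]] = options
    return result


def summarize(options):
    """Pick out the settings relevant to the UDP/TCP and size question."""
    return {
        "proto": options.get("proto", "unknown"),
        "rsize": options.get("rsize", "unknown"),
        "wsize": options.get("wsize", "unknown"),
        "soft": "soft" in options,
    }


if __name__ == "__main__":
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as handle:
            text = handle.read()
    else:
        # Hypothetical sample line, used only when /proc/mounts is absent.
        text = ("neptune:/export /volumes/vol3a nfs "
                "rw,vers=3,rsize=32768,wsize=32768,hard,proto=udp 0 0\n")
    for mount_point, options in nfs_mount_options(text).items():
        print(mount_point, summarize(options))
```

Comparing this output between the Gentoo client and one of the working Debian clients would show whether the new kernel picked a different protocol or different read/write sizes.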

Mark.
