"CIFS VFS: server not responding" with some client/server combinations



I have the following, very strange problem, which I'll try to describe now.

I've already sent this to the samba mailing list (about a week ago),
but haven't got a reply. As it might be kernel related and others
may be affected too, I decided to send it to this list, too.

A Samba server running Debian Etch with latest updates.

Different clients one running Etch with latest updates, too. The other one runs
either Ubuntu (don't know which version because it's not my installation, but quite recent) or
the older Debian 3 but with a recent vanilla kernel (either 2.6.20.1 or 2.6.22.10).

Ok. If I mount a share from the server via cifs on either ubuntu or debian 3.0 and do a simple
find /sharemount
everything seems to be okay, at first. after some lines (didn't count them, but the amount seems
to vary from try to try and with some I mean maybe some 100) it starts to stuck. Sometimes for 1 second
sometimes more and quite often I get "CIFS VFS: server not responding".
together with "CIFS VFS: No response for cmd 50 mid xxx" The xxx number seems to vary.

The problem seems to be triggered by any operation which touches lots of files quickly. E.g.
by copying source file directories onto or from it or simply by find.

Copying single larger files (I generated a 100MB file with dd) don't seems to hurt.

Interestinly -- this is where the really strange part comes -- it seems to work on the client which
has debian Etch installed.

I've tried the following combinations

System | Success?
--------------------------------------+---------
Ubuntu with vmlinuz-2.6.20-16-generic | NO
on Client 1 |
--------------------------------------+---------
Debian 3.0 2.6.20.1 Client 1 | NO
--------------------------------------+---------
Debian 3.0 2.6.22.10 Client 1 | NO
--------------------------------------+---------
Debian 4.0 with 2.6.18-5-486 2 | YES (Why?)
--------------------------------------+---------
Debian 3.0 (don't remember the kernel | NO
but something > 2.6.10) on Client 3 |

Well. You see, I've tried lots of combinations. I don't know what's different with Etch on the client,
but either the problem is very widespread across various kernel versions or the problem is server related.

The server side output in the logfile is the following:

[2007/10/12 21:03:41, 1] smbd/service.c:close_cnum(1150)
192.168.0.9 (192.168.0.9) closed connection to service test
[2007/10/12 21:03:41, 1] smbd/service.c:make_connection_snum(950)
192.168.0.9 (192.168.0.9) connect to service test initially as user foobar (uid=10000, gid=10000) (pid 16721)

I tried to increase the loglevel to 5 but the logfile gets flooded with other messages so I won't post
it here (Very long!). But I can put it onto my webspace and post an URL if you think it helps.

Before I forget:
Samba Version 3.0.24 on the server, according to smbd -V.
As mount helper I use mount.cifs, compiled from samba-3.0.26a.

The fstab lines used to mount the shares look like that:
//192.168.0.12/gstorage /cifs/gstorage/ cifs noauto,credentials=/root/gstorage.cred 0 0

Additionally, I can't reproduce the problem with a Win2k client by e.g. using the search utility, like on Linux.

Very interesting is the following: If I experience a hang and produce some other network traffic on this client
by either pinging another host or typing something into an ssh session it continues immediately.
(E.g. if ping sends out some packets every second, the output will continue once a second.)
Note that I've tried this only at one client/kernel, don't know if the others behave the same way

If I should report this to the kernel mailing list, let me know, but you might know better where it belongs.

mfg Wiesner

PS: Please CC me, as I'm not on the list.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • NFS problems with through 2.5.x to 2.6.0-test9
    ... When the server is running the ... kernel, as a client the 2.6 series seem to work perfectly, excluding ... Interesting problem arose when I attempted switch the server's kernel to ... with and without nfsv4 support compiled in (was considering testing it at ...
    (Linux-Kernel)
  • Re: problems with latest smbfs changes on 2.4.34 and security backports
    ... something different on my samba server or similar, ... the client is an IBM NetworkStation 1000 (a PowerPC ... seems a security problem and I don't think is desirable, ... It is possible that you booted a wrong kernel during one of your tests. ...
    (Linux-Kernel)
  • Re: Using COM OutProc server on WinCE
    ... nk.exe (kernel process) and you can't popup UI from nk.exe in CE 6.0. ... Microsoft Corporation ... to the client so I guess in that scenario services.exe won't be helpful. ... Server (you'd need SYSGEN_DCOM, see ...
    (microsoft.public.windowsce.app.development)
  • Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9
    ... Thanks to Chuck's help i finally decided to proceed to a git bisect and found the bad patch. ... What isn't quite clear to me is whether this commit causes your user- space server to start failing suddenly, or it causes the client to start sending the special non-standard time stamps in the SETATTR request. ... it would be helpful if you could run this test with a constant kernel version on one side while varying it on the other. ...
    (Linux-Kernel)
  • Extremely frustrated w/Debian
    ... I am having some major issues with Debian and Linux in general at the ... Here are the main details of the server: ... The Debian kernel was 2.4.18, ... detecting the SCSI controller. ...
    (alt.linux)