Re: Problem with SMB mounts and Kernel 2.6.x

From: Markus Wollny (Markus.Wollny_at_computec.de)
Date: 01/24/05

    Date: Mon, 24 Jan 2005 11:08:20 +0100
    To: <debian-user@lists.debian.org>
    
    

    Can nobody reproduce this (see below) or offer any advice? Should I
    file a bug?

    > Hi!
    >
    > I haven't found a current bug report yet, but before filing
    > one myself, I would like to confirm that it's not some exotic
    > misconfiguration on my part. I am running several Debian
    > boxes, which access some SMB-shares on other Linux-servers
    > and one Win2k-server. When receiving a certain amount of
    > load, sooner or later (mostly the former, i.e. within a few
    > minutes, sometimes seconds) the mounted smb-fileshares appear
    > to hang; if that is the case, neither top nor ps ax can run
    > without hanging as well. When trying to restart smbd, I
    > receive the notice that a process with a given id cannot be
    > terminated. Trying to kill -9 this process doesn't help, I
    > have to reboot the system in order to be able to unmount the
    > fileshare. Reading files on the affected shares is possible
    > without any hindrance, but writing involves a 30 second
    > wait (precisely 30 seconds). When I try to overwrite a file,
    > I get an I/O error and the resulting file is empty:
    >
    > server-01:/path/to/share-01# date ; echo 1234 > test.txt ; date; cat test.txt ; date ; echo 2345 > test.txt ; date; cat test.txt; date
    > Di Jan 18 12:07:34 CET 2005
    > Di Jan 18 12:08:04 CET 2005
    > 1234
    > Di Jan 18 12:08:04 CET 2005
    > -bash: test.txt: Eingabe-/Ausgabefehler
    > Di Jan 18 12:08:34 CET 2005
    > Di Jan 18 12:08:34 CET 2005
    >
    > touch gives me an I/O-error as well - after a 30 second wait period.
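
For anyone who wants to try reproducing the write delay, here is a minimal sketch of the same probe as a shell function. The name time_write is only an illustration, and the sketch assumes a date that supports +%s (GNU and BSD date do). It writes a single line to the given file and reports whether the write succeeded and how many whole seconds it took.

```shell
#!/bin/sh
# time_write FILE (hypothetical helper): write one line to FILE and print
# "<ok|failed> <seconds>s <file>". Assumes `date +%s` is available.
time_write() {
    target=$1
    start=$(date +%s)
    # The braces make 2>/dev/null also cover the shell's own error message
    # when the output redirection itself fails (as with the I/O error above).
    if { echo "probe $start" > "$target"; } 2>/dev/null; then
        status=ok
    else
        status=failed
    fi
    end=$(date +%s)
    echo "$status $((end - start))s $target"
}
```

Called as `time_write /path/to/share-01/test.txt`, it should show on an affected client whether the elapsed time is close to the 30 seconds described above.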
    >
    > The wait-period and I/O-errors apply to any fileshare which
    > is hosted on a linux box (tried with Samba 2.2.7a-SuSE and
    > Samba 3.0.10-Debian as hosts); there's no problem of that
    > kind when accessing fileshares on the Win2k server box. The
    > hanging processes problem under load however does affect the
    > Win2k-hosted share. I think that all of these issues are correlated.
    >
    > In syslog I find the following entries:
    > localhost kernel: smb_add_request: request [ce132e60, mid=6328] timed out! (lots of these)
    > localhost kernel: smb_trans2: invalid data, disp=0, cnt=0, tot=0, ofs=0 (lots of these as well)
    > localhost kernel: smb_get_length: Invalid NBT packet, code=fe
    > localhost kernel: smb_get_length: Invalid NBT packet, code=ff
    > localhost kernel: smb_receive_header: short packet: 0
    > localhost kernel: smb_receive_header: long packet: 65628
    > localhost kernel: smb_proc_readX_data: offset is larger than SMB_READX_MAX_PAD or negative!
    > localhost kernel: smb_proc_readX_data: -59 > 64 || -59 < 0
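
To compare how often each of these messages shows up on different kernels, a rough tally per function name may help. The following is only a sketch: count_smb_messages is a made-up name, and the log file to pass in (for example /var/log/syslog) is an assumption that depends on the local syslog setup.

```shell
#!/bin/sh
# count_smb_messages LOGFILE (hypothetical helper): count kernel smb_*
# messages in LOGFILE, grouped by the function name that logged them.
count_smb_messages() {
    grep 'kernel: smb_' "$1" |
        # Keep only the function name, e.g. "smb_add_request".
        sed 's/.*kernel: \(smb_[A-Za-z0-9_]*\):.*/\1/' |
        sort | uniq -c | sort -rn
}
```

Running `count_smb_messages /var/log/syslog` prints one line per function with its count, most frequent first.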
    >
    > I have tested with kernel 2.6.8-1-686-smp from sarge; the
    > test system was a fresh sarge install. Downgrading to
    > kernel-image-2.4.27-2-686-smp (via apt-get install) resolved
    > the issue completely, as did upgrading to
    > 2.6.10-1-686-smp from unstable (but I don't feel comfortable
    > enough with an "unstable" kernel on a production system).
    > Tested with both smbd version 3.0.7-Debian and 3.0.10-Debian.
    >
    > I have googled for the timeout-issue to some extent; some
    > suggested that the CIFS-code in 2.6 up to 2.6.9 was broken
    > regarding the unix extensions, but a fix would be included in
    > 2.6.10; using "unix extensions=no" in smb.conf was suggested.
    > I tried this smb.conf-setting, but the problem persisted.
    > Finally I got fed up with 2.6.8 and downgraded to 2.4 - and
    > that resolved it.
    >
    > I am still occasionally getting
    > Jan 21 13:43:06 localhost kernel: smb_trans2_request: result=-104, setting invalid
    > Jan 21 13:43:06 localhost kernel: smb_retry: successful, new pid=1109, generation=2
    > Jan 21 14:01:56 localhost kernel: smb_trans2_request: result=-104, setting invalid
    > Jan 21 14:01:56 localhost kernel: smb_retry: successful, new pid=1109, generation=3
    > in syslog, which is worrying me a bit, but I haven't noticed
    > anything bad in the actual operation of the servers after the
    > downgrade. I've yet to have a single smb_add_request: request
    > [whatever] timed out! with kernel 2.4.27-2-686-smp.
    >
    > Here are my smb.conf and fstab entries; I have replaced any
    > identifiable information with dummy entries.
    >
    > smb.conf:
    > [global]
    > workgroup = MYWRKGRP
    > netbios name = DEBIAN-01
    > server string = Debian Testserver
    > security = SHARE
    > encrypt passwords = Yes
    > map to guest = Bad User
    > null passwords = Yes
    > log level = 1
    > syslog = 0
    > time server = Yes
    > unix extensions = no
    > socket options = SO_KEEPALIVE IPTOS_LOWDELAY TCP_NODELAY
    > os level = 2
    > default service = www
    > guest account = myuser
    > [www]
    > path = /var/www
    > read only = Yes
    > guest only = Yes
    > guest ok = Yes
    > hosts allow = All
    > nt acl support = No
    > hide dot files = No
    >
    > fstab:
    > //WINBOX/d$ /var/www/WINBOX smbfs password=xxxx,username=Winuser,workgroup=MYWRKGRP,uid=500,gid=100,fmask=666,dmask=777,rw 0 0
    >
    > I haven't yet found the time for more thorough tests with
    > kernel 2.6.10, but I shall be happy to do more testing
    > following your instructions if necessary. For my needs,
    > this bug (if it does not turn out to be a misconfiguration)
    > is quite critical; I think it should definitely be fixed
    > before sarge becomes stable.
    >
    > Thank you very much for your help!
    >
    > Kind regards
    >
    > Markus

