
Re: Problem with SMB mounts and Kernel 2.6.x



Can nobody reproduce this (see below) or give any advice? Shall I file a
bug?

> Hi!
> 
> I haven't found a current bug report yet, but before filing
> one myself, I would like to confirm that it's not some exotic
> misconfiguration on my part. I am running several Debian
> boxes which access SMB shares on other Linux servers
> and one Win2k server. Under a certain amount of
> load, sooner or later (mostly sooner, i.e. within a few
> minutes, sometimes seconds) the mounted SMB shares appear
> to hang; when that happens, neither top nor ps ax can run
> without hanging as well. When I try to restart smbd, I
> am told that a process with a given ID cannot be
> terminated. kill -9 on this process doesn't help; I
> have to reboot the system in order to be able to unmount the
> share. Reading files on the affected shares is possible
> without any hindrance, but writing involves a 30 second
> wait (precisely 30 seconds). When I try to overwrite a file,
> I get an I/O error and the resulting file is empty:
> 
> server-01:/path/to/share-01# date ; echo 1234 > test.txt ; date; cat test.txt ; date ; echo 2345 > test.txt ; date; cat test.txt; date
> Di Jan 18 12:07:34 CET 2005
> Di Jan 18 12:08:04 CET 2005
> 1234
> Di Jan 18 12:08:04 CET 2005
> -bash: test.txt: Eingabe-/Ausgabefehler
> Di Jan 18 12:08:34 CET 2005
> Di Jan 18 12:08:34 CET 2005
> 
> (German locale: "Di" is Tuesday, "Eingabe-/Ausgabefehler" is
> "Input/output error".)
> 
> touch gives me an I/O-error as well - after a 30 second wait period.
> 
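A minimal Python sketch of the same timing test, in case a repeatable version helps anyone trying to reproduce this. The function names and the default file name are made up for illustration, and the fsync call is an assumption about how to make sure the write has really reached the server before the clock stops:

```python
import os
import time


def timed_write(path, data):
    """Write data to path; return (seconds_elapsed, error_or_None)."""
    start = time.monotonic()
    error = None
    try:
        with open(path, "w") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to push the data out
    except OSError as exc:
        # Covers failures on open, write, fsync and close alike.
        error = exc
    return time.monotonic() - start, error


def probe(directory, name="test.txt"):
    """Create a file, then overwrite it, recording time, error and content."""
    path = os.path.join(directory, name)
    results = []
    for payload in ("1234\n", "2345\n"):
        elapsed, error = timed_write(path, payload)
        try:
            with open(path) as handle:
                content = handle.read()
        except OSError:
            content = None  # file could not be read back at all
        results.append((payload, elapsed, error, content))
    return results
```

Calling probe("/path/to/share-01") on an affected mount should show roughly 30 seconds for each step and an OSError plus empty content for the second one, if it behaves like the shell test above.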
> The wait period and I/O errors apply to any share which
> is hosted on a Linux box (tried with Samba 2.2.7a-SuSE and
> Samba 3.0.10-Debian as hosts); there's no problem of that
> kind when accessing shares on the Win2k server box. The
> hanging-processes problem under load, however, does affect the
> Win2k-hosted share. I think that all of these issues are related.
> 
> In syslog I find the following entries:
> localhost kernel: smb_add_request: request [ce132e60, mid=6328] timed out! (lots of these)
> localhost kernel: smb_trans2: invalid data, disp=0, cnt=0, tot=0, ofs=0 (lots of these as well)
> localhost kernel: smb_get_length: Invalid NBT packet, code=fe
> localhost kernel: smb_get_length: Invalid NBT packet, code=ff
> localhost kernel: smb_receive_header: short packet: 0
> localhost kernel: smb_receive_header: long packet: 65628
> localhost kernel: smb_proc_readX_data: offset is larger than SMB_READX_MAX_PAD or negative!
> localhost kernel: smb_proc_readX_data: -59 > 64 || -59 < 0
> 
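To put numbers on "lots of these", the kernel messages can be counted per reporting function. This is only a rough sketch: the regular expression is my guess at a pattern that fits the lines quoted above, and the function name summarize is made up for illustration:

```python
import re
from collections import Counter

# Matches e.g. "localhost kernel: smb_add_request: request [...] timed out!"
SMB_LINE = re.compile(r"kernel: (smb_\w+):")


def summarize(lines):
    """Count smbfs kernel messages per reporting function name."""
    counts = Counter()
    for line in lines:
        match = SMB_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Something like summarize(open("/var/log/syslog", errors="replace")) would then give the totals, assuming the kernel messages end up in /var/log/syslog on the box in question.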
> I have tested with kernel 2.6.8-1-686-smp from sarge; the
> test system was a fresh sarge install. Downgrading to
> kernel-image-2.4.27-2-686-smp (via apt-get install) resolved
> the issue completely, and so did upgrading to
> 2.6.10-1-686-smp from unstable (but I don't feel comfortable
> enough with an "unstable" kernel on a production system).
> Tested with both smbd version 3.0.7-Debian and 3.0.10-Debian.
> 
> I have googled for the timeout-issue to some extent; some
> suggested that the CIFS-code in 2.6 up to 2.6.9 was broken
> regarding the unix extensions, but a fix would be included in
> 2.6.10; using "unix extensions=no" in smb.conf was suggested.
> I tried this smb.conf-setting, but the problem persisted.
> Finally I got fed up with 2.6.8 and downgraded to 2.4 - and
> that resolved it.
> 
> I am still occasionally getting
> Jan 21 13:43:06 localhost kernel: smb_trans2_request: result=-104, setting invalid
> Jan 21 13:43:06 localhost kernel: smb_retry: successful, new pid=1109, generation=2
> Jan 21 14:01:56 localhost kernel: smb_trans2_request: result=-104, setting invalid
> Jan 21 14:01:56 localhost kernel: smb_retry: successful, new pid=1109, generation=3
> in syslog, which is worrying me a bit, but I haven't noticed
> anything bad in the actual operation of the servers after the
> downgrade. I've yet to have a single smb_add_request: request
> [whatever] timed out! with kernel 2.4.27-2-686-smp.
> 
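On the result=-104 lines: assuming smbfs follows the usual kernel convention of reporting failures as negative errno values (I have not checked the smbfs source), -104 would be ECONNRESET, i.e. the connection was reset by the peer, which would fit with smb_retry reconnecting right afterwards. A small sketch for decoding such lines; the table holds only a few Linux errno values and the function name is made up:

```python
import re

# A few Linux errno values; this table is deliberately incomplete.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    104: ("ECONNRESET", "Connection reset by peer"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

RESULT = re.compile(r"result=(-?\d+)")


def decode_result(line):
    """Return (name, description) for a negative result= code, else None."""
    match = RESULT.search(line)
    if not match:
        return None
    code = int(match.group(1))
    if code >= 0:
        return None  # not an error under the negative-errno assumption
    return LINUX_ERRNO.get(-code, ("unknown", "not in this table"))
```

The table is hard-coded rather than taken from Python's errno module so that the result does not depend on the platform the script is run on.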
> Here are my smb.conf and fstab entries; I have replaced any
> identifiable information with dummy entries.
> 
> smb.conf:
> [global]
>         workgroup = MYWRKGRP
>         netbios name = DEBIAN-01
>         server string = Debian Testserver
>         security = SHARE
>         encrypt passwords = Yes
>         map to guest = Bad User
>         null passwords = Yes
>         log level = 1
>         syslog = 0
>         time server = Yes
>         unix extensions = no
>         socket options = SO_KEEPALIVE IPTOS_LOWDELAY TCP_NODELAY
>         os level = 2
>         default service = www
>         guest account = myuser
> [www]
>         path = /var/www
>         read only = Yes
>         guest only = Yes
>         guest ok = Yes
>         hosts allow = All
>         nt acl support = No
>         hide dot files = No
> 
> fstab:
> //WINBOX/d$ /var/www/WINBOX smbfs password=xxxx,username=Winuser,workgroup=MYWRKGRP,uid=500,gid=100,fmask=666,dmask=777,rw 0 0
> 
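Since fstab fields are separated by whitespace, the option list must not contain any blanks, and a long entry like the one above is easy to damage when it gets wrapped or pasted. A rough sketch of a sanity check, with a made-up function name, that splits an entry into its fields and options and rejects one with too many fields:

```python
def parse_fstab_line(line):
    """Split one fstab entry into its fields and a dict of mount options."""
    fields = line.split()
    # device, mount point, type and options are required;
    # the dump and pass fields may be omitted and then count as 0.
    if not 4 <= len(fields) <= 6:
        raise ValueError("expected 4 to 6 fields, got %d" % len(fields))
    fields += ["0"] * (6 - len(fields))
    device, mountpoint, fstype, raw_options, dump, passno = fields
    options = {}
    for item in raw_options.split(","):
        key, sep, value = item.partition("=")
        options[key] = value if sep else True  # flags like "rw" have no value
    return {
        "device": device,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": options,
        "dump": int(dump),
        "passno": int(passno),
    }
```

A stray blank inside the options (for example after "fmask=") yields seven fields and therefore a ValueError, instead of a silently truncated option list.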
> I haven't yet found the time for more thorough tests with
> kernel 2.6.10, but I shall be happy to do some more testing
> according to your instructions if necessary. For my needs,
> this bug (if it doesn't turn out to be a misconfiguration,
> that is) is quite critical, i.e. I think it should
> definitely be fixed before sarge becomes stable.
> 
> Thank you very much for your help!
> 
> Kind regards
> 
>    Markus



