
Bug#248271: nscd freezes when used with libnss-ldap on busy server.



Subject: nscd freezes when used with libnss-ldap on busy server.
Package: nscd
Version: N/A; reported 2004-05-10
Severity: critical
Justification: breaks the whole system

IMHO what I describe here is a bug in nscd when it is used 
together with libnss-ldap. 
I've seen some old bug reports for Debian and Red Hat 
describing similar problems, along with suggestions not to 
use nscd and libnss-ldap together.

We have a mail server (postfix 2.1.0) running Debian 
Woody (all security updates applied), kernel 2.4.26 
(we had the same problem with kernel 2.4.25).

We use libnss-ldap with a local slapd server
(a replica of our primary LDAP server)
for users' accounts.

so:

test: ~$head -3 /etc/nsswitch.conf 
passwd: 	ldap files
group: 		ldap files
shadow: 	ldap files

and 

test: ~$cat /etc/libnss-ldap.conf 
host 127.0.0.1
base ....
ldap_version 3

The problem is that, with nscd used for password caching 
(default configuration), everything works fine most of the time, 
but the machine sometimes hangs.

Tests on a separate test server have shown that
it can happen (randomly) when postfix has to deliver mail
to aliases that expand to many (more than 100) local users.

The server then hangs (no connection is possible, even locally), 
but if we are already logged in we can see the following behaviour:

Everything works except name-service lookups.
There is no response from ls -l, ps -ef, or any other program
that needs account information.
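
A minimal sketch of how this state could be detected without blocking the
shell that runs the check. It assumes a `timeout` utility such as the one in
current GNU coreutils (it may not be available on an old release), and the
function name `probe` is made up for illustration.

```shell
# probe SECONDS COMMAND...: run COMMAND under a time limit and report
# whether it finished on its own or had to be killed.
probe() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    # GNU timeout exits with 124 when the time limit was reached.
    if [ "$status" -eq 124 ]; then
        echo "HUNG"
    else
        echo "DONE (exit $status)"
    fi
}

# Example: look up uid 0 through NSS. If the lookup never returns,
# this prints HUNG after 5 seconds.
probe 5 getent passwd 0
```

The same function can wrap any other command that needs account information.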

test: ~$strace ls -l
fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 1), ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000
write(1, "total 12\n", 9total 12
)               = 9
socket(PF_UNIX, SOCK_STREAM, 0)         = 3
connect(3, {sin_family=AF_UNIX, path="/var/run/.nscd_socket"}, 110) = 0
write(3, "\2\0\0\0\1\0\0\0\2\0\0\0", 12) = 12
write(3, "0\0", 2)                      = 2
read(3,

and nothing until Ctrl+C.
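
For readers who want to interpret the trace: the 12 bytes written to the
socket look like the nscd request header, three 32-bit integers. The sketch
below decodes them, assuming a little-endian host such as the i386 machine
here. The field names and the reading of type 1 as a passwd lookup by uid
(GETPWBYUID) come from glibc's nscd client header as far as I know, so treat
them as an assumption rather than something this report establishes.

```shell
# Decode the 12-byte header from the strace output as three 32-bit
# integers in host byte order (little-endian assumed).
decode_nscd_header() {
    printf '\2\0\0\0\1\0\0\0\2\0\0\0' |
        od -An -td4 |
        awk '{ print "version=" $1, "type=" $2, "key_len=" $3 }'
}

decode_nscd_header
```

The key length of 2 matches the following write of "0\0", so ls -l appears
to be asking nscd for the passwd entry of uid 0 and never getting a reply.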

lsof still works when run with the options -b -l -n
(which skip the uid-to-login-name lookup, among other things), and
lsof -b -l -n | grep .nscd_socket | wc -l

gives 121 open files, and

test: ~$cat /proc/sys/fs/file-nr                       
4931	2562	52425

so the number of open files should not be the problem.
(Both for  i in `pgrep nscd` ; do ls /proc/$i/fd/ | wc -l ; done
and for  i in `pgrep slapd` ; do ls /proc/$i/fd/ | wc -l ; done
show that neither process has too many open files, so this is not related to
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=246057.)
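
The arithmetic behind the file-nr figures above can be sketched as follows.
On 2.4 kernels the three fields are the allocated file handles, the allocated
but unused handles, and the system-wide maximum, so the handles really in use
are the first field minus the second. The function name `file_nr_usage` is
made up for illustration.

```shell
# file_nr_usage "ALLOCATED FREE MAX": print the handles in use and the
# system-wide limit, given one line of /proc/sys/fs/file-nr (2.4 layout).
file_nr_usage() {
    # Split the single argument on whitespace (spaces or tabs).
    set -- $1
    echo "in_use=$(( $1 - $2 )) max=$3"
}

# Values taken from the output above.
file_nr_usage "4931 2562 52425"
```

With these values only 2369 of 52425 handles are in use.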

The only way to recover the machine is to kill nscd or slapd.

So far, the only way we have found to keep the server stable is NOT to use
nscd. 

I think this bug could also be considered a security problem
since it may lead to a local DoS.

	Ciao and thanks.

		Pietro

	
-- System Information
Debian Release: 3.0
Architecture: i386
Kernel: Linux test 2.4.26 #3 SMP Tue Apr 27 15:53:14 CEST 2004 i686 unknown
Locale: LANG=POSIX, LC_CTYPE=POSIX


