
LDAP server scaling problem?



I am investigating an LDAP server failure where the LDAP server would
stop working at random times and cause all kinds of problems on the
clients.  There are more than 300 clients on the site.

I remember Jose mentioning problems with slapd running out of file
descriptors in Spain, and started out investigating to see if this was
the problem here too.  I added two munin plugins to monitor the number
of open files in slapd and if slapd was working or not.  The plugins are
included below.

Google searches led me to
<URL: http://www.openldap.org/lists/openldap-technical/201106/msg00031.html >
which reports a similar failure and mentions that adding an idle
timeout value might solve it.  I'm currently trying with "idletimeout
60" to see if it solves the problem.  It would do so by making sure
idle clients are disconnected after a while, so that they do not keep
accumulating file descriptors for the lifetime of the LDAP server
process.

If I got the details right, a Linux process can by default only have
1024 files open.  This can be adjusted using 'ulimit -n', but I am not
sure if it will work with select() calls because of a hardcoded constant
(FD_SETSIZE, I believe) in the system header files.
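
To check which limit actually applies to a running process, something
like the sketch below could be used.  It assumes a Linux /proc file
system; the helper name soft_nofile_limit is made up for illustration,
and the final lines assume pidof is available and slapd is running.

```shell
#!/bin/sh
# Sketch: extract the soft "Max open files" limit from the contents of
# /proc/PID/limits, read on stdin.  The helper name is hypothetical.
soft_nofile_limit() {
    # Line format: "Max open files  <soft>  <hard>  files"
    awk '/^Max open files/ { print $4 }'
}

# Assumption: slapd is running and pidof can find it.  Nothing is
# printed otherwise.
if pid=$(pidof -s slapd 2>/dev/null); then
    soft_nofile_limit < "/proc/$pid/limits"
fi
```

Reading /proc/PID/limits reports the limit of the running process
itself, which may differ from what 'ulimit -n' prints in a login shell.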

With 4 LDAP connections created by nslcd on each client, that would make
256 (1024 / 4) the upper limit on the number of clients.  This is not
really an acceptable upper limit on scalability, and any site with more
clients would see random LDAP failures now and then.
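
The arithmetic can be written out as a small sketch.  The function name
max_clients and the "reserved" parameter (descriptors slapd needs for
its own files, which I have not measured) are assumptions for
illustration only.

```shell
#!/bin/sh
# Sketch: estimate how many clients fit under a file descriptor limit.
# usage: max_clients LIMIT CONNS_PER_CLIENT RESERVED
max_clients() {
    echo $(( ($1 - $3) / $2 ))
}

fd_limit=1024        # default limit mentioned above
conns_per_client=4   # connections nslcd opens per client
reserved=0           # assumption: descriptors slapd uses itself

max_clients "$fd_limit" "$conns_per_client" "$reserved"
```

With reserved set to 0 this gives the 256 mentioned above; any
descriptors slapd uses for its own files lower the number further.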

Anyone else seen similar problems?  Is there any reason not to include
"idletimeout 60" or some other sensible timeout in our default
slapd.conf?

============ /etc/munin/plugins/open_slapd_files =======
#!/bin/sh
# 
# Plugin to monitor the number of open files in slapd
#
# Parameters:
# 
#       config   (required)
#       autoconf (optional - used by munin-config)
#
# Magic markers (Used by munin-config and some installation scripts.
# Optional):
#
#%# family=auto
#%# capabilities=autoconf

# -s makes pidof return a single PID even if several slapd processes exist
pid=$(pidof -s slapd)

if [ "$1" = "autoconf" ]; then
        if [ "$pid" ]; then
                echo yes
                exit 0
        else
                echo no
                exit 1
        fi
fi

if [ "$1" = "config" ]; then

        echo 'graph_title LDAP server slapd file table usage'
        echo 'graph_args --base 1000 -l 0'
        echo 'graph_vlabel number of open files'
        echo 'graph_category system'
        echo 'graph_info This graph monitors the slapd open files table.'
        echo 'used.label open files'
        echo 'used.info The number of currently open files.'
        exit 0
fi

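# Report "U" (munin's unknown value) when slapd is not running.
if [ -n "$pid" ] && [ -d "/proc/$pid/fd" ]; then
        echo "used.value $(ls "/proc/$pid/fd" | wc -l)"
else
        echo "used.value U"
fi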
=========== /etc/munin/plugins/open_slapd_working ======
#!/bin/sh
# 
# Plugin to monitor whether slapd is replying to LDAP queries
#
# Parameters:
# 
#       config   (required)
#       autoconf (optional - used by munin-config)
#
# Magic markers (Used by munin-config and some installation scripts.
# Optional):
#
#%# family=auto
#%# capabilities=autoconf

if [ "$1" = "autoconf" ]; then
        # Only offer the plugin when ldapsearch is installed
        if command -v ldapsearch > /dev/null 2>&1; then
                echo yes
                exit 0
        else
                echo no
                exit 1
        fi
fi

if [ "$1" = "config" ]; then

        echo 'graph_title LDAP server replying'
        echo 'graph_args --base 1000 -l 0'
        echo 'graph_vlabel true 1 or false 0'
        echo 'graph_category system'
        echo 'graph_info This graph shows whether the slapd server replies.'
        echo 'working.label working'
        echo 'working.info Is the LDAP server working.'
        exit 0
fi

ldapserver=ldap

if ldapsearch -l 3 -LLL -h $ldapserver -x -b '' -s base > /dev/null 2>&1 ; then
        printf "working.value 1.0\n"
else
        printf "working.value 0.0\n"
fi
========================================================
-- 
Happy hacking
Petter Reinholdtsen

