Programs failing with assertion error on machines using LDAP.
Hi All,
After rolling out LDAP and upgrading to kernel 2.6.10 we noticed programs were failing in a manner similar to this:
localuser1@machine1:~$ su -
Password:
su: pthread_mutex_lock.c:78: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
Aborted
localuser1@machine1:~$
This only happens when local users su to other local users, change their passwords or log in. Unfortunately we cannot find any consistency in
the failures: an operation that works on one machine fails on another. If we remove ldap from /etc/nsswitch.conf then everything works fine (except
for LDAP lookups, obviously). Other people are reporting similar issues, see these posts:
http://lists.debian.org/debian-devel/2005/04/msg00304.html
http://article.gmane.org/gmane.comp.ldap.padl.nss/35
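For anyone wanting to compare machines, here is a rough sketch of how one might list which NSS databases still name ldap as a source. The function name ldap_databases is made up for illustration, and it assumes the usual "database: source source ..." layout of nsswitch.conf:

```shell
# Print every NSS database in an nsswitch.conf-style file that lists "ldap".
# $1: path to the file (defaults to /etc/nsswitch.conf)
# ldap_databases is a hypothetical helper name, not part of any package.
ldap_databases() {
    awk -F: '
        { sub(/#.*/, "") }                   # drop comments, whole-line or trailing
        NF >= 2 {
            name = $1
            gsub(/[ \t]/, "", name)          # trim whitespace around the database name
            n = split($2, src, /[ \t]+/)     # split the source list on whitespace
            for (i = 1; i <= n; i++)
                if (src[i] == "ldap") { print name; break }
        }
    ' "${1:-/etc/nsswitch.conf}"
}
```

Running ldap_databases with no argument reads /etc/nsswitch.conf. An empty result would mean no lookups go through libnss_ldap on that box.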
Generally, the solutions we have found are:
* Remove the NPTL libraries from the system, e.g. mv /lib/tls /lib/tls.old (not a solution to use on production boxes).
* Downgrade libnss_ldap to version 220 (not ideal, but we are going with this for now).
* Upgrade glibc on the affected machines to 2.3.5. We got the idea from comparing the following output:
Debian Sarge Machine
--------------------
machine1:~# getconf GNU_LIBPTHREAD_VERSION
NPTL 0.60
machine1:~#
Fedora Core 3
-------------
[root@fedora1 ~]# getconf GNU_LIBPTHREAD_VERSION
NPTL 2.3.5
[root@fedora1 ~]#
I am not sure what the different versions mean exactly, but going from version 0.60 to 2.3.5 looks like a large jump. However, I suspect that
NPTL may have been renumbered to match the version of the libc it is built for. Can someone confirm?
* Compile libnss-ldap so that it does not link to libpthread.so. When I run "apt-get source libnss-ldap", the source tarball gets patched with the file
libnss-ldap_238-1.diff.gz, and this causes libnss_ldap.so to link to libpthread.so. If I build from the unmodified source tarball instead, it does not
link to this library and everything works.
* Set LD_ASSUME_KERNEL=2.4.19, which makes the dynamic linker pick the LinuxThreads libraries instead of NPTL.
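To tell which of the workarounds above is in effect on a given box, two small checks may help. This is only a sketch: the function names are made up, the library path in the usage comment is an assumption that may differ on your system, and the "linuxthreads" prefix is what I would expect getconf to print on a LinuxThreads setup.

```shell
# Classify the string printed by `getconf GNU_LIBPTHREAD_VERSION`.
# pthread_flavour is a hypothetical helper name.
pthread_flavour() {
    case "$1" in
        NPTL*)          echo nptl ;;          # e.g. "NPTL 0.60", "NPTL 2.3.5"
        linuxthreads*)  echo linuxthreads ;;  # assumed form, e.g. "linuxthreads-0.10"
        *)              echo unknown ;;
    esac
}

# Succeed if the ldd output on stdin lists libpthread as a dependency.
# links_libpthread is a hypothetical helper name.
links_libpthread() {
    grep -q 'libpthread\.so'
}

# Example usage (the path to libnss_ldap is an assumption):
#   pthread_flavour "$(getconf GNU_LIBPTHREAD_VERSION)"
#   ldd /lib/libnss_ldap.so.2 | links_libpthread && echo "linked against libpthread"
```

Running the first check with and without LD_ASSUME_KERNEL=2.4.19 in the environment should show whether the variable actually changes the threading library in use.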
What we would like to know is whether there is a "proper" fix for this, so that we can use the latest package available in Sarge together with NPTL.
Many many thanks,
Fred.