
Bug#800523: nscd netgroup cache occasionally not updated, nscd -i netgroup hangs



Package: nscd
Severity: important
Version: 2.19-18+deb8u1
Tags: patch
Usertags: debian-edu
User: debian-edu@lists.debian.org
X-Debbugs-Cc: debian-edu@lists.debian.org

Dear maintainers,

the Debian Edu main server in jessie relies heavily on working netgroup code in glibc / nscd to allow NFS access from hosts on the network.

The setup is:

/etc/nsswitch.conf:

"""
netgroup:    files ldap
"""

The LDAP nss provider is libnss-ldapd via nslcd. NIS netgroups are configured only in LDAP; no local /etc/netgroup file is present. NIS netgroup caching in nscd.conf is enabled.

In some cases (unclear what triggers it) the following is observed:

  o add a new host to the NIS netgroup "workstation-hosts"
  o wait for a while (we even tried waiting for days)
  o "getent netgroup workstation-hosts" does not list the new host as a
    netgroup member
  o trying

    $ innetgr -h <recently-added-host> workstation-hosts || echo FALSE

    echoes "FALSE" on the terminal.
  o sometimes the results of "getent netgroup <netgroup>" and innetgr even
    differ: a host triple is listed by "getent netgroup <netgroup>", but
    innetgr does not find that host
  o attempting to clean up the cache (nscd -i netgroup) fails: the command
    hangs and does not return to a command prompt
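
The symptoms above can be checked for with a small script. This is only a sketch, assuming the innetgr command-line tool used above is installed and that getent prints netgroup members as (host,user,domain) triples. The function names netgroup_hosts, check_netgroup and invalidate_netgroup_cache are made up for illustration, and the 10 second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Print the host field of every (host,user,domain) triple read from stdin,
# one per line. Triples with an empty host field are skipped.
netgroup_hosts() {
    grep -o '([^()]*)' | sed -e 's/^(//' -e 's/,.*$//' | tr -d ' ' | sed -e '/^$/d'
}

# Report every host that getent lists for netgroup $1 but innetgr rejects.
check_netgroup() {
    getent netgroup "$1" | netgroup_hosts | while read -r host; do
        innetgr -h "$host" "$1" </dev/null ||
            echo "MISMATCH: $host is listed by getent but not found by innetgr"
    done
}

# Invalidate the netgroup cache, but give up after 10 seconds instead of
# hanging forever (coreutils timeout exits with status 124 in that case).
invalidate_netgroup_cache() {
    timeout 10 nscd -i netgroup
}
```

Running "check_netgroup workstation-hosts" should print nothing when getent and innetgr agree. It only covers hosts that getent lists, so a newly added host that is missing from both has to be checked by hand.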

The behaviour occurs very often on Debian Edu jessie main server installations (and also on a vanilla Debian jessie server using a similar NIS netgroup / NFS setup), but not always. Note that my host netgroups are always filled with many host triples (long strings that span several lines on a normal 80x25 terminal).

From looking at debdiffs between glibc in unstable and jessie (2.19-18+deb8u1), the issue is probably also present in Debian unstable, but may have been fixed in glibc 2.21 (currently in experimental).

The debdiff between glibc in wheezy (2.13-38+deb7u8) and jessie (2.19-18+deb8u1) suggests that the changes to the netgroup caching code (there have been quite a few nscd caching changes between those two versions) may have introduced this issue somewhere between glibc 2.13 and 2.19.

The above issue is definitely not present in glibc from Debian squeeze (we have many servers running that version) and is probably not present in Debian wheezy either (only one test server deployed), but it really bites us (the Debian Edu team) on Debian Edu jessie.

The workaround at the moment is to disable nscd netgroup caching in nscd.conf. This is far from optimal.
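
The workaround can be applied roughly as sketched below. This assumes the cache is enabled through an "enable-cache netgroup yes" line in nscd.conf and that nscd is restarted afterwards. The function name disable_netgroup_cache is hypothetical, and "sed -i" requires GNU sed.

```shell
#!/bin/sh
# Switch "enable-cache netgroup yes" to "no" in the given nscd.conf
# (default: /etc/nscd.conf). Other enable-cache lines are left untouched.
disable_netgroup_cache() {
    conf=${1:-/etc/nscd.conf}
    sed -i -e 's/^\([[:space:]]*enable-cache[[:space:]]\{1,\}netgroup[[:space:]]\{1,\}\)yes/\1no/' "$conf"
}
```

As root, something like "disable_netgroup_cache && service nscd restart" would then take the netgroup cache out of the picture.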

Upstream has recently observed issues with (LDAP and) netgroup caching as well:

  https://sourceware.org/bugzilla/show_bug.cgi?id=16878
  Patch: https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commitdiff;h=c3ec475c5dd16499aa040908e11d382c3ded9692;hp=aa2f176d6f75b86b91e544c2e494066ac8f88cbd

  https://sourceware.org/bugzilla/show_bug.cgi?id=16760
  Patch: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=dd3022d75e6fb8957843d6d84257a5d8457822d5
  Patch: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ea7d8b95e2fcb81f68b04ed7787a3dbda023991a

  https://sourceware.org/bugzilla/show_bug.cgi?id=16695
  Patch: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c44496df2f090a56d3bf75df930592dac6bba46f

  https://sourceware.org/bugzilla/show_bug.cgi?id=16758
  Patch: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=fbd6b5a4052316f7eb03c4617eebfaafc59dcc06

BZ #16878 in particular looks like a good candidate for fixing this. I recommend considering a backport of all of the above patches: judging from those bug reports, glibc 2.19 seems quite buggy regarding netgroup caching in nscd.

Greets,
Mike


--

DAS-NETZWERKTEAM
mike gabriel, herweg 7, 24357 fleckeby
fon: +49 (1520) 1976 148

GnuPG Key ID 0x25771B31
mail: mike.gabriel@das-netzwerkteam.de, http://das-netzwerkteam.de

freeBusy:
https://mail.das-netzwerkteam.de/freebusy/m.gabriel%40das-netzwerkteam.de.xfb
