Bug#800523: nscd netgroup cache occasionally not updated, nscd -i netgroup hangs



On 2015-09-30 12:29, Mike Gabriel wrote:
> Package: nscd
> Severity: important
> Version: 2.19-18+deb8u1
> Tags: patch
> Usertags: debian-edu
> User: debian-edu@lists.debian.org
> X-Debbugs-Cc: debian-edu@lists.debian.org
> 
> Dear maintainers,
> 
> the Debian Edu main server in jessie heavily relies on working netgroups
> code in glibc / nscd for allowing NFS access from hosts on the network.
> 
> The setup is:
> 
> /etc/nsswitch.conf:
> 
> """
> netgroup:    files ldap
> """
> 
> The LDAP nss provider is libnss-ldapd via nslcd. NIS netgroups are only
> configured in LDAP, no local /etc/netgroup file is present. NIS netgroup
> caching in nscd.conf is enabled.
> 
> In some cases (unclear what triggers it) the following is observed:
> 
>   o add a new host to the NIS netgroup "workstation-hosts"
>   o wait for a while (we even tried waiting for days)
>   o "getent netgroup workstation-hosts" does not list the new host as a
>     netgroup member
>   o trying
> 
>     $ innetgr -h <recently-added-host> workstation-hosts || echo FALSE
> 
>     echoes "FALSE" on the terminal.
>   o sometimes there is even a difference between what
>     "getent netgroup <netgroup>" returns and what innetgr returns (a host
>     triplet is listed by "getent netgroup <netgroup>", but innetgr does not
>     find that host when queried).
>   o Attempting a cache clean-up (nscd -i netgroup) fails: the command hangs
>     and does not return to a command prompt
> 
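
A rough way to spot the getent/innetgr mismatch described above is sketched
below. This is only a sketch under assumptions, not something from the report:
the function names netgroup_hosts and check_netgroup are made up for
illustration, the innetgr command is assumed to be the same one used in the
report, and the parsing assumes the usual "name (host,user,domain) ..." layout
of getent output.

```shell
#!/bin/sh
# Sketch: list hosts that "getent netgroup" reports but innetgr rejects.

# Print the host field of every (host,user,domain) triplet read from stdin.
# Expected input format (as printed by "getent netgroup <name>"):
#   workstation-hosts (host1,-,-) (host2,-,-)
# Triplets with an empty or "-" host field are skipped.
netgroup_hosts() {
    tr -s ' \t' '\n\n' |
        sed -n 's/^(\([^,][^,]*\),.*$/\1/p' |
        sed '/^-$/d'
}

# For each host listed by getent, ask innetgr and report the ones it rejects.
check_netgroup() {
    getent netgroup "$1" | netgroup_hosts | while read -r host; do
        innetgr -h "$host" "$1" </dev/null || echo "MISMATCH: $host"
    done
}

# Usage (hypothetical): check_netgroup workstation-hosts
```

Any "MISMATCH" line would point at a host that the cached netgroup listing
contains but that innetgr does not confirm.
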
> The behaviour occurs very often on Debian Edu jessie main server
> installations (and also on a vanilla Debian jessie server using a similar
> NIS netgroup / NFS setup). It does not always occur. Note that my host
> netgroups are always full of host triplets (long strings, spanning
> several lines on a normal 80x25 terminal).
> 
> From looking at debdiffs between glibc in unstable and jessie
> (2.19-18+deb8u1), the issue is probably also present in Debian unstable, but
> may have been fixed in glibc 2.21 (currently in experimental).

Given that I don't have a test setup to reproduce the issue, and now that
2.21 is in testing, it would be nice if you could try this version to see
whether it improves things. That would at least tell us whether we have to
look for patches to backport or write new patches to fix the issues.

> The debdiff between glibc in wheezy (2.13-38+deb7u8) and jessie
> (2.19-18+deb8u1) suggests that the changes around the netgroup caching code
> (there have been quite a few nscd caching changes between those two
> versions) may have introduced this issue between glibc 2.13 and 2.19.
> 
> The above issue is definitely not present in glibc from Debian squeeze (we
> have many servers running that version) and is probably not present in
> Debian wheezy either (only one test server deployed), but it really bites us
> (the Debian Edu team) on Debian Edu jessie.
> 
> The workaround at the moment is: disable nscd netgroup caching in nscd.conf.
> This is far from optimal.
> 
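
The workaround mentioned above corresponds to setting "enable-cache netgroup
no" in nscd.conf. A minimal sketch of applying it follows. The helper name
disable_netgroup_cache is hypothetical, and the default path /etc/nscd.conf
and the restart command in the usage comment are assumptions about a typical
Debian setup.

```shell
#!/bin/sh
# Sketch: switch "enable-cache netgroup yes" to "no" in an nscd.conf file.
# A backup copy is kept next to the file as <file>.bak.
disable_netgroup_cache() {
    conf="${1:-/etc/nscd.conf}"   # assumed default location
    cp "$conf" "$conf.bak" || return 1
    sed 's/^\([[:space:]]*enable-cache[[:space:]]\{1,\}netgroup[[:space:]]\{1,\}\)yes/\1no/' \
        "$conf.bak" > "$conf"
}

# Usage (as root, hypothetical): disable_netgroup_cache && service nscd restart
```

Other cache settings (passwd, group, hosts) are left untouched, so only
netgroup lookups bypass the nscd cache.
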
> Upstream has recently observed issues with (LDAP and) netgroup caching as well:
> 
>   https://sourceware.org/bugzilla/show_bug.cgi?id=16878
>   Patch: https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commitdiff;h=c3ec475c5dd16499aa040908e11d382c3ded9692;hp=aa2f176d6f75b86b91e544c2e494066ac8f88cbd

This has already been backported to jessie.

>   https://sourceware.org/bugzilla/show_bug.cgi?id=16760
> https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=dd3022d75e6fb8957843d6d84257a5d8457822d5

This one is actually from BZ 16759.

> https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ea7d8b95e2fcb81f68b04ed7787a3dbda023991a

It does indeed look like a good idea to backport them.

>   https://sourceware.org/bugzilla/show_bug.cgi?id=16695
> https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c44496df2f090a56d3bf75df930592dac6bba46f

This has already been backported to jessie.
 
>   https://sourceware.org/bugzilla/show_bug.cgi?id=16758
> https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=fbd6b5a4052316f7eb03c4617eebfaafc59dcc06
> 
> BZ #16878 in particular looks like a good candidate to fix this. My
> recommendation is to consider backporting all of the above patches, since,
> judging by those bug reports, glibc 2.19 seems quite buggy regarding
> netgroup caching in nscd.

I will try to backport them for the next point release.

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net
