Bug#1035909: nfs-utils: startup race with DNS resolution causes id mapping to (silently) use a bogus domain
Package: libnfsidmap1
Version: 1:2.6.2-4
Severity: important
Tags: upstream
X-Debbugs-Cc: debian@aram.nubmail.ca
Dear Maintainer,
If no default "Domain" is specified in /etc/idmapd.conf, idmapd (via
libnfsidmap) will use "the domain part of the DNS domain". As the
excerpt below shows, libnfsidmap queries the DNS server for the hostname
and uses the part after the first '.' in the h_name ("official name of
host"); see code snippet #1 below. I'm not sure exactly why this is done,
but my best guess is that it's to ensure the name used by the DNS server
is used, resolving any aliases and/or a non-FQDN hostname.
If DNS resolution is not up, however, it quietly falls back to
IDMAPD_DEFAULT_DOMAIN, which is usually "localdomain", breaking id
mapping. This is related to Bug#1035840, in which the nfs-idmapd systemd
service does not by default wait for the network to be up.
Even when that is addressed, though, it seems DNS resolution is not
quite up and the gethostbyname() call fails. In my setup, adding a delay
of a few hundred ms to the service startup addresses the issue.
There is a message when this happens (see code snippet #2), but with a
log level of 1, it's not shown unless -v is passed. At the very least, I
think this error message needs to be log level 0.
To me, though, this scenario constitutes a failure. I would prefer that the
daemon fail to start rather than fall back to a hardcoded domain.
The second issue is the race against DNS resolution. I'm not sure why
the gethostbyname() call fails even when the interface is up (at least
according to ifup/systemd), but I think a better approach, or at least
a first fallback, would be to use the domain part of the hostname, if
it exists. I'm guessing that in many cases, mine included, this would
suffice.
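A minimal sketch of what I mean by that fallback, assuming the local
hostname is configured as an FQDN. The names domain_part() and
domain_from_hostname() are hypothetical and do not exist in nfs-utils;
this is only an illustration of the idea, not a tested patch:

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: return a pointer to the part of `hostname`
 * after the first '.', or NULL if there is no non-empty domain part.
 */
static const char *domain_part(const char *hostname)
{
    const char *c = strchr(hostname, '.');

    if (c == NULL || *++c == '\0')
        return NULL;
    return c;
}

/*
 * Hypothetical fallback: derive the domain from gethostname() alone,
 * without touching the resolver. On success returns 0 and stores a
 * malloc()ed copy in *domain; returns -1 otherwise.
 */
static int domain_from_hostname(char **domain)
{
    char hname[256];
    const char *c;

    if (gethostname(hname, sizeof(hname)) == -1)
        return -1;
    /* gethostname() may not NUL-terminate a truncated name. */
    hname[sizeof(hname) - 1] = '\0';

    if ((c = domain_part(hname)) == NULL)
        return -1;
    if ((*domain = strdup(c)) == NULL)
        return -1;
    return 0;
}
```

Something like this could be tried when domain_from_dns() fails, before
falling back to IDMAPD_DEFAULT_DOMAIN. It does nothing for hosts whose
hostname is a short name, so it would not replace the DNS lookup.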
These problems exist upstream, and they also exist in the libnfsidmap2
package used by Debian 11.
I sent an email about this to the linux-nfs mailing list with more info
about the race overall as well as the context, but got no responses (see
https://marc.info/?l=linux-nfs&m=167834665013860&w=2).
I'm hoping someone on the Debian team can point me in the right
direction on whether this fix is appropriate and how to submit a patch
upstream.
Thanks,
Aram
Code Snippet #1:
support/nfsidmap/libnfsidmap.c (lines 215-234)
static int domain_from_dns(char **domain)
{
        struct hostent *he;
        char hname[64], *c;

        if (gethostname(hname, sizeof(hname)) == -1)
                return -1;
        if ((he = gethostbyname(hname)) == NULL)
                return -1;
        if ((c = strchr(he->h_name, '.')) == NULL || *++c == '\0')
                return -1;
        /*
         * Query DNS to see if the _nfsv4idmapdomain TXT record exists
         * If so use it...
         */
        if (dns_txt_query(c, domain) < 0)
                *domain = strdup(c);

        return 0;
}
Code Snippet #2:
support/nfsidmap/libnfsidmap.c (lines 388-396)
        ret = domain_from_dns(&default_domain);
        if (ret) {
                IDMAP_LOG(1, ("libnfsidmap: Unable to determine "
                        "the NFSv4 domain; Using '%s' as the NFSv4 domain "
                        "which means UIDs will be mapped to the 'Nobody-User' "
                        "user defined in %s",
                        IDMAPD_DEFAULT_DOMAIN, PATH_IDMAPDCONF));
                default_domain = IDMAPD_DEFAULT_DOMAIN;
        }
-- System Information:
Debian Release: 11.4
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'testing'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 5.10.0-16-amd64 (SMP w/4 CPU threads)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
Versions of packages libnfsidmap1 depends on:
ii libc6 2.36-9
ii libldap-2.5-0 2.5.13+dfsg-5
libnfsidmap1 recommends no packages.
libnfsidmap1 suggests no packages.
-- no debconf information