
Bug#1053130: bookworm-pu: package glibc/2.36-9+deb12u2



Package: release.debian.org
Severity: normal
Tags: bookworm
User: release.debian.org@packages.debian.org
Usertags: pu
X-Debbugs-Cc: glibc@packages.debian.org, debian-glibc@lists.debian.org, debian-boot@lists.debian.org
Control: affects -1 + src:glibc

[ Reason ]
The upstream glibc stable branch got a few fixes since the latest point
release, including two security fixes.
 
[ Impact ]
Installations will be left vulnerable to security issues.

[ Tests ]
The upstream fixes come with additional tests, which represent a
significant part of the diff.

[ Risks ]
The risk can be considered low, as all the changes except the one for
CVE-2023-5156 have been tested in testing/sid for a few days. The one
for CVE-2023-5156 has just been uploaded to sid, but comes with a test.
In addition those fixes have been committed on a few upstream branches
and have been used by other distributions to provide security updates. 

[ Checklist ]
  [x] *all* changes are documented in the d/changelog
  [x] I reviewed all changes and I approve them
  [x] attach debdiff against the package in (old)stable
  [x] the issue is verified as fixed in unstable

[ Changes ]

All the changes come from the upstream stable branch, and are summarized
in the Debian changelog. Let me comment on each of them:

 - Fix the value of F_GETLK/F_SETLK/F_SETLKW with __USE_FILE_OFFSET64 on
   ppc64el.  Closes: #1050592.

This fixes a regression introduced in the previous point release and
in testing/sid. On ppc64el, the values of F_GETLK/F_SETLK/F_SETLKW changed
when __USE_FILE_OFFSET64 is in use. While this is handled transparently
at the glibc level, it breaks some packages, such as perl, which use the
values internally.
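
To illustrate why hard-coding these values is fragile, here is a minimal
sketch. It is not taken from glibc or perl, and the function names
lock_whole_file and print_lock_commands are hypothetical. It takes a lock
through the symbolic constants from <fcntl.h>, so it does not depend on
the numeric values the headers select with _FILE_OFFSET_BITS=64:

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64   /* makes glibc define __USE_FILE_OFFSET64 */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Try to take a non-blocking write lock on the whole file.
   Returns 0 on success, -1 on failure (errno is set by fcntl).  */
int lock_whole_file(int fd)
{
    assert(fd >= 0);

    struct flock fl = {0};
    fl.l_type = F_WRLCK;     /* exclusive lock */
    fl.l_whence = SEEK_SET;  /* l_start is relative to the file start */
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "until end of file" */

    /* Use the macro, never a literal number: the value behind F_SETLK
       is an ABI detail of the architecture and the feature macros.  */
    return fcntl(fd, F_SETLK, &fl);
}

/* Print the values this build actually uses, for comparison between
   architectures or between builds with different feature macros.  */
void print_lock_commands(void)
{
    printf("F_GETLK=%d F_SETLK=%d F_SETLKW=%d\n",
           F_GETLK, F_SETLK, F_SETLKW);
}
```

A program that caches the numbers at its own build time, instead of using
the macros at each call site, is the kind of consumer that can be affected
when the values change.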

 - Fix a stack read overflow in getaddrinfo in no-aaaa mode
   (CVE-2023-4527).  Closes: #1051958.

This fixes a security issue in a new feature introduced in glibc 2.36,
which has not been considered serious enough by the security team to
issue a DSA.

 - Fix use after free in getcanonname (CVE-2023-4806, CVE-2023-5156).

This fixes a security issue that might happen with NSS modules which
implement some hooks but not others; however, there are no known modules
implemented that way. Unfortunately the initial fix introduced a memory
leak, which got assigned CVE-2023-5156.
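
For reference, the NEWS entry in the debdiff describes the leak as
occurring when getaddrinfo is called for AF_INET6 with the AI_CANONNAME,
AI_ALL and AI_V4MAPPED flags set. The following is only a sketch of that
call shape, assuming a hypothetical helper named count_v6_results; it is
not a reproducer and does not by itself show whether a given glibc leaks:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolve HOST with the flag combination named in the NEWS entry.
   Returns the number of results, or -1 if getaddrinfo fails.  */
int count_v6_results(const char *host)
{
    assert(host != NULL);

    struct addrinfo hints;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET6;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_CANONNAME | AI_ALL | AI_V4MAPPED;

    struct addrinfo *res = NULL;
    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    /* Count the entries in the returned linked list.  */
    int n = 0;
    for (struct addrinfo *ai = res; ai != NULL; ai = ai->ai_next)
        n++;

    /* The caller always releases the result list; the leak described
       in the CVE is internal to the library, not a missing call here.  */
    freeaddrinfo(res);
    return n;
}
```

Running such a call in a loop under a memory checker is one way to
compare library versions, but that is a suggestion rather than the test
shipped upstream.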

 - Update the x86 cacheinfo code to look at the per-thread L3 cache to
   determine the non-temporal threshold. This improves memory and string
   functions on modern CPUs.

This changes the way the cache sizes are interpreted, properly taking
into account the L3 cache on modern CPUs. The memory and string
functions are unchanged, only some thresholds are changed.

 - Fix _dl_find_object to return correct values even during early startup.

It has been found that _dl_find_object can wrongly return 1 during
early startup. Currently no impact has been found, but as this function
is used by some external unwinders (for instance GCC), it's better to fix
it to be future proof.

 - Always call destructors in reverse constructor order.

This fixes a regression introduced in glibc 2.36, which causes
destructors to be called in a different order than the constructors when
there are cyclic dependencies. This causes issues with some
applications.
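
The debdiff implements this by pushing each object onto the head of a
singly linked list (_dl_init_called_list) before its constructors run,
and walking that list from the head at shutdown. The sketch below is a
simplified model of that idea only: struct object, record_init and
destruction_order are hypothetical names, and none of the locking,
namespace or audit handling of the real code is modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct link_map; only the list link that
   the patch adds (l_init_called_next) is modeled here.  */
struct object {
    const char *name;
    struct object *init_called_next;
};

/* Head of the list: the most recently constructed object comes first.  */
struct object *init_called_list = NULL;

/* Record OBJ just before its constructors would run (push on head).  */
void record_init(struct object *obj)
{
    assert(obj != NULL);
    obj->init_called_next = init_called_list;
    init_called_list = obj;
}

/* Walk the list from the head and store the names in the order the
   destructors would be called.  Returns the number of names stored.  */
size_t destruction_order(const char **out, size_t max)
{
    size_t n = 0;
    for (struct object *o = init_called_list;
         o != NULL && n < max;
         o = o->init_called_next)
        out[n++] = o->name;
    return n;
}
```

Because insertion is always at the head, walking from the head yields the
exact reverse of the construction order without any dependency sorting,
which is why cyclic dependencies no longer influence the result.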

[ Other info ]
debian-boot is in Cc: as glibc has one udeb.
diff --git a/debian/changelog b/debian/changelog
index aafd6e3a..146c85d3 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,19 @@
+glibc (2.36-9+deb12u2) UNRELEASED; urgency=medium
+
+  * debian/patches/git-updates.diff: update from upstream stable branch:
+    - Fix the value of F_GETLK/F_SETLK/F_SETLKW with __USE_FILE_OFFSET64 on
+      ppc64el.  Closes: #1050592.
+    - Fix a stack read overflow in getaddrinfo in no-aaaa mode
+      (CVE-2023-4527).  Closes: #1051958.
+    - Fix use after free in getcanonname (CVE-2023-4806, CVE-2023-5156).
+    - Update the x86 cacheinfo code to look at the per-thread L3 cache to
+      determine the non-temporal threshold. This improves memory and string
+      functions on modern CPUs.
+    - Fix _dl_find_object to return correct values even during early startup.
+    - Always call destructors in reverse constructor order.
+
+ -- Aurelien Jarno <aurel32@debian.org>  Sat, 23 Sep 2023 15:08:08 +0200
+
 glibc (2.36-9+deb12u1) bookworm; urgency=medium
 
   [ Aurelien Jarno ]
diff --git a/debian/patches/git-updates.diff b/debian/patches/git-updates.diff
index 9203223b..cdb02b1d 100644
--- a/debian/patches/git-updates.diff
+++ b/debian/patches/git-updates.diff
@@ -68,10 +68,10 @@ index d1e139d03c..09c0cf8357 100644
  else	   					# -s
  verbose	:=
 diff --git a/NEWS b/NEWS
-index f61e521fc8..9f6b48b63d 100644
+index f61e521fc8..ae55ffb53a 100644
 --- a/NEWS
 +++ b/NEWS
-@@ -5,6 +5,65 @@ See the end for copying conditions.
+@@ -5,6 +5,85 @@ See the end for copying conditions.
  Please send GNU C library bug reports via <https://sourceware.org/bugzilla/>
  using `glibc' in the "product" field.
  
@@ -91,6 +91,21 @@ index f61e521fc8..9f6b48b63d 100644
 +  heap and prints it to the target log file, potentially revealing a
 +  portion of the contents of the heap.
 +
++  CVE-2023-4527: If the system is configured in no-aaaa mode via
++  /etc/resolv.conf, getaddrinfo is called for the AF_UNSPEC address
++  family, and a DNS response is received over TCP that is larger than
++  2048 bytes, getaddrinfo may potentially disclose stack contents via
++  the returned address data, or crash.
++
++  CVE-2023-4806: When an NSS plugin only implements the
++  _gethostbyname2_r and _getcanonname_r callbacks, getaddrinfo could use
++  memory that was freed during buffer resizing, potentially causing a
++  crash or read or write to arbitrary memory.
++
++  CVE-2023-5156: The fix for CVE-2023-4806 introduced a memory leak when
++  an application calls getaddrinfo for AF_INET6 with AI_CANONNAME,
++  AI_ALL and AI_V4MAPPED flags set.
++
 +The following bugs are resolved with this release:
 +
 +  [12154] Do not fail DNS resolution for CNAMEs which are not host names
@@ -133,6 +148,11 @@ index f61e521fc8..9f6b48b63d 100644
 +  [30163] posix: Fix system blocks SIGCHLD erroneously
 +  [30305] x86_64: Fix asm constraints in feraiseexcept
 +  [30477] libc: [RISCV]: time64 does not work on riscv32
++  [30515] _dl_find_object incorrectly returns 1 during early startup
++  [30785] Always call destructors in reverse constructor order
++  [30804] F_GETLK, F_SETLK, and F_SETLKW value change for powerpc64 with
++    -D_FILE_OFFSET_BITS=64
++  [30842] Stack read overflow in getaddrinfo in no-aaaa mode (CVE-2023-4527)
 +
  Version 2.36
  
@@ -283,10 +303,18 @@ index 2696dde4b1..9b07b4e132 100644
  
  void *
 diff --git a/elf/Makefile b/elf/Makefile
-index fd77d0c7c8..48788fcdb8 100644
+index fd77d0c7c8..30c9af1de9 100644
 --- a/elf/Makefile
 +++ b/elf/Makefile
-@@ -374,6 +374,8 @@ tests += \
+@@ -53,6 +53,7 @@ routines = \
+ # profiled libraries.
+ dl-routines = \
+   dl-call-libc-early-init \
++  dl-call_fini \
+   dl-close \
+   dl-debug \
+   dl-debug-symbols \
+@@ -374,6 +375,8 @@ tests += \
    tst-align \
    tst-align2 \
    tst-align3 \
@@ -295,7 +323,7 @@ index fd77d0c7c8..48788fcdb8 100644
    tst-audit1 \
    tst-audit2 \
    tst-audit8 \
-@@ -408,6 +410,7 @@ tests += \
+@@ -408,6 +411,7 @@ tests += \
    tst-dlmopen4 \
    tst-dlmopen-dlerror \
    tst-dlmopen-gethostbyname \
@@ -303,7 +331,7 @@ index fd77d0c7c8..48788fcdb8 100644
    tst-dlopenfail \
    tst-dlopenfail-2 \
    tst-dlopenrpath \
-@@ -631,6 +634,7 @@ ifeq ($(run-built-tests),yes)
+@@ -631,6 +635,7 @@ ifeq ($(run-built-tests),yes)
  tests-special += \
    $(objpfx)noload-mem.out \
    $(objpfx)tst-ldconfig-X.out \
@@ -311,7 +339,7 @@ index fd77d0c7c8..48788fcdb8 100644
    $(objpfx)tst-leaks1-mem.out \
    $(objpfx)tst-rtld-help.out \
    # tests-special
-@@ -765,6 +769,8 @@ modules-names += \
+@@ -765,6 +770,8 @@ modules-names += \
    tst-alignmod3 \
    tst-array2dep \
    tst-array5dep \
@@ -320,7 +348,7 @@ index fd77d0c7c8..48788fcdb8 100644
    tst-audit11mod1 \
    tst-audit11mod2 \
    tst-audit12mod1 \
-@@ -798,6 +804,7 @@ modules-names += \
+@@ -798,6 +805,7 @@ modules-names += \
    tst-auditmanymod7 \
    tst-auditmanymod8 \
    tst-auditmanymod9 \
@@ -328,7 +356,7 @@ index fd77d0c7c8..48788fcdb8 100644
    tst-auditmod1 \
    tst-auditmod9a \
    tst-auditmod9b \
-@@ -834,6 +841,8 @@ modules-names += \
+@@ -834,6 +842,8 @@ modules-names += \
    tst-dlmopen1mod \
    tst-dlmopen-dlerror-mod \
    tst-dlmopen-gethostbyname-mod \
@@ -337,7 +365,7 @@ index fd77d0c7c8..48788fcdb8 100644
    tst-dlopenfaillinkmod \
    tst-dlopenfailmod1 \
    tst-dlopenfailmod2 \
-@@ -990,23 +999,8 @@ modules-names += tst-gnu2-tls1mod
+@@ -990,23 +1000,8 @@ modules-names += tst-gnu2-tls1mod
  $(objpfx)tst-gnu2-tls1: $(objpfx)tst-gnu2-tls1mod.so
  tst-gnu2-tls1mod.so-no-z-defs = yes
  CFLAGS-tst-gnu2-tls1mod.c += -mtls-dialect=gnu2
@@ -362,7 +390,7 @@ index fd77d0c7c8..48788fcdb8 100644
  ifeq (yes,$(have-protected-data))
  modules-names += tst-protected1moda tst-protected1modb
  tests += tst-protected1a tst-protected1b
-@@ -2410,6 +2404,11 @@ $(objpfx)tst-ldconfig-X.out : tst-ldconfig-X.sh $(objpfx)ldconfig
+@@ -2410,6 +2405,11 @@ $(objpfx)tst-ldconfig-X.out : tst-ldconfig-X.sh $(objpfx)ldconfig
  		 '$(run-program-env)' > $@; \
  	$(evaluate-test)
  
@@ -374,7 +402,7 @@ index fd77d0c7c8..48788fcdb8 100644
  # Test static linking of all the libraries we can possibly link
  # together.  Note that in some configurations this may be less than the
  # complete list of libraries we build but we try to maxmimize this list.
-@@ -2967,3 +2966,25 @@ $(objpfx)tst-tls-allocation-failure-static-patched.out: \
+@@ -2967,3 +2967,25 @@ $(objpfx)tst-tls-allocation-failure-static-patched.out: \
  	grep -q '^Fatal glibc error: Cannot allocate TLS block$$' $@ \
  	  && grep -q '^status: 127$$' $@; \
  	  $(evaluate-test)
@@ -416,6 +444,548 @@ index 8bbf110d02..b97c17b3a9 100644
    return __strdup (temp);
  }
  
+diff --git a/elf/dl-call_fini.c b/elf/dl-call_fini.c
+new file mode 100644
+index 0000000000..9e7ba10fa2
+--- /dev/null
++++ b/elf/dl-call_fini.c
+@@ -0,0 +1,50 @@
++/* Invoke DT_FINI and DT_FINI_ARRAY callbacks.
++   Copyright (C) 1996-2022 Free Software Foundation, Inc.
++   This file is part of the GNU C Library.
++
++   The GNU C Library is free software; you can redistribute it and/or
++   modify it under the terms of the GNU Lesser General Public
++   License as published by the Free Software Foundation; either
++   version 2.1 of the License, or (at your option) any later version.
++
++   The GNU C Library is distributed in the hope that it will be useful,
++   but WITHOUT ANY WARRANTY; without even the implied warranty of
++   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
++   Lesser General Public License for more details.
++
++   You should have received a copy of the GNU Lesser General Public
++   License along with the GNU C Library; if not, see
++   <https://www.gnu.org/licenses/>.  */
++
++#include <ldsodefs.h>
++#include <sysdep.h>
++
++void
++_dl_call_fini (void *closure_map)
++{
++  struct link_map *map = closure_map;
++
++  /* When debugging print a message first.  */
++  if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_IMPCALLS))
++    _dl_debug_printf ("\ncalling fini: %s [%lu]\n\n", map->l_name, map->l_ns);
++
++  /* Make sure nothing happens if we are called twice.  */
++  map->l_init_called = 0;
++
++  ElfW(Dyn) *fini_array = map->l_info[DT_FINI_ARRAY];
++  if (fini_array != NULL)
++    {
++      ElfW(Addr) *array = (ElfW(Addr) *) (map->l_addr
++                                          + fini_array->d_un.d_ptr);
++      size_t sz = (map->l_info[DT_FINI_ARRAYSZ]->d_un.d_val
++                   / sizeof (ElfW(Addr)));
++
++      while (sz-- > 0)
++        ((fini_t) array[sz]) ();
++    }
++
++  /* Next try the old-style destructor.  */
++  ElfW(Dyn) *fini = map->l_info[DT_FINI];
++  if (fini != NULL)
++    DL_CALL_DT_FINI (map, ((void *) map->l_addr + fini->d_un.d_ptr));
++}
+diff --git a/elf/dl-close.c b/elf/dl-close.c
+index bcd6e206e9..640bbd88c3 100644
+--- a/elf/dl-close.c
++++ b/elf/dl-close.c
+@@ -36,11 +36,6 @@
+ 
+ #include <dl-unmap-segments.h>
+ 
+-
+-/* Type of the constructor functions.  */
+-typedef void (*fini_t) (void);
+-
+-
+ /* Special l_idx value used to indicate which objects remain loaded.  */
+ #define IDX_STILL_USED -1
+ 
+@@ -110,31 +105,6 @@ remove_slotinfo (size_t idx, struct dtv_slotinfo_list *listp, size_t disp,
+   return false;
+ }
+ 
+-/* Invoke dstructors for CLOSURE (a struct link_map *).  Called with
+-   exception handling temporarily disabled, to make errors fatal.  */
+-static void
+-call_destructors (void *closure)
+-{
+-  struct link_map *map = closure;
+-
+-  if (map->l_info[DT_FINI_ARRAY] != NULL)
+-    {
+-      ElfW(Addr) *array =
+-	(ElfW(Addr) *) (map->l_addr
+-			+ map->l_info[DT_FINI_ARRAY]->d_un.d_ptr);
+-      unsigned int sz = (map->l_info[DT_FINI_ARRAYSZ]->d_un.d_val
+-			 / sizeof (ElfW(Addr)));
+-
+-      while (sz-- > 0)
+-	((fini_t) array[sz]) ();
+-    }
+-
+-  /* Next try the old-style destructor.  */
+-  if (map->l_info[DT_FINI] != NULL)
+-    DL_CALL_DT_FINI (map, ((void *) map->l_addr
+-			   + map->l_info[DT_FINI]->d_un.d_ptr));
+-}
+-
+ void
+ _dl_close_worker (struct link_map *map, bool force)
+ {
+@@ -168,30 +138,31 @@ _dl_close_worker (struct link_map *map, bool force)
+ 
+   bool any_tls = false;
+   const unsigned int nloaded = ns->_ns_nloaded;
+-  struct link_map *maps[nloaded];
+ 
+-  /* Run over the list and assign indexes to the link maps and enter
+-     them into the MAPS array.  */
++  /* Run over the list and assign indexes to the link maps.  */
+   int idx = 0;
+   for (struct link_map *l = ns->_ns_loaded; l != NULL; l = l->l_next)
+     {
+       l->l_map_used = 0;
+       l->l_map_done = 0;
+       l->l_idx = idx;
+-      maps[idx] = l;
+       ++idx;
+     }
+   assert (idx == nloaded);
+ 
+-  /* Keep track of the lowest index link map we have covered already.  */
+-  int done_index = -1;
+-  while (++done_index < nloaded)
++  /* Keep marking link maps until no new link maps are found.  */
++  for (struct link_map *l = ns->_ns_loaded; l != NULL; )
+     {
+-      struct link_map *l = maps[done_index];
++      /* next is reset to earlier link maps for remarking.  */
++      struct link_map *next = l->l_next;
++      int next_idx = l->l_idx + 1; /* next->l_idx, but covers next == NULL.  */
+ 
+       if (l->l_map_done)
+-	/* Already handled.  */
+-	continue;
++	{
++	  /* Already handled.  */
++	  l = next;
++	  continue;
++	}
+ 
+       /* Check whether this object is still used.  */
+       if (l->l_type == lt_loaded
+@@ -201,7 +172,10 @@ _dl_close_worker (struct link_map *map, bool force)
+ 	     acquire is sufficient and correct.  */
+ 	  && atomic_load_acquire (&l->l_tls_dtor_count) == 0
+ 	  && !l->l_map_used)
+-	continue;
++	{
++	  l = next;
++	  continue;
++	}
+ 
+       /* We need this object and we handle it now.  */
+       l->l_map_used = 1;
+@@ -228,8 +202,11 @@ _dl_close_worker (struct link_map *map, bool force)
+ 			 already processed it, then we need to go back
+ 			 and process again from that point forward to
+ 			 ensure we keep all of its dependencies also.  */
+-		      if ((*lp)->l_idx - 1 < done_index)
+-			done_index = (*lp)->l_idx - 1;
++		      if ((*lp)->l_idx < next_idx)
++			{
++			  next = *lp;
++			  next_idx = next->l_idx;
++			}
+ 		    }
+ 		}
+ 
+@@ -249,54 +226,65 @@ _dl_close_worker (struct link_map *map, bool force)
+ 		if (!jmap->l_map_used)
+ 		  {
+ 		    jmap->l_map_used = 1;
+-		    if (jmap->l_idx - 1 < done_index)
+-		      done_index = jmap->l_idx - 1;
++		    if (jmap->l_idx < next_idx)
++		      {
++			  next = jmap;
++			  next_idx = next->l_idx;
++		      }
+ 		  }
+ 	      }
+ 	  }
+-    }
+ 
+-  /* Sort the entries.  We can skip looking for the binary itself which is
+-     at the front of the search list for the main namespace.  */
+-  _dl_sort_maps (maps, nloaded, (nsid == LM_ID_BASE), true);
++      l = next;
++    }
+ 
+-  /* Call all termination functions at once.  */
+-  bool unload_any = false;
+-  bool scope_mem_left = false;
+-  unsigned int unload_global = 0;
+-  unsigned int first_loaded = ~0;
+-  for (unsigned int i = 0; i < nloaded; ++i)
++  /* Call the destructors in reverse constructor order, and remove the
++     closed link maps from the list.  */
++  for (struct link_map **init_called_head = &_dl_init_called_list;
++       *init_called_head != NULL; )
+     {
+-      struct link_map *imap = maps[i];
+-
+-      /* All elements must be in the same namespace.  */
+-      assert (imap->l_ns == nsid);
++      struct link_map *imap = *init_called_head;
+ 
+-      if (!imap->l_map_used)
++      /* _dl_init_called_list is global, to produce a global odering.
++	 Ignore the other namespaces (and link maps that are still used).  */
++      if (imap->l_ns != nsid || imap->l_map_used)
++	init_called_head = &imap->l_init_called_next;
++      else
+ 	{
+ 	  assert (imap->l_type == lt_loaded && !imap->l_nodelete_active);
+ 
+-	  /* Call its termination function.  Do not do it for
+-	     half-cooked objects.  Temporarily disable exception
+-	     handling, so that errors are fatal.  */
+-	  if (imap->l_init_called)
+-	    {
+-	      /* When debugging print a message first.  */
+-	      if (__builtin_expect (GLRO(dl_debug_mask) & DL_DEBUG_IMPCALLS,
+-				    0))
+-		_dl_debug_printf ("\ncalling fini: %s [%lu]\n\n",
+-				  imap->l_name, nsid);
+-
+-	      if (imap->l_info[DT_FINI_ARRAY] != NULL
+-		  || imap->l_info[DT_FINI] != NULL)
+-		_dl_catch_exception (NULL, call_destructors, imap);
+-	    }
++	  /* _dl_init_called_list is updated at the same time as
++	     l_init_called.  */
++	  assert (imap->l_init_called);
++
++	  if (imap->l_info[DT_FINI_ARRAY] != NULL
++	      || imap->l_info[DT_FINI] != NULL)
++	    _dl_catch_exception (NULL, _dl_call_fini, imap);
+ 
+ #ifdef SHARED
+ 	  /* Auditing checkpoint: we remove an object.  */
+ 	  _dl_audit_objclose (imap);
+ #endif
++	  /* Unlink this link map.  */
++	  *init_called_head = imap->l_init_called_next;
++	}
++    }
++
+ 
++  bool unload_any = false;
++  bool scope_mem_left = false;
++  unsigned int unload_global = 0;
++
++  /* For skipping un-unloadable link maps in the second loop.  */
++  struct link_map *first_loaded = ns->_ns_loaded;
++
++  /* Iterate over the namespace to find objects to unload.  Some
++     unloadable objects may not be on _dl_init_called_list due to
++     dlopen failure.  */
++  for (struct link_map *imap = first_loaded; imap != NULL; imap = imap->l_next)
++    {
++      if (!imap->l_map_used)
++	{
+ 	  /* This object must not be used anymore.  */
+ 	  imap->l_removed = 1;
+ 
+@@ -307,8 +295,8 @@ _dl_close_worker (struct link_map *map, bool force)
+ 	    ++unload_global;
+ 
+ 	  /* Remember where the first dynamically loaded object is.  */
+-	  if (i < first_loaded)
+-	    first_loaded = i;
++	  if (first_loaded == NULL)
++	      first_loaded = imap;
+ 	}
+       /* Else imap->l_map_used.  */
+       else if (imap->l_type == lt_loaded)
+@@ -444,8 +432,8 @@ _dl_close_worker (struct link_map *map, bool force)
+ 	    imap->l_loader = NULL;
+ 
+ 	  /* Remember where the first dynamically loaded object is.  */
+-	  if (i < first_loaded)
+-	    first_loaded = i;
++	  if (first_loaded == NULL)
++	      first_loaded = imap;
+ 	}
+     }
+ 
+@@ -516,10 +504,11 @@ _dl_close_worker (struct link_map *map, bool force)
+ 
+   /* Check each element of the search list to see if all references to
+      it are gone.  */
+-  for (unsigned int i = first_loaded; i < nloaded; ++i)
++  for (struct link_map *imap = first_loaded; imap != NULL; )
+     {
+-      struct link_map *imap = maps[i];
+-      if (!imap->l_map_used)
++      if (imap->l_map_used)
++	imap = imap->l_next;
++      else
+ 	{
+ 	  assert (imap->l_type == lt_loaded);
+ 
+@@ -730,7 +719,9 @@ _dl_close_worker (struct link_map *map, bool force)
+ 	  if (imap == GL(dl_initfirst))
+ 	    GL(dl_initfirst) = NULL;
+ 
++	  struct link_map *next = imap->l_next;
+ 	  free (imap);
++	  imap = next;
+ 	}
+     }
+ 
+diff --git a/elf/dl-find_object.c b/elf/dl-find_object.c
+index 4d5831b6f4..2e5b456c11 100644
+--- a/elf/dl-find_object.c
++++ b/elf/dl-find_object.c
+@@ -46,7 +46,7 @@ _dl_find_object_slow (void *pc, struct dl_find_object *result)
+           struct dl_find_object_internal internal;
+           _dl_find_object_from_map (l, &internal);
+           _dl_find_object_to_external (&internal, result);
+-          return 1;
++          return 0;
+         }
+ 
+   /* Object not found.  */
+diff --git a/elf/dl-fini.c b/elf/dl-fini.c
+index 030b1fcbcd..50087a1bfc 100644
+--- a/elf/dl-fini.c
++++ b/elf/dl-fini.c
+@@ -21,155 +21,71 @@
+ #include <ldsodefs.h>
+ #include <elf-initfini.h>
+ 
+-
+-/* Type of the constructor functions.  */
+-typedef void (*fini_t) (void);
+-
+-
+ void
+ _dl_fini (void)
+ {
+-  /* Lots of fun ahead.  We have to call the destructors for all still
+-     loaded objects, in all namespaces.  The problem is that the ELF
+-     specification now demands that dependencies between the modules
+-     are taken into account.  I.e., the destructor for a module is
+-     called before the ones for any of its dependencies.
+-
+-     To make things more complicated, we cannot simply use the reverse
+-     order of the constructors.  Since the user might have loaded objects
+-     using `dlopen' there are possibly several other modules with its
+-     dependencies to be taken into account.  Therefore we have to start
+-     determining the order of the modules once again from the beginning.  */
+-
+-  /* We run the destructors of the main namespaces last.  As for the
+-     other namespaces, we pick run the destructors in them in reverse
+-     order of the namespace ID.  */
++  /* Call destructors strictly in the reverse order of constructors.
++     This causes fewer surprises than some arbitrary reordering based
++     on new (relocation) dependencies.  None of the objects are
++     unmapped, so applications can deal with this if their DSOs remain
++     in a consistent state after destructors have run.  */
++
++  /* Protect against concurrent loads and unloads.  */
++  __rtld_lock_lock_recursive (GL(dl_load_lock));
++
++  /* Ignore objects which are opened during shutdown.  */
++  struct link_map *local_init_called_list = _dl_init_called_list;
++
++  for (struct link_map *l = local_init_called_list; l != NULL;
++       l = l->l_init_called_next)
++      /* Bump l_direct_opencount of all objects so that they
++	 are not dlclose()ed from underneath us.  */
++      ++l->l_direct_opencount;
++
++  /* After this point, everything linked from local_init_called_list
++     cannot be unloaded because of the reference counter update.  */
++  __rtld_lock_unlock_recursive (GL(dl_load_lock));
++
++  /* Perform two passes: One for non-audit modules, one for audit
++     modules.  This way, audit modules receive unload notifications
++     for non-audit objects, and the destructors for audit modules
++     still run.  */
+ #ifdef SHARED
+-  int do_audit = 0;
+- again:
++  int last_pass = GLRO(dl_naudit) > 0;
++  Lmid_t last_ns = -1;
++  for (int do_audit = 0; do_audit <= last_pass; ++do_audit)
+ #endif
+-  for (Lmid_t ns = GL(dl_nns) - 1; ns >= 0; --ns)
+-    {
+-      /* Protect against concurrent loads and unloads.  */
+-      __rtld_lock_lock_recursive (GL(dl_load_lock));
+-
+-      unsigned int nloaded = GL(dl_ns)[ns]._ns_nloaded;
+-      /* No need to do anything for empty namespaces or those used for
+-	 auditing DSOs.  */
+-      if (nloaded == 0
+-#ifdef SHARED
+-	  || GL(dl_ns)[ns]._ns_loaded->l_auditing != do_audit
+-#endif
+-	  )
+-	__rtld_lock_unlock_recursive (GL(dl_load_lock));
+-      else
+-	{
+-#ifdef SHARED
+-	  _dl_audit_activity_nsid (ns, LA_ACT_DELETE);
+-#endif
+-
+-	  /* Now we can allocate an array to hold all the pointers and
+-	     copy the pointers in.  */
+-	  struct link_map *maps[nloaded];
+-
+-	  unsigned int i;
+-	  struct link_map *l;
+-	  assert (nloaded != 0 || GL(dl_ns)[ns]._ns_loaded == NULL);
+-	  for (l = GL(dl_ns)[ns]._ns_loaded, i = 0; l != NULL; l = l->l_next)
+-	    /* Do not handle ld.so in secondary namespaces.  */
+-	    if (l == l->l_real)
+-	      {
+-		assert (i < nloaded);
+-
+-		maps[i] = l;
+-		l->l_idx = i;
+-		++i;
+-
+-		/* Bump l_direct_opencount of all objects so that they
+-		   are not dlclose()ed from underneath us.  */
+-		++l->l_direct_opencount;
+-	      }
+-	  assert (ns != LM_ID_BASE || i == nloaded);
+-	  assert (ns == LM_ID_BASE || i == nloaded || i == nloaded - 1);
+-	  unsigned int nmaps = i;
+-
+-	  /* Now we have to do the sorting.  We can skip looking for the
+-	     binary itself which is at the front of the search list for
+-	     the main namespace.  */
+-	  _dl_sort_maps (maps, nmaps, (ns == LM_ID_BASE), true);
+-
+-	  /* We do not rely on the linked list of loaded object anymore
+-	     from this point on.  We have our own list here (maps).  The
+-	     various members of this list cannot vanish since the open
+-	     count is too high and will be decremented in this loop.  So
+-	     we release the lock so that some code which might be called
+-	     from a destructor can directly or indirectly access the
+-	     lock.  */
+-	  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+-
+-	  /* 'maps' now contains the objects in the right order.  Now
+-	     call the destructors.  We have to process this array from
+-	     the front.  */
+-	  for (i = 0; i < nmaps; ++i)
+-	    {
+-	      struct link_map *l = maps[i];
+-
+-	      if (l->l_init_called)
+-		{
+-		  /* Make sure nothing happens if we are called twice.  */
+-		  l->l_init_called = 0;
+-
+-		  /* Is there a destructor function?  */
+-		  if (l->l_info[DT_FINI_ARRAY] != NULL
+-		      || (ELF_INITFINI && l->l_info[DT_FINI] != NULL))
+-		    {
+-		      /* When debugging print a message first.  */
+-		      if (__builtin_expect (GLRO(dl_debug_mask)
+-					    & DL_DEBUG_IMPCALLS, 0))
+-			_dl_debug_printf ("\ncalling fini: %s [%lu]\n\n",
+-					  DSO_FILENAME (l->l_name),
+-					  ns);
+-
+-		      /* First see whether an array is given.  */
+-		      if (l->l_info[DT_FINI_ARRAY] != NULL)
+-			{
+-			  ElfW(Addr) *array =
+-			    (ElfW(Addr) *) (l->l_addr
+-					    + l->l_info[DT_FINI_ARRAY]->d_un.d_ptr);
+-			  unsigned int i = (l->l_info[DT_FINI_ARRAYSZ]->d_un.d_val
+-					    / sizeof (ElfW(Addr)));
+-			  while (i-- > 0)
+-			    ((fini_t) array[i]) ();
+-			}
+-
+-		      /* Next try the old-style destructor.  */
+-		      if (ELF_INITFINI && l->l_info[DT_FINI] != NULL)
+-			DL_CALL_DT_FINI
+-			  (l, l->l_addr + l->l_info[DT_FINI]->d_un.d_ptr);
+-		    }
+-
++    for (struct link_map *l = local_init_called_list; l != NULL;
++	 l = l->l_init_called_next)
++      {
+ #ifdef SHARED
+-		  /* Auditing checkpoint: another object closed.  */
+-		  _dl_audit_objclose (l);
++	if (GL(dl_ns)[l->l_ns]._ns_loaded->l_auditing != do_audit)
++	  continue;
++
++	/* Avoid back-to-back calls of _dl_audit_activity_nsid for the
++	   same namespace.  */
++	if (last_ns != l->l_ns)
++	  {
++	    if (last_ns >= 0)
++	      _dl_audit_activity_nsid (last_ns, LA_ACT_CONSISTENT);
++	    _dl_audit_activity_nsid (l->l_ns, LA_ACT_DELETE);
++	    last_ns = l->l_ns;
++	  }
+ #endif
+-		}
+ 
+-	      /* Correct the previous increment.  */
+-	      --l->l_direct_opencount;
+-	    }
++	/* There is no need to re-enable exceptions because _dl_fini
++	   is not called from a context where exceptions are caught.  */
++	_dl_call_fini (l);
+ 
+ #ifdef SHARED
+-	  _dl_audit_activity_nsid (ns, LA_ACT_CONSISTENT);
++	/* Auditing checkpoint: another object closed.  */
++	_dl_audit_objclose (l);
+ #endif
+-	}
+-    }
++      }
+ 
+ #ifdef SHARED
+-  if (! do_audit && GLRO(dl_naudit) > 0)
+-    {
+-      do_audit = 1;
+-      goto again;
+-    }
++  if (last_ns >= 0)
++    _dl_audit_activity_nsid (last_ns, LA_ACT_CONSISTENT);
+ 
+   if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_STATISTICS))
+     _dl_debug_printf ("\nruntime linker statistics:\n"
 diff --git a/elf/dl-hwcaps.c b/elf/dl-hwcaps.c
 index 6f161f6ad5..92eb53790e 100644
 --- a/elf/dl-hwcaps.c
@@ -452,6 +1022,96 @@ index 6f161f6ad5..92eb53790e 100644
    struct r_strlenpair *overall_result
      = malloc (*sz * sizeof (*result) + total);
    if (overall_result == NULL)
+diff --git a/elf/dl-init.c b/elf/dl-init.c
+index deefeb099a..77b2edd838 100644
+--- a/elf/dl-init.c
++++ b/elf/dl-init.c
+@@ -21,14 +21,19 @@
+ #include <ldsodefs.h>
+ #include <elf-initfini.h>
+ 
++struct link_map *_dl_init_called_list;
+ 
+ static void
+ call_init (struct link_map *l, int argc, char **argv, char **env)
+ {
++  /* Do not run constructors for proxy objects.  */
++  if (l != l->l_real)
++    return;
++
+   /* If the object has not been relocated, this is a bug.  The
+      function pointers are invalid in this case.  (Executables do not
+-     need relocation, and neither do proxy objects.)  */
+-  assert (l->l_real->l_relocated || l->l_real->l_type == lt_executable);
++     need relocation.)  */
++  assert (l->l_relocated || l->l_type == lt_executable);
+ 
+   if (l->l_init_called)
+     /* This object is all done.  */
+@@ -38,6 +43,21 @@ call_init (struct link_map *l, int argc, char **argv, char **env)
+      dependency.  */
+   l->l_init_called = 1;
+ 
++  /* Help an already-running dlclose: The just-loaded object must not
++     be removed during the current pass.  (No effect if no dlclose in
++     progress.)  */
++  l->l_map_used = 1;
++
++  /* Record execution before starting any initializers.  This way, if
++     the initializers themselves call dlopen, their ELF destructors
++     will eventually be run before this object is destructed, matching
++     that their ELF constructors have run before this object was
++     constructed.  _dl_fini uses this list for audit callbacks, so
++     register objects on the list even if they do not have a
++     constructor.  */
++  l->l_init_called_next = _dl_init_called_list;
++  _dl_init_called_list = l;
++
+   /* Check for object which constructors we do not run here.  */
+   if (__builtin_expect (l->l_name[0], 'a') == '\0'
+       && l->l_type == lt_executable)
+diff --git a/elf/dl-load.c b/elf/dl-load.c
+index 1ad0868dad..cb59c21ce7 100644
+--- a/elf/dl-load.c
++++ b/elf/dl-load.c
+@@ -1263,7 +1263,7 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
+ 
+     /* Now process the load commands and map segments into memory.
+        This is responsible for filling in:
+-       l_map_start, l_map_end, l_addr, l_contiguous, l_text_end, l_phdr
++       l_map_start, l_map_end, l_addr, l_contiguous, l_phdr
+      */
+     errstring = _dl_map_segments (l, fd, header, type, loadcmds, nloadcmds,
+ 				  maplength, has_holes, loader);
+diff --git a/elf/dl-load.h b/elf/dl-load.h
+index f98d264e90..ebf7d74cd0 100644
+--- a/elf/dl-load.h
++++ b/elf/dl-load.h
+@@ -83,14 +83,11 @@ struct loadcmd
+ 
+ /* This is a subroutine of _dl_map_segments.  It should be called for each
+    load command, some time after L->l_addr has been set correctly.  It is
+-   responsible for setting up the l_text_end and l_phdr fields.  */
++   responsible for setting the l_phdr fields  */
+ static __always_inline void
+ _dl_postprocess_loadcmd (struct link_map *l, const ElfW(Ehdr) *header,
+                          const struct loadcmd *c)
+ {
+-  if (c->prot & PROT_EXEC)
+-    l->l_text_end = l->l_addr + c->mapend;
+-
+   if (l->l_phdr == 0
+       && c->mapoff <= header->e_phoff
+       && ((size_t) (c->mapend - c->mapstart + c->mapoff)
+@@ -103,7 +100,7 @@ _dl_postprocess_loadcmd (struct link_map *l, const ElfW(Ehdr) *header,
+ 
+ /* This is a subroutine of _dl_map_object_from_fd.  It is responsible
+    for filling in several fields in *L: l_map_start, l_map_end, l_addr,
+-   l_contiguous, l_text_end, l_phdr.  On successful return, all the
++   l_contiguous, l_phdr.  On successful return, all the
+    segments are mapped (or copied, or whatever) from the file into their
+    final places in the address space, with the correct page permissions,
+    and any bss-like regions already zeroed.  It returns a null pointer
 diff --git a/elf/dl-lookup.c b/elf/dl-lookup.c
 index 4c86dc694e..67fb2e31e2 100644
 --- a/elf/dl-lookup.c
@@ -674,20 +1334,34 @@ index e6a56b3070..9fa3b484cf 100644
 +  }
  }
 diff --git a/elf/dso-sort-tests-1.def b/elf/dso-sort-tests-1.def
-index 5f7f18ef27..4bf9052db1 100644
+index 5f7f18ef27..61dc54f8ae 100644
 --- a/elf/dso-sort-tests-1.def
 +++ b/elf/dso-sort-tests-1.def
-@@ -64,3 +64,10 @@ output: b>a>{}<a<b
+@@ -53,14 +53,14 @@ tst-dso-ordering10: {}->a->b->c;soname({})=c
+ output: b>a>{}<a<b
+ 
+ # Complex example from Bugzilla #15311, under-linked and with circular
+-# relocation(dynamic) dependencies. While this is technically unspecified, the
+-# presumed reasonable practical behavior is for the destructor order to respect
+-# the static DT_NEEDED links (here this means the a->b->c->d order).
+-# The older dynamic_sort=1 algorithm does not achieve this, while the DFS-based
+-# dynamic_sort=2 algorithm does, although it is still arguable whether going
+-# beyond spec to do this is the right thing to do.
+-# The below expected outputs are what the two algorithms currently produce
+-# respectively, for regression testing purposes.
++# relocation(dynamic) dependencies. For both sorting algorithms, the
++# destruction order is the reverse of the construction order, and
++# relocation dependencies are not taken into account.
  tst-bz15311: {+a;+e;+f;+g;+d;%d;-d;-g;-f;-e;-a};a->b->c->d;d=>[ba];c=>a;b=>e=>a;c=>f=>b;d=>g=>c
- output(glibc.rtld.dynamic_sort=1): {+a[d>c>b>a>];+e[e>];+f[f>];+g[g>];+d[];%d(b(e(a()))a()g(c(a()f(b(e(a()))))));-d[];-g[];-f[];-e[];-a[<a<c<d<g<f<b<e];}
- output(glibc.rtld.dynamic_sort=2): {+a[d>c>b>a>];+e[e>];+f[f>];+g[g>];+d[];%d(b(e(a()))a()g(c(a()f(b(e(a()))))));-d[];-g[];-f[];-e[];-a[<g<f<a<b<c<d<e];}
+-output(glibc.rtld.dynamic_sort=1): {+a[d>c>b>a>];+e[e>];+f[f>];+g[g>];+d[];%d(b(e(a()))a()g(c(a()f(b(e(a()))))));-d[];-g[];-f[];-e[];-a[<a<c<d<g<f<b<e];}
+-output(glibc.rtld.dynamic_sort=2): {+a[d>c>b>a>];+e[e>];+f[f>];+g[g>];+d[];%d(b(e(a()))a()g(c(a()f(b(e(a()))))));-d[];-g[];-f[];-e[];-a[<g<f<a<b<c<d<e];}
++output: {+a[d>c>b>a>];+e[e>];+f[f>];+g[g>];+d[];%d(b(e(a()))a()g(c(a()f(b(e(a()))))));-d[];-g[];-f[];-e[];-a[<g<f<e<a<b<c<d];}
 +
 +# Test that even in the presence of dependency loops involving dlopen'ed
 +# object, that object is initialized last (and not unloaded prematurely).
-+# Final destructor order is indeterminate due to the cycle.
++# Final destructor order is the opposite of constructor order.
 +tst-bz28937: {+a;+b;-b;+c;%c};a->a1;a->a2;a2->a;b->b1;c->a1;c=>a1
-+output(glibc.rtld.dynamic_sort=1): {+a[a2>a1>a>];+b[b1>b>];-b[<b<b1];+c[c>];%c(a1());}<a<a2<c<a1
-+output(glibc.rtld.dynamic_sort=2): {+a[a2>a1>a>];+b[b1>b>];-b[<b<b1];+c[c>];%c(a1());}<a2<a<c<a1
++output: {+a[a2>a1>a>];+b[b1>b>];-b[<b<b1];+c[c>];%c(a1());}<c<a<a1<a2
 diff --git a/elf/elf.h b/elf/elf.h
 index 02a1b3f52f..014393f3cc 100644
 --- a/elf/elf.h
@@ -720,10 +1394,44 @@ index ca00dd1fe2..3c5e273f2b 100644
  else						# -s
  verbose	:=
 diff --git a/elf/rtld.c b/elf/rtld.c
-index cbbaf4a331..3e771a93d8 100644
+index cbbaf4a331..dd45930ff7 100644
 --- a/elf/rtld.c
 +++ b/elf/rtld.c
-@@ -2122,6 +2122,12 @@ dl_main (const ElfW(Phdr) *phdr,
+@@ -479,7 +479,6 @@ _dl_start_final (void *arg, struct dl_start_final_info *info)
+   GL(dl_rtld_map).l_real = &GL(dl_rtld_map);
+   GL(dl_rtld_map).l_map_start = (ElfW(Addr)) &__ehdr_start;
+   GL(dl_rtld_map).l_map_end = (ElfW(Addr)) _end;
+-  GL(dl_rtld_map).l_text_end = (ElfW(Addr)) _etext;
+   /* Copy the TLS related data if necessary.  */
+ #ifndef DONT_USE_BOOTSTRAP_MAP
+ # if NO_TLS_OFFSET != 0
+@@ -1124,7 +1123,6 @@ rtld_setup_main_map (struct link_map *main_map)
+   bool has_interp = false;
+ 
+   main_map->l_map_end = 0;
+-  main_map->l_text_end = 0;
+   /* Perhaps the executable has no PT_LOAD header entries at all.  */
+   main_map->l_map_start = ~0;
+   /* And it was opened directly.  */
+@@ -1216,8 +1214,6 @@ rtld_setup_main_map (struct link_map *main_map)
+ 	  allocend = main_map->l_addr + ph->p_vaddr + ph->p_memsz;
+ 	  if (main_map->l_map_end < allocend)
+ 	    main_map->l_map_end = allocend;
+-	  if ((ph->p_flags & PF_X) && allocend > main_map->l_text_end)
+-	    main_map->l_text_end = allocend;
+ 
+ 	  /* The next expected address is the page following this load
+ 	     segment.  */
+@@ -1277,8 +1273,6 @@ rtld_setup_main_map (struct link_map *main_map)
+       = (char *) main_map->l_tls_initimage + main_map->l_addr;
+   if (! main_map->l_map_end)
+     main_map->l_map_end = ~0;
+-  if (! main_map->l_text_end)
+-    main_map->l_text_end = ~0;
+   if (! GL(dl_rtld_map).l_libname && GL(dl_rtld_map).l_name)
+     {
+       /* We were invoked directly, so the program might not have a
+@@ -2122,6 +2116,12 @@ dl_main (const ElfW(Phdr) *phdr,
  	    if (l->l_faked)
  	      /* The library was not found.  */
  	      _dl_printf ("\t%s => not found\n",  l->l_libname->name);
@@ -736,6 +1444,149 @@ index cbbaf4a331..3e771a93d8 100644
  	    else
  	      _dl_printf ("\t%s => %s (0x%0*Zx)\n",
  			  DSO_FILENAME (l->l_libname->name),
+diff --git a/elf/setup-vdso.h b/elf/setup-vdso.h
+index c0807ea82b..415d5057c3 100644
+--- a/elf/setup-vdso.h
++++ b/elf/setup-vdso.h
+@@ -51,9 +51,6 @@ setup_vdso (struct link_map *main_map __attribute__ ((unused)),
+ 		l->l_addr = ph->p_vaddr;
+ 	      if (ph->p_vaddr + ph->p_memsz >= l->l_map_end)
+ 		l->l_map_end = ph->p_vaddr + ph->p_memsz;
+-	      if ((ph->p_flags & PF_X)
+-		  && ph->p_vaddr + ph->p_memsz >= l->l_text_end)
+-		l->l_text_end = ph->p_vaddr + ph->p_memsz;
+ 	    }
+ 	  else
+ 	    /* There must be no TLS segment.  */
+@@ -62,7 +59,6 @@ setup_vdso (struct link_map *main_map __attribute__ ((unused)),
+       l->l_map_start = (ElfW(Addr)) GLRO(dl_sysinfo_dso);
+       l->l_addr = l->l_map_start - l->l_addr;
+       l->l_map_end += l->l_addr;
+-      l->l_text_end += l->l_addr;
+       l->l_ld = (void *) ((ElfW(Addr)) l->l_ld + l->l_addr);
+       elf_get_dynamic_info (l, false, false);
+       _dl_setup_hash (l);
+diff --git a/elf/tst-audit23.c b/elf/tst-audit23.c
+index 4904cf1340..f40760bd70 100644
+--- a/elf/tst-audit23.c
++++ b/elf/tst-audit23.c
+@@ -98,6 +98,8 @@ do_test (int argc, char *argv[])
+     char *lname;
+     uintptr_t laddr;
+     Lmid_t lmid;
++    uintptr_t cookie;
++    uintptr_t namespace;
+     bool closed;
+   } objs[max_objs] = { [0 ... max_objs-1] = { .closed = false } };
+   size_t nobjs = 0;
+@@ -117,6 +119,9 @@ do_test (int argc, char *argv[])
+   size_t buffer_length = 0;
+   while (xgetline (&buffer, &buffer_length, out))
+     {
++      *strchrnul (buffer, '\n') = '\0';
++      printf ("info: subprocess output: %s\n", buffer);
++
+       if (startswith (buffer, "la_activity: "))
+ 	{
+ 	  uintptr_t cookie;
+@@ -125,29 +130,26 @@ do_test (int argc, char *argv[])
+ 			  &cookie);
+ 	  TEST_COMPARE (r, 2);
+ 
+-	  /* The cookie identifies the object at the head of the link map,
+-	     so we only add a new namespace if it changes from the previous
+-	     one.  This works since dlmopen is the last in the test body.  */
+-	  if (cookie != last_act_cookie && last_act_cookie != -1)
+-	    TEST_COMPARE (last_act, LA_ACT_CONSISTENT);
+-
+ 	  if (this_act == LA_ACT_ADD && acts[nacts] != cookie)
+ 	    {
++	      /* The cookie identifies the object at the head of the
++		 link map, so we only add a new namespace if it
++		 changes from the previous one.  This works since
++		 dlmopen is the last in the test body.  */
++	      if (cookie != last_act_cookie && last_act_cookie != -1)
++		TEST_COMPARE (last_act, LA_ACT_CONSISTENT);
++
+ 	      acts[nacts++] = cookie;
+ 	      last_act_cookie = cookie;
+ 	    }
+-	  /* The LA_ACT_DELETE is called in the reverse order of LA_ACT_ADD
+-	     at program termination (if the tests adds a dlclose or a library
+-	     with extra dependencies this will need to be adapted).  */
++	  /* LA_ACT_DELETE is called multiple times for each
++	     namespace, depending on destruction order.  */
+ 	  else if (this_act == LA_ACT_DELETE)
+-	    {
+-	      last_act_cookie = acts[--nacts];
+-	      TEST_COMPARE (acts[nacts], cookie);
+-	      acts[nacts] = 0;
+-	    }
++	    last_act_cookie = cookie;
+ 	  else if (this_act == LA_ACT_CONSISTENT)
+ 	    {
+ 	      TEST_COMPARE (cookie, last_act_cookie);
++	      last_act_cookie = -1;
+ 
+ 	      /* LA_ACT_DELETE must always be followed by an la_objclose.  */
+ 	      if (last_act == LA_ACT_DELETE)
+@@ -179,6 +181,8 @@ do_test (int argc, char *argv[])
+ 	  objs[nobjs].lname = lname;
+ 	  objs[nobjs].laddr = laddr;
+ 	  objs[nobjs].lmid = lmid;
++	  objs[nobjs].cookie = cookie;
++	  objs[nobjs].namespace = last_act_cookie;
+ 	  objs[nobjs].closed = false;
+ 	  nobjs++;
+ 
+@@ -201,6 +205,12 @@ do_test (int argc, char *argv[])
+ 	      if (strcmp (lname, objs[i].lname) == 0 && lmid == objs[i].lmid)
+ 		{
+ 		  TEST_COMPARE (objs[i].closed, false);
++		  TEST_COMPARE (objs[i].cookie, cookie);
++		  if (objs[i].namespace == -1)
++		    /* No LA_ACT_ADD before the first la_objopen call.  */
++		    TEST_COMPARE (acts[0], last_act_cookie);
++		  else
++		    TEST_COMPARE (objs[i].namespace, last_act_cookie);
+ 		  objs[i].closed = true;
+ 		  break;
+ 		}
+@@ -209,11 +219,7 @@ do_test (int argc, char *argv[])
+ 	  /* la_objclose should be called after la_activity(LA_ACT_DELETE) for
+ 	     the closed object's namespace.  */
+ 	  TEST_COMPARE (last_act, LA_ACT_DELETE);
+-	  if (!seen_first_objclose)
+-	    {
+-	      TEST_COMPARE (last_act_cookie, cookie);
+-	      seen_first_objclose = true;
+-	    }
++	  seen_first_objclose = true;
+ 	}
+     }
+ 
+diff --git a/elf/tst-auditmod28.c b/elf/tst-auditmod28.c
+index db7ba95abe..9e0a122c38 100644
+--- a/elf/tst-auditmod28.c
++++ b/elf/tst-auditmod28.c
+@@ -71,6 +71,17 @@ la_version (unsigned int current)
+   TEST_VERIFY (dladdr1 (&_exit, &info, &extra_info, RTLD_DL_LINKMAP) != 0);
+   TEST_VERIFY (extra_info == handle);
+ 
++  /* Check _dl_find_object.  */
++  struct dl_find_object dlfo;
++  TEST_COMPARE (_dl_find_object (__builtin_return_address (0), &dlfo), 0);
++  /* "ld.so" is seen with --enable-hardcoded-path-in-tests.  */
++  if (strcmp (basename (dlfo.dlfo_link_map->l_name), "ld.so") != 0)
++    TEST_COMPARE_STRING (basename (dlfo.dlfo_link_map->l_name), LD_SO);
++  TEST_COMPARE (_dl_find_object (dlsym (handle, "environ"), &dlfo), 0);
++  TEST_COMPARE_STRING (basename (dlfo.dlfo_link_map->l_name), LIBC_SO);
++  TEST_COMPARE (_dl_find_object ((void *) 1, &dlfo), -1);
++  TEST_COMPARE (_dl_find_object ((void *) -1, &dlfo), -1);
++
+   /* Verify that dlmopen creates a new namespace.  */
+   void *dlmopen_handle = xdlmopen (LM_ID_NEWLM, LIBC_SO, RTLD_NOW);
+   TEST_VERIFY (dlmopen_handle != handle);
 diff --git a/elf/tst-dlmopen-twice-mod1.c b/elf/tst-dlmopen-twice-mod1.c
 new file mode 100644
 index 0000000000..0eaf04948c
@@ -1620,6 +2471,23 @@ index 0000000000..00b1b93342
 +++ b/include/bits/wchar2-decl.h
 @@ -0,0 +1 @@
 +#include <wcsmbs/bits/wchar2-decl.h>
+diff --git a/include/link.h b/include/link.h
+index 0ac82d7c77..4eb8fe0d96 100644
+--- a/include/link.h
++++ b/include/link.h
+@@ -253,8 +253,10 @@ struct link_map
+     /* Start and finish of memory map for this object.  l_map_start
+        need not be the same as l_addr.  */
+     ElfW(Addr) l_map_start, l_map_end;
+-    /* End of the executable part of the mapping.  */
+-    ElfW(Addr) l_text_end;
++
++    /* Linked list of objects in reverse ELF constructor execution
++       order.  Head of list is stored in _dl_init_called_list.  */
++    struct link_map *l_init_called_next;
+ 
+     /* Default array for 'l_scope'.  */
+     struct r_scope_elem *l_scope_mem[4];
 diff --git a/include/resolv.h b/include/resolv.h
 index 3590b6f496..4dbbac3800 100644
 --- a/include/resolv.h
@@ -1634,19 +2502,36 @@ index 3590b6f496..4dbbac3800 100644
  # endif /* _RESOLV_H_ && !_ISOMAC */
  #endif
 diff --git a/io/Makefile b/io/Makefile
-index b1710407d0..fb363c612c 100644
+index b1710407d0..b896484320 100644
 --- a/io/Makefile
 +++ b/io/Makefile
-@@ -80,7 +80,8 @@ tests		:= test-utime test-stat test-stat2 test-lfs tst-getcwd \
+@@ -59,6 +59,7 @@ routines :=								\
+ 	ftw64-time64							\
+ 	closefrom close_range
+ 
++
+ others		:= pwd
+ test-srcs	:= ftwtest ftwtest-time64
+ tests		:= test-utime test-stat test-stat2 test-lfs tst-getcwd \
+@@ -80,7 +81,9 @@ tests		:= test-utime test-stat test-stat2 test-lfs tst-getcwd \
  		   tst-utimensat \
  		   tst-closefrom \
  		   tst-close_range \
 -		   tst-ftw-bz28126
 +		   tst-ftw-bz28126 \
-+		   tst-fcntl-lock
++		   tst-fcntl-lock \
++		   tst-fcntl-lock-lfs
  
  tests-time64 := \
    tst-fcntl-time64 \
+diff --git a/io/tst-fcntl-lock-lfs.c b/io/tst-fcntl-lock-lfs.c
+new file mode 100644
+index 0000000000..f2a909fb02
+--- /dev/null
++++ b/io/tst-fcntl-lock-lfs.c
+@@ -0,0 +1,2 @@
++#define _FILE_OFFSET_BITS 64
++#include <io/tst-fcntl-lock.c>
 diff --git a/io/tst-fcntl-lock.c b/io/tst-fcntl-lock.c
 new file mode 100644
 index 0000000000..357c4b7b56
@@ -2446,6 +3331,93 @@ index 9becb62033..31c64275f0 100644
      map = __nscd_get_mapping (GETFDHST, "hosts", &__hst_map_handle.mapped);
  
    if (map == NO_MAPPING)
+diff --git a/nss/Makefile b/nss/Makefile
+index a978e3927a..7a52c68791 100644
+--- a/nss/Makefile
++++ b/nss/Makefile
+@@ -81,6 +81,7 @@ tests-container := \
+   tst-nss-test3 \
+   tst-reload1 \
+   tst-reload2 \
++  tst-nss-gai-hv2-canonname \
+ # tests-container
+ 
+ # Tests which need libdl
+@@ -144,7 +145,17 @@ libnss_compat-inhibit-o	= $(filter-out .os,$(object-suffixes))
+ ifeq ($(build-static-nss),yes)
+ tests-static		+= tst-nss-static
+ endif
+-extra-test-objs		+= nss_test1.os nss_test2.os nss_test_errno.os
++extra-test-objs		+= nss_test1.os nss_test2.os nss_test_errno.os \
++			   nss_test_gai_hv2_canonname.os
++
++ifeq ($(run-built-tests),yes)
++ifneq (no,$(PERL))
++tests-special += $(objpfx)mtrace-tst-nss-gai-hv2-canonname.out
++endif
++endif
++
++generated += mtrace-tst-nss-gai-hv2-canonname.out \
++		tst-nss-gai-hv2-canonname.mtrace
+ 
+ include ../Rules
+ 
+@@ -179,12 +190,16 @@ rtld-tests-LDFLAGS += -Wl,--dynamic-list=nss_test.ver
+ libof-nss_test1 = extramodules
+ libof-nss_test2 = extramodules
+ libof-nss_test_errno = extramodules
++libof-nss_test_gai_hv2_canonname = extramodules
+ $(objpfx)/libnss_test1.so: $(objpfx)nss_test1.os $(link-libc-deps)
+ 	$(build-module)
+ $(objpfx)/libnss_test2.so: $(objpfx)nss_test2.os $(link-libc-deps)
+ 	$(build-module)
+ $(objpfx)/libnss_test_errno.so: $(objpfx)nss_test_errno.os $(link-libc-deps)
+ 	$(build-module)
++$(objpfx)/libnss_test_gai_hv2_canonname.so: \
++  $(objpfx)nss_test_gai_hv2_canonname.os $(link-libc-deps)
++	$(build-module)
+ $(objpfx)nss_test2.os : nss_test1.c
+ # Use the nss_files suffix for these objects as well.
+ $(objpfx)/libnss_test1.so$(libnss_files.so-version): $(objpfx)/libnss_test1.so
+@@ -194,10 +209,14 @@ $(objpfx)/libnss_test2.so$(libnss_files.so-version): $(objpfx)/libnss_test2.so
+ $(objpfx)/libnss_test_errno.so$(libnss_files.so-version): \
+   $(objpfx)/libnss_test_errno.so
+ 	$(make-link)
++$(objpfx)/libnss_test_gai_hv2_canonname.so$(libnss_files.so-version): \
++  $(objpfx)/libnss_test_gai_hv2_canonname.so
++	$(make-link)
+ $(patsubst %,$(objpfx)%.out,$(tests) $(tests-container)) : \
+ 	$(objpfx)/libnss_test1.so$(libnss_files.so-version) \
+ 	$(objpfx)/libnss_test2.so$(libnss_files.so-version) \
+-	$(objpfx)/libnss_test_errno.so$(libnss_files.so-version)
++	$(objpfx)/libnss_test_errno.so$(libnss_files.so-version) \
++	$(objpfx)/libnss_test_gai_hv2_canonname.so$(libnss_files.so-version)
+ 
+ ifeq (yes,$(have-thread-library))
+ $(objpfx)tst-cancel-getpwuid_r: $(shared-thread-library)
+@@ -206,6 +225,17 @@ endif
+ $(objpfx)tst-nss-files-alias-leak.out: $(objpfx)/libnss_files.so
+ $(objpfx)tst-nss-files-alias-truncated.out: $(objpfx)/libnss_files.so
+ 
++tst-nss-gai-hv2-canonname-ENV = \
++		MALLOC_TRACE=$(objpfx)tst-nss-gai-hv2-canonname.mtrace \
++		LD_PRELOAD=$(common-objpfx)/malloc/libc_malloc_debug.so
++$(objpfx)mtrace-tst-nss-gai-hv2-canonname.out: \
++  $(objpfx)tst-nss-gai-hv2-canonname.out
++	{ test -r $(objpfx)tst-nss-gai-hv2-canonname.mtrace \
++	|| ( echo "tst-nss-gai-hv2-canonname.mtrace does not exist"; exit 77; ) \
++	&& $(common-objpfx)malloc/mtrace \
++	$(objpfx)tst-nss-gai-hv2-canonname.mtrace; } > $@; \
++	$(evaluate-test)
++
+ # Disable DT_RUNPATH on NSS tests so that the glibc internal NSS
+ # functions can load testing NSS modules via DT_RPATH.
+ LDFLAGS-tst-nss-test1 = -Wl,--disable-new-dtags
+@@ -214,3 +244,4 @@ LDFLAGS-tst-nss-test3 = -Wl,--disable-new-dtags
+ LDFLAGS-tst-nss-test4 = -Wl,--disable-new-dtags
+ LDFLAGS-tst-nss-test5 = -Wl,--disable-new-dtags
+ LDFLAGS-tst-nss-test_errno = -Wl,--disable-new-dtags
++LDFLAGS-tst-nss-test_gai_hv2_canonname = -Wl,--disable-new-dtags
 diff --git a/nss/getent.c b/nss/getent.c
 index 8178b4b470..d2d2524b0c 100644
 --- a/nss/getent.c
@@ -2489,6 +3461,68 @@ index 8178b4b470..d2d2524b0c 100644
      default:
        return ARGP_ERR_UNKNOWN;
      }
+diff --git a/nss/nss_test_gai_hv2_canonname.c b/nss/nss_test_gai_hv2_canonname.c
+new file mode 100644
+index 0000000000..4439c83c9f
+--- /dev/null
++++ b/nss/nss_test_gai_hv2_canonname.c
+@@ -0,0 +1,56 @@
++/* NSS service provider that only provides gethostbyname2_r.
++   Copyright The GNU Toolchain Authors.
++   This file is part of the GNU C Library.
++
++   The GNU C Library is free software; you can redistribute it and/or
++   modify it under the terms of the GNU Lesser General Public
++   License as published by the Free Software Foundation; either
++   version 2.1 of the License, or (at your option) any later version.
++
++   The GNU C Library is distributed in the hope that it will be useful,
++   but WITHOUT ANY WARRANTY; without even the implied warranty of
++   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
++   Lesser General Public License for more details.
++
++   You should have received a copy of the GNU Lesser General Public
++   License along with the GNU C Library; if not, see
++   <https://www.gnu.org/licenses/>.  */
++
++#include <nss.h>
++#include <stdlib.h>
++#include <string.h>
++#include "nss/tst-nss-gai-hv2-canonname.h"
++
++/* Catch misnamed functions.  */
++#pragma GCC diagnostic error "-Wmissing-prototypes"
++NSS_DECLARE_MODULE_FUNCTIONS (test_gai_hv2_canonname)
++
++extern enum nss_status _nss_files_gethostbyname2_r (const char *, int,
++						    struct hostent *, char *,
++						    size_t, int *, int *);
++
++enum nss_status
++_nss_test_gai_hv2_canonname_gethostbyname2_r (const char *name, int af,
++					      struct hostent *result,
++					      char *buffer, size_t buflen,
++					      int *errnop, int *herrnop)
++{
++  return _nss_files_gethostbyname2_r (name, af, result, buffer, buflen, errnop,
++				      herrnop);
++}
++
++enum nss_status
++_nss_test_gai_hv2_canonname_getcanonname_r (const char *name, char *buffer,
++					    size_t buflen, char **result,
++					    int *errnop, int *h_errnop)
++{
++  /* We expect QUERYNAME, which is a small enough string that it shouldn't fail
++     the test.  */
++  if (memcmp (QUERYNAME, name, sizeof (QUERYNAME))
++      || buflen < sizeof (QUERYNAME))
++    abort ();
++
++  strncpy (buffer, name, buflen);
++  *result = buffer;
++  return NSS_STATUS_SUCCESS;
++}
 diff --git a/nss/tst-nss-files-hosts-long.c b/nss/tst-nss-files-hosts-long.c
 index 3942cf5fca..a7697e3143 100644
 --- a/nss/tst-nss-files-hosts-long.c
@@ -2513,6 +3547,96 @@ index 3942cf5fca..a7697e3143 100644
    if (ret != 0)
      FAIL_EXIT1("ahostsv6 failed");
  
+diff --git a/nss/tst-nss-gai-hv2-canonname.c b/nss/tst-nss-gai-hv2-canonname.c
+new file mode 100644
+index 0000000000..7db53cf09d
+--- /dev/null
++++ b/nss/tst-nss-gai-hv2-canonname.c
+@@ -0,0 +1,66 @@
++/* Test NSS query path for plugins that only implement gethostbyname2
++   (#30843).
++   Copyright The GNU Toolchain Authors.
++   This file is part of the GNU C Library.
++
++   The GNU C Library is free software; you can redistribute it and/or
++   modify it under the terms of the GNU Lesser General Public
++   License as published by the Free Software Foundation; either
++   version 2.1 of the License, or (at your option) any later version.
++
++   The GNU C Library is distributed in the hope that it will be useful,
++   but WITHOUT ANY WARRANTY; without even the implied warranty of
++   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
++   Lesser General Public License for more details.
++
++   You should have received a copy of the GNU Lesser General Public
++   License along with the GNU C Library; if not, see
++   <https://www.gnu.org/licenses/>.  */
++
++#include <nss.h>
++#include <netdb.h>
++#include <stdlib.h>
++#include <string.h>
++#include <mcheck.h>
++#include <support/check.h>
++#include <support/xstdio.h>
++#include "nss/tst-nss-gai-hv2-canonname.h"
++
++#define PREPARE do_prepare
++
++static void do_prepare (int a, char **av)
++{
++  FILE *hosts = xfopen ("/etc/hosts", "w");
++  for (unsigned i = 2; i < 255; i++)
++    {
++      fprintf (hosts, "ff01::ff02:ff03:%u:2\ttest.example.com\n", i);
++      fprintf (hosts, "192.168.0.%u\ttest.example.com\n", i);
++    }
++  xfclose (hosts);
++}
++
++static int
++do_test (void)
++{
++  mtrace ();
++
++  __nss_configure_lookup ("hosts", "test_gai_hv2_canonname");
++
++  struct addrinfo hints = {};
++  struct addrinfo *result = NULL;
++
++  hints.ai_family = AF_INET6;
++  hints.ai_flags = AI_ALL | AI_V4MAPPED | AI_CANONNAME;
++
++  int ret = getaddrinfo (QUERYNAME, NULL, &hints, &result);
++
++  if (ret != 0)
++    FAIL_EXIT1 ("getaddrinfo failed: %s\n", gai_strerror (ret));
++
++  TEST_COMPARE_STRING (result->ai_canonname, QUERYNAME);
++
++  freeaddrinfo(result);
++  return 0;
++}
++
++#include <support/test-driver.c>
+diff --git a/nss/tst-nss-gai-hv2-canonname.h b/nss/tst-nss-gai-hv2-canonname.h
+new file mode 100644
+index 0000000000..14f2a9cb08
+--- /dev/null
++++ b/nss/tst-nss-gai-hv2-canonname.h
+@@ -0,0 +1 @@
++#define QUERYNAME "test.example.com"
+diff --git a/nss/tst-nss-gai-hv2-canonname.root/postclean.req b/nss/tst-nss-gai-hv2-canonname.root/postclean.req
+new file mode 100644
+index 0000000000..e69de29bb2
+diff --git a/nss/tst-nss-gai-hv2-canonname.root/tst-nss-gai-hv2-canonname.script b/nss/tst-nss-gai-hv2-canonname.root/tst-nss-gai-hv2-canonname.script
+new file mode 100644
+index 0000000000..31848b4a28
+--- /dev/null
++++ b/nss/tst-nss-gai-hv2-canonname.root/tst-nss-gai-hv2-canonname.script
+@@ -0,0 +1,2 @@
++cp $B/nss/libnss_test_gai_hv2_canonname.so $L/libnss_test_gai_hv2_canonname.so.2
++su
 diff --git a/nss/tst-reload1.c b/nss/tst-reload1.c
 index fdc5bdd65b..bc32bb132a 100644
 --- a/nss/tst-reload1.c
@@ -2548,7 +3672,7 @@ index fdc5bdd65b..bc32bb132a 100644
  
  static struct hostent host_table_2[] = {
 diff --git a/resolv/Makefile b/resolv/Makefile
-index 5b15321f9b..f8a92c6cff 100644
+index 5b15321f9b..28cedf49ee 100644
 --- a/resolv/Makefile
 +++ b/resolv/Makefile
 @@ -40,12 +40,16 @@ routines := \
@@ -2568,7 +3692,7 @@ index 5b15321f9b..f8a92c6cff 100644
    ns_samename \
    nsap_addr \
    nss_dns_functions \
-@@ -89,9 +93,12 @@ tests += \
+@@ -89,11 +93,15 @@ tests += \
    tst-ns_name_pton \
    tst-res_hconf_reorder \
    tst-res_hnok \
@@ -2580,8 +3704,11 @@ index 5b15321f9b..f8a92c6cff 100644
 +  tst-resolv-invalid-cname \
    tst-resolv-network \
    tst-resolv-noaaaa \
++  tst-resolv-noaaaa-vc \
    tst-resolv-nondecimal \
-@@ -104,6 +111,18 @@ tests += \
+   tst-resolv-res_init-multi \
+   tst-resolv-search \
+@@ -104,6 +112,18 @@ tests += \
  tests-internal += tst-resolv-txnid-collision
  tests-static += tst-resolv-txnid-collision
  
@@ -2600,7 +3727,7 @@ index 5b15321f9b..f8a92c6cff 100644
  # These tests need libdl.
  ifeq (yes,$(build-shared))
  tests += \
-@@ -258,8 +277,10 @@ $(objpfx)tst-resolv-ai_idn.out: $(gen-locales)
+@@ -258,8 +278,10 @@ $(objpfx)tst-resolv-ai_idn.out: $(gen-locales)
  $(objpfx)tst-resolv-ai_idn-latin1.out: $(gen-locales)
  $(objpfx)tst-resolv-ai_idn-nolibidn2.out: \
    $(gen-locales) $(objpfx)tst-no-libidn2.so
@@ -2611,15 +3738,17 @@ index 5b15321f9b..f8a92c6cff 100644
  $(objpfx)tst-resolv-edns: $(objpfx)libresolv.so $(shared-thread-library)
  $(objpfx)tst-resolv-network: $(objpfx)libresolv.so $(shared-thread-library)
  $(objpfx)tst-resolv-res_init: $(objpfx)libresolv.so
-@@ -267,6 +288,8 @@ $(objpfx)tst-resolv-res_init-multi: $(objpfx)libresolv.so \
+@@ -267,7 +289,10 @@ $(objpfx)tst-resolv-res_init-multi: $(objpfx)libresolv.so \
    $(shared-thread-library)
  $(objpfx)tst-resolv-res_init-thread: $(objpfx)libresolv.so \
    $(shared-thread-library)
 +$(objpfx)tst-resolv-invalid-cname: $(objpfx)libresolv.so \
 +  $(shared-thread-library)
  $(objpfx)tst-resolv-noaaaa: $(objpfx)libresolv.so $(shared-thread-library)
++$(objpfx)tst-resolv-noaaaa-vc: $(objpfx)libresolv.so $(shared-thread-library)
  $(objpfx)tst-resolv-nondecimal: $(objpfx)libresolv.so $(shared-thread-library)
  $(objpfx)tst-resolv-qtypes: $(objpfx)libresolv.so $(shared-thread-library)
+ $(objpfx)tst-resolv-rotate: $(objpfx)libresolv.so $(shared-thread-library)
 diff --git a/resolv/README b/resolv/README
 index 514e9bb617..2146bc3b27 100644
 --- a/resolv/README
@@ -3084,7 +4213,7 @@ index 0000000000..9a47d8e97a
 +  return *a == 0 && *b == 0;
 +}
 diff --git a/resolv/nss_dns/dns-host.c b/resolv/nss_dns/dns-host.c
-index 544cffbecd..9fa81f23c8 100644
+index 544cffbecd..227734da5c 100644
 --- a/resolv/nss_dns/dns-host.c
 +++ b/resolv/nss_dns/dns-host.c
 @@ -69,6 +69,7 @@
@@ -3358,7 +4487,7 @@ index 544cffbecd..9fa81f23c8 100644
 -				host_buffer.buf->buf, 2048, NULL,
 -				NULL, NULL, NULL, NULL);
 +				dns_packet_buffer, sizeof (dns_packet_buffer),
-+				NULL, NULL, NULL, NULL, NULL);
++				&alt_dns_packet_buffer, NULL, NULL, NULL, NULL);
        if (n >= 0)
 -	status = gaih_getanswer_noaaaa (host_buffer.buf, n,
 -					name, pat, buffer, buflen,
@@ -6047,6 +7176,141 @@ index 0000000000..05725225af
 +  resolv_response_add_data (b, "", 1);
 +  resolv_response_close_record (b);
 +}
+diff --git a/resolv/tst-resolv-noaaaa-vc.c b/resolv/tst-resolv-noaaaa-vc.c
+new file mode 100644
+index 0000000000..9f5aebd99f
+--- /dev/null
++++ b/resolv/tst-resolv-noaaaa-vc.c
+@@ -0,0 +1,129 @@
++/* Test the RES_NOAAAA resolver option with a large response.
++   Copyright (C) 2022-2023 Free Software Foundation, Inc.
++   This file is part of the GNU C Library.
++
++   The GNU C Library is free software; you can redistribute it and/or
++   modify it under the terms of the GNU Lesser General Public
++   License as published by the Free Software Foundation; either
++   version 2.1 of the License, or (at your option) any later version.
++
++   The GNU C Library is distributed in the hope that it will be useful,
++   but WITHOUT ANY WARRANTY; without even the implied warranty of
++   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
++   Lesser General Public License for more details.
++
++   You should have received a copy of the GNU Lesser General Public
++   License along with the GNU C Library; if not, see
++   <https://www.gnu.org/licenses/>.  */
++
++#include <errno.h>
++#include <netdb.h>
++#include <resolv.h>
++#include <stdbool.h>
++#include <stdlib.h>
++#include <support/check.h>
++#include <support/check_nss.h>
++#include <support/resolv_test.h>
++#include <support/support.h>
++#include <support/xmemstream.h>
++
++/* Used to keep track of the number of queries.  */
++static volatile unsigned int queries;
++
++/* If true, add a large TXT record at the start of the answer section.  */
++static volatile bool stuff_txt;
++
++static void
++response (const struct resolv_response_context *ctx,
++          struct resolv_response_builder *b,
++          const char *qname, uint16_t qclass, uint16_t qtype)
++{
++  /* If not using TCP, just force its use.  */
++  if (!ctx->tcp)
++    {
++      struct resolv_response_flags flags = {.tc = true};
++      resolv_response_init (b, flags);
++      resolv_response_add_question (b, qname, qclass, qtype);
++      return;
++    }
++
++  /* The test needs to send four queries; the first three are used to
++     grow the NSS buffer via the ERANGE handshake.  */
++  ++queries;
++  TEST_VERIFY (queries <= 4);
++
++  /* AAAA queries are supposed to be disabled.  */
++  TEST_COMPARE (qtype, T_A);
++  TEST_COMPARE (qclass, C_IN);
++  TEST_COMPARE_STRING (qname, "example.com");
++
++  struct resolv_response_flags flags = {};
++  resolv_response_init (b, flags);
++  resolv_response_add_question (b, qname, qclass, qtype);
++
++  resolv_response_section (b, ns_s_an);
++
++  if (stuff_txt)
++    {
++      resolv_response_open_record (b, qname, qclass, T_TXT, 60);
++      int zero = 0;
++      for (int i = 0; i <= 15000; ++i)
++        resolv_response_add_data (b, &zero, sizeof (zero));
++      resolv_response_close_record (b);
++    }
++
++  for (int i = 0; i < 200; ++i)
++    {
++      resolv_response_open_record (b, qname, qclass, qtype, 60);
++      char ipv4[4] = {192, 0, 2, i + 1};
++      resolv_response_add_data (b, &ipv4, sizeof (ipv4));
++      resolv_response_close_record (b);
++    }
++}
++
++static int
++do_test (void)
++{
++  struct resolv_test *obj = resolv_test_start
++    ((struct resolv_redirect_config)
++     {
++       .response_callback = response
++     });
++
++  _res.options |= RES_NOAAAA;
++
++  for (int do_stuff_txt = 0; do_stuff_txt < 2; ++do_stuff_txt)
++    {
++      queries = 0;
++      stuff_txt = do_stuff_txt;
++
++      struct addrinfo *ai = NULL;
++      int ret;
++      ret = getaddrinfo ("example.com", "80",
++                         &(struct addrinfo)
++                         {
++                           .ai_family = AF_UNSPEC,
++                           .ai_socktype = SOCK_STREAM,
++                         }, &ai);
++
++      char *expected_result;
++      {
++        struct xmemstream mem;
++        xopen_memstream (&mem);
++        for (int i = 0; i < 200; ++i)
++          fprintf (mem.out, "address: STREAM/TCP 192.0.2.%d 80\n", i + 1);
++        xfclose_memstream (&mem);
++        expected_result = mem.buffer;
++      }
++
++      check_addrinfo ("example.com", ai, ret, expected_result);
++
++      free (expected_result);
++      freeaddrinfo (ai);
++    }
++
++  resolv_test_end (obj);
++  return 0;
++}
++
++#include <support/test-driver.c>
 diff --git a/scripts/dso-ordering-test.py b/scripts/dso-ordering-test.py
 index 2dd6bfda18..b87cf2f809 100644
 --- a/scripts/dso-ordering-test.py
@@ -6803,14 +8067,37 @@ index 909b208578..d66f0b9c45 100644
  	ldp	q2, q3, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*1]
  	ldp	q4, q5, [x29, #OFFSET_RV + DL_OFFSET_RV_V0 + 32*2]
 diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
-index 050a3032de..6b256b8388 100644
+index 050a3032de..ab8a7fbf84 100644
 --- a/sysdeps/generic/ldsodefs.h
 +++ b/sysdeps/generic/ldsodefs.h
-@@ -1048,9 +1048,11 @@ extern void _dl_init (struct link_map *main_map, int argc, char **argv,
+@@ -105,6 +105,9 @@ typedef struct link_map *lookup_t;
+    DT_PREINIT_ARRAY.  */
+ typedef void (*dl_init_t) (int, char **, char **);
+ 
++/* Type of a destructor function, in DT_FINI, DT_FINI_ARRAY.  */
++typedef void (*fini_t) (void);
++
+ /* On some architectures a pointer to a function is not just a pointer
+    to the actual code of the function but rather an architecture
+    specific descriptor. */
+@@ -1044,13 +1047,24 @@ extern int _dl_check_map_versions (struct link_map *map, int verbose,
+ extern void _dl_init (struct link_map *main_map, int argc, char **argv,
+ 		      char **env) attribute_hidden;
+ 
++/* List of ELF objects in reverse order of their constructor
++   invocation.  */
++extern struct link_map *_dl_init_called_list attribute_hidden;
++
+ /* Call the finalizer functions of all shared objects whose
     initializer functions have completed.  */
  extern void _dl_fini (void) attribute_hidden;
  
 -/* Sort array MAPS according to dependencies of the contained objects.  */
++/* Invoke the DT_FINI_ARRAY and DT_FINI destructors for MAP, which
++   must be a struct link_map *.  Can be used as an argument to
++   _dl_catch_exception.  */
++void _dl_call_fini (void *map) attribute_hidden;
++
 +/* Sort array MAPS according to dependencies of the contained objects.
 +   If FORCE_FIRST, MAPS[0] keeps its place even if the dependencies
 +   say otherwise.  */
@@ -7096,10 +8383,81 @@ index d3a6837fd2..425f514c5c 100644
  typedef pthread_rwlock_t __libc_rwlock_t;
  
 diff --git a/sysdeps/posix/getaddrinfo.c b/sysdeps/posix/getaddrinfo.c
-index bcff909b2f..5cda9bb072 100644
+index bcff909b2f..f975dcd2bc 100644
 --- a/sysdeps/posix/getaddrinfo.c
 +++ b/sysdeps/posix/getaddrinfo.c
-@@ -540,11 +540,11 @@ get_nscd_addresses (const char *name, const struct addrinfo *req,
+@@ -120,6 +120,7 @@ struct gaih_result
+ {
+   struct gaih_addrtuple *at;
+   char *canon;
++  char *h_name;
+   bool free_at;
+   bool got_ipv6;
+ };
+@@ -165,6 +166,7 @@ gaih_result_reset (struct gaih_result *res)
+   if (res->free_at)
+     free (res->at);
+   free (res->canon);
++  free (res->h_name);
+   memset (res, 0, sizeof (*res));
+ }
+ 
+@@ -203,9 +205,8 @@ gaih_inet_serv (const char *servicename, const struct gaih_typeproto *tp,
+   return 0;
+ }
+ 
+-/* Convert struct hostent to a list of struct gaih_addrtuple objects.  h_name
+-   is not copied, and the struct hostent object must not be deallocated
+-   prematurely.  The new addresses are appended to the tuple array in RES.  */
++/* Convert struct hostent to a list of struct gaih_addrtuple objects.  The new
++   addresses are appended to the tuple array in RES.  */
+ static bool
+ convert_hostent_to_gaih_addrtuple (const struct addrinfo *req, int family,
+ 				   struct hostent *h, struct gaih_result *res)
+@@ -238,6 +239,15 @@ convert_hostent_to_gaih_addrtuple (const struct addrinfo *req, int family,
+   res->at = array;
+   res->free_at = true;
+ 
++  /* Duplicate h_name because it may get reclaimed when the underlying storage
++     is freed.  */
++  if (res->h_name == NULL)
++    {
++      res->h_name = __strdup (h->h_name);
++      if (res->h_name == NULL)
++	return false;
++    }
++
+   /* Update the next pointers on reallocation.  */
+   for (size_t i = 0; i < old; i++)
+     array[i].next = array + i + 1;
+@@ -262,7 +272,6 @@ convert_hostent_to_gaih_addrtuple (const struct addrinfo *req, int family,
+ 	}
+       array[i].next = array + i + 1;
+     }
+-  array[0].name = h->h_name;
+   array[count - 1].next = NULL;
+ 
+   return true;
+@@ -324,15 +333,15 @@ gethosts (nss_gethostbyname3_r fct, int family, const char *name,
+    memory allocation failure.  The returned string is allocated on the
+    heap; the caller has to free it.  */
+ static char *
+-getcanonname (nss_action_list nip, struct gaih_addrtuple *at, const char *name)
++getcanonname (nss_action_list nip, const char *hname, const char *name)
+ {
+   nss_getcanonname_r *cfct = __nss_lookup_function (nip, "getcanonname_r");
+   char *s = (char *) name;
+   if (cfct != NULL)
+     {
+       char buf[256];
+-      if (DL_CALL_FCT (cfct, (at->name ?: name, buf, sizeof (buf),
+-			      &s, &errno, &h_errno)) != NSS_STATUS_SUCCESS)
++      if (DL_CALL_FCT (cfct, (hname ?: name, buf, sizeof (buf), &s, &errno,
++			      &h_errno)) != NSS_STATUS_SUCCESS)
+ 	/* If the canonical name cannot be determined, use the passed
+ 	   string.  */
+ 	s = (char *) name;
+@@ -540,11 +549,11 @@ get_nscd_addresses (const char *name, const struct addrinfo *req,
  	  at[count].addr[2] = htonl (0xffff);
  	}
        else if (req->ai_family == AF_UNSPEC
@@ -7114,6 +8472,26 @@ index bcff909b2f..5cda9bb072 100644
  	    res->got_ipv6 = true;
  	}
        at[count].next = at + count + 1;
+@@ -771,7 +780,7 @@ get_nss_addresses (const char *name, const struct addrinfo *req,
+ 		  if ((req->ai_flags & AI_CANONNAME) != 0
+ 		      && res->canon == NULL)
+ 		    {
+-		      char *canonbuf = getcanonname (nip, res->at, name);
++		      char *canonbuf = getcanonname (nip, res->h_name, name);
+ 		      if (canonbuf == NULL)
+ 			{
+ 			  __resolv_context_put (res_ctx);
+@@ -1187,9 +1196,7 @@ free_and_return:
+   if (malloc_name)
+     free ((char *) name);
+   free (addrmem);
+-  if (res.free_at)
+-    free (res.at);
+-  free (res.canon);
++  gaih_result_reset (&res);
+ 
+   return result;
+ }
 diff --git a/sysdeps/posix/system.c b/sysdeps/posix/system.c
 index 8014f63355..20c9420dd4 100644
 --- a/sysdeps/posix/system.c
@@ -8447,14 +9825,14 @@ index a263d294b1..cf35c8bfc9 100644
  {
    return INLINE_SYSCALL_CALL (getrandom, buf, buflen, flags);
 diff --git a/sysdeps/unix/sysv/linux/powerpc/bits/fcntl.h b/sysdeps/unix/sysv/linux/powerpc/bits/fcntl.h
-index d7cf158b33..49c8fac0fb 100644
+index d7cf158b33..0ca6e69ee9 100644
 --- a/sysdeps/unix/sysv/linux/powerpc/bits/fcntl.h
 +++ b/sysdeps/unix/sysv/linux/powerpc/bits/fcntl.h
 @@ -33,6 +33,12 @@
  # define __O_LARGEFILE	0200000
  #endif
  
-+#if __WORDSIZE == 64
++#if __WORDSIZE == 64 && !defined __USE_FILE_OFFSET64
 +# define F_GETLK	5
 +# define F_SETLK	6
 +# define F_SETLKW	7
@@ -9098,13 +10476,205 @@ index 037af22290..5711d1c312 100644
  
      char *path = xasprintf ("/proc/%d/fd/%d", pid, remote_fd);
 diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
-index e9f3382108..637b5a022d 100644
+index e9f3382108..d95c1efa2c 100644
 --- a/sysdeps/x86/dl-cacheinfo.h
 +++ b/sysdeps/x86/dl-cacheinfo.h
-@@ -861,6 +861,18 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
-      share of the cache, it has a substantial risk of negatively
-      impacting the performance of other threads running on the chip. */
-   unsigned long int non_temporal_threshold = shared * 3 / 4;
+@@ -478,7 +478,7 @@ handle_zhaoxin (int name)
+ }
+ 
+ static void
+-get_common_cache_info (long int *shared_ptr, unsigned int *threads_ptr,
++get_common_cache_info (long int *shared_ptr, long int * shared_per_thread_ptr, unsigned int *threads_ptr,
+                 long int core)
+ {
+   unsigned int eax;
+@@ -497,6 +497,7 @@ get_common_cache_info (long int *shared_ptr, unsigned int *threads_ptr,
+   unsigned int family = cpu_features->basic.family;
+   unsigned int model = cpu_features->basic.model;
+   long int shared = *shared_ptr;
++  long int shared_per_thread = *shared_per_thread_ptr;
+   unsigned int threads = *threads_ptr;
+   bool inclusive_cache = true;
+   bool support_count_mask = true;
+@@ -512,6 +513,7 @@ get_common_cache_info (long int *shared_ptr, unsigned int *threads_ptr,
+       /* Try L2 otherwise.  */
+       level  = 2;
+       shared = core;
++      shared_per_thread = core;
+       threads_l2 = 0;
+       threads_l3 = -1;
+     }
+@@ -668,29 +670,27 @@ get_common_cache_info (long int *shared_ptr, unsigned int *threads_ptr,
+         }
+       else
+         {
+-intel_bug_no_cache_info:
+-          /* Assume that all logical threads share the highest cache
+-             level.  */
+-          threads
+-            = ((cpu_features->features[CPUID_INDEX_1].cpuid.ebx >> 16)
+-	       & 0xff);
+-        }
+-
+-        /* Cap usage of highest cache level to the number of supported
+-           threads.  */
+-        if (shared > 0 && threads > 0)
+-          shared /= threads;
++	intel_bug_no_cache_info:
++	  /* Assume that all logical threads share the highest cache
++	     level.  */
++	  threads = ((cpu_features->features[CPUID_INDEX_1].cpuid.ebx >> 16)
++		     & 0xff);
++	}
++      /* Get per-thread size of highest level cache.  */
++      if (shared_per_thread > 0 && threads > 0)
++	shared_per_thread /= threads;
+     }
+ 
+   /* Account for non-inclusive L2 and L3 caches.  */
+   if (!inclusive_cache)
+     {
+-      if (threads_l2 > 0)
+-        core /= threads_l2;
++      long int core_per_thread = threads_l2 > 0 ? (core / threads_l2) : core;
++      shared_per_thread += core_per_thread;
+       shared += core;
+     }
+ 
+   *shared_ptr = shared;
++  *shared_per_thread_ptr = shared_per_thread;
+   *threads_ptr = threads;
+ }
+ 
+@@ -704,6 +704,7 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
+   int max_cpuid_ex;
+   long int data = -1;
+   long int shared = -1;
++  long int shared_per_thread = -1;
+   long int core = -1;
+   unsigned int threads = 0;
+   unsigned long int level1_icache_size = -1;
+@@ -724,6 +725,7 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
+       data = handle_intel (_SC_LEVEL1_DCACHE_SIZE, cpu_features);
+       core = handle_intel (_SC_LEVEL2_CACHE_SIZE, cpu_features);
+       shared = handle_intel (_SC_LEVEL3_CACHE_SIZE, cpu_features);
++      shared_per_thread = shared;
+ 
+       level1_icache_size
+ 	= handle_intel (_SC_LEVEL1_ICACHE_SIZE, cpu_features);
+@@ -747,13 +749,14 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
+       level4_cache_size
+ 	= handle_intel (_SC_LEVEL4_CACHE_SIZE, cpu_features);
+ 
+-      get_common_cache_info (&shared, &threads, core);
++      get_common_cache_info (&shared, &shared_per_thread, &threads, core);
+     }
+   else if (cpu_features->basic.kind == arch_kind_zhaoxin)
+     {
+       data = handle_zhaoxin (_SC_LEVEL1_DCACHE_SIZE);
+       core = handle_zhaoxin (_SC_LEVEL2_CACHE_SIZE);
+       shared = handle_zhaoxin (_SC_LEVEL3_CACHE_SIZE);
++      shared_per_thread = shared;
+ 
+       level1_icache_size = handle_zhaoxin (_SC_LEVEL1_ICACHE_SIZE);
+       level1_icache_linesize = handle_zhaoxin (_SC_LEVEL1_ICACHE_LINESIZE);
+@@ -767,13 +770,14 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
+       level3_cache_assoc = handle_zhaoxin (_SC_LEVEL3_CACHE_ASSOC);
+       level3_cache_linesize = handle_zhaoxin (_SC_LEVEL3_CACHE_LINESIZE);
+ 
+-      get_common_cache_info (&shared, &threads, core);
++      get_common_cache_info (&shared, &shared_per_thread, &threads, core);
+     }
+   else if (cpu_features->basic.kind == arch_kind_amd)
+     {
+       data  = handle_amd (_SC_LEVEL1_DCACHE_SIZE);
+       core = handle_amd (_SC_LEVEL2_CACHE_SIZE);
+       shared = handle_amd (_SC_LEVEL3_CACHE_SIZE);
++      shared_per_thread = shared;
+ 
+       level1_icache_size = handle_amd (_SC_LEVEL1_ICACHE_SIZE);
+       level1_icache_linesize = handle_amd (_SC_LEVEL1_ICACHE_LINESIZE);
+@@ -791,8 +795,11 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
+       __cpuid (0x80000000, max_cpuid_ex, ebx, ecx, edx);
+ 
+       if (shared <= 0)
+-	/* No shared L3 cache.  All we have is the L2 cache.  */
+-	shared = core;
++	{
++	  /* No shared L3 cache.  All we have is the L2 cache.  */
++	  shared = core;
++	  shared_per_thread = core;
++	}
+       else
+ 	{
+ 	  /* Figure out the number of logical threads that share L3.  */
+@@ -816,7 +823,7 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
+ 	  /* Cap usage of highest cache level to the number of
+ 	     supported threads.  */
+ 	  if (threads > 0)
+-	    shared /= threads;
++	    shared_per_thread /= threads;
+ 
+ 	  /* Get shared cache per ccx for Zen architectures.  */
+ 	  if (cpu_features->basic.family >= 0x17)
+@@ -827,12 +834,13 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
+ 	      __cpuid_count (0x8000001D, 0x3, eax, ebx, ecx, edx);
+ 
+ 	      unsigned int threads_per_ccx = ((eax >> 14) & 0xfff) + 1;
+-	      shared *= threads_per_ccx;
++	      shared_per_thread *= threads_per_ccx;
+ 	    }
+ 	  else
+ 	    {
+ 	      /* Account for exclusive L2 and L3 caches.  */
+ 	      shared += core;
++	      shared_per_thread += core;
+             }
+ 	}
+     }
+@@ -850,17 +858,46 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
+   cpu_features->level3_cache_linesize = level3_cache_linesize;
+   cpu_features->level4_cache_size = level4_cache_size;
+ 
+-  /* The default setting for the non_temporal threshold is 3/4 of one
+-     thread's share of the chip's cache. For most Intel and AMD processors
+-     with an initial release date between 2017 and 2020, a thread's typical
+-     share of the cache is from 500 KBytes to 2 MBytes. Using the 3/4
+-     threshold leaves 125 KBytes to 500 KBytes of the thread's data
+-     in cache after a maximum temporal copy, which will maintain
+-     in cache a reasonable portion of the thread's stack and other
+-     active data. If the threshold is set higher than one thread's
+-     share of the cache, it has a substantial risk of negatively
+-     impacting the performance of other threads running on the chip. */
+-  unsigned long int non_temporal_threshold = shared * 3 / 4;
++  /* The default setting for the non_temporal threshold is 1/4 of size
++     of the chip's cache. For most Intel and AMD processors with an
++     initial release date between 2017 and 2023, a thread's typical
++     share of the cache is from 18-64MB. Using the 1/4 L3 is meant to
++     estimate the point where non-temporal stores begin out-competing
++     REP MOVSB. As well the point where the fact that non-temporal
++     stores are forced back to main memory would already occurred to the
++     majority of the lines in the copy. Note, concerns about the
++     entire L3 cache being evicted by the copy are mostly alleviated
++     by the fact that modern HW detects streaming patterns and
++     provides proper LRU hints so that the maximum thrashing
++     capped at 1/associativity. */
++  unsigned long int non_temporal_threshold = shared / 4;
++
++  /* If the computed non_temporal_threshold <= 3/4 * per-thread L3, we most
++     likely have incorrect/incomplete cache info in which case, default to
++     3/4 * per-thread L3 to avoid regressions.  */
++  unsigned long int non_temporal_threshold_lowbound
++      = shared_per_thread * 3 / 4;
++  if (non_temporal_threshold < non_temporal_threshold_lowbound)
++    non_temporal_threshold = non_temporal_threshold_lowbound;
++
++  /* If no ERMS, we use the per-thread L3 chunking. Normal cacheable stores run
++     a higher risk of actually thrashing the cache as they don't have a HW LRU
++     hint. As well, their performance in highly parallel situations is
++     noticeably worse.  */
++  if (!CPU_FEATURE_USABLE_P (cpu_features, ERMS))
++    non_temporal_threshold = non_temporal_threshold_lowbound;
 +  /* SIZE_MAX >> 4 because memmove-vec-unaligned-erms right-shifts the value of
 +     'x86_non_temporal_threshold' by `LOG_4X_MEMCPY_THRESH` (4) and it is best
 +     if that operation cannot overflow. Minimum of 0x4040 (16448) because the
@@ -9120,7 +10690,7 @@ index e9f3382108..637b5a022d 100644
  
  #if HAVE_TUNABLES
    /* NB: The REP MOVSB threshold must be greater than VEC_SIZE * 8.  */
-@@ -915,8 +927,8 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
+@@ -915,8 +952,8 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
      shared = tunable_size;
  
    tunable_size = TUNABLE_GET (x86_non_temporal_threshold, long int, NULL);
@@ -9131,7 +10701,7 @@ index e9f3382108..637b5a022d 100644
      non_temporal_threshold = tunable_size;
  
    tunable_size = TUNABLE_GET (x86_rep_movsb_threshold, long int, NULL);
-@@ -931,14 +943,9 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
+@@ -931,14 +968,9 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
  
    TUNABLE_SET_WITH_BOUNDS (x86_data_cache_size, data, 0, SIZE_MAX);
    TUNABLE_SET_WITH_BOUNDS (x86_shared_cache_size, shared, 0, SIZE_MAX);
