
Bug#874160: marked as done (C.UTF-8 locales should be regarded like C w.r.t. $LANGUAGE precedence)



Your message dated Fri, 08 Sep 2023 20:50:10 +0000
with message-id <E1qeiQc-00DiYD-Ud@fasolo.debian.org>
and subject line Bug#874160: fixed in glibc 2.37-8
has caused the Debian Bug report #874160,
regarding C.UTF-8 locales should be regarded like C w.r.t. $LANGUAGE precedence
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
874160: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=874160
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: src:glibc
Version: 2.24-17
Severity: wishlist
Tags: patch

Hi!
Here's a simple patch set to change the default of setlocale(…, "") to
C.UTF-8.  This is a drastically smaller change than altering the meaning of
"C" to mean "C.UTF-8" that upstream is mulling over -- it affects only
programs that already have locale support, when the user fails to set any.

If none of LC_ALL, LC_CTYPE or LANG is set, instead of taking this to mean
"C" we assume "C.UTF-8".  This is explicitly allowed by POSIX (an
"implementation-defined default locale").  setlocale(…, "C") or not calling
it at all retain the old meaning[1].
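
For illustration only, here is a minimal sketch of the lookup order described
above, written as a standalone C helper rather than as the actual glibc code.
The names pick_locale_name, pick_locale_name_from_env and name_present_sketch
are hypothetical.  The order (LC_ALL, then the category variable, then LANG,
then the implementation-defined default) is assumed from POSIX; the fallback
argument stands in for that default, i.e. "C" before this patch set and
"C.UTF-8" after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A value counts as present only if it is set and non-empty. */
static int name_present_sketch(const char *name)
{
    return name != NULL && name[0] != '\0';
}

/* Hypothetical helper (not glibc code): pick the locale name that
   setlocale(category, "") is assumed to resolve, given the values of
   LC_ALL, the category variable (e.g. LC_CTYPE) and LANG.  */
const char *pick_locale_name(const char *lc_all,
                             const char *lc_category,
                             const char *lang,
                             const char *fallback)
{
    assert(fallback != NULL);
    if (name_present_sketch(lc_all))
        return lc_all;
    if (name_present_sketch(lc_category))
        return lc_category;
    if (name_present_sketch(lang))
        return lang;
    /* Nothing usable in the environment: use the default. */
    return fallback;
}

/* Same lookup, reading the real environment for one category variable. */
const char *pick_locale_name_from_env(const char *category_var,
                                      const char *fallback)
{
    assert(category_var != NULL);
    return pick_locale_name(getenv("LC_ALL"), getenv(category_var),
                            getenv("LANG"), fallback);
}
```

Calling pick_locale_name_from_env("LC_CTYPE", "C.UTF-8") with none of the
three variables set returns the fallback; with LC_ALL=C in the environment it
returns "C", whatever the other two variables say.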

This is the approach already taken by musl.

I'm not submitting this upstream first as C.UTF-8 is still a Debian-specific
thing.

The improvement would be: if for any reason the user fails to set the
locale, a daemon's startup script is too eager in clearing its environment,
a build chroot fails to inherit env vars, etc. -- in all of these cases we'll
fall back to a UTF-8 locale.  Making a locale-aware program use "C" is
still fully possible by setting LC_ALL=C, but we won't suffer from non-UTF-8
by omission.


This is mostly a one-line patch (1/3); the other two update the testsuite
(2/3) and alter hard-coded output of /usr/bin/locale (3/3).


Meow!

[1]. Making "C" behave like "C.UTF-8" would be, according to my reading,
compliant with both POSIX-2008@2016 and C11 except for a minor iswblank()
weirdness, but this is not a part of this change.
-- System Information:
Debian Release: buster/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'unstable'), (500, 'testing'), (150, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.13.0-rc7-debug-ubsan-00220-g92222baeac7d (SMP w/6 CPU cores)
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE=C.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: sysvinit (via /sbin/init)
From 92d9938c6ba813afaf854d7bc12a9dc0c71371c3 Mon Sep 17 00:00:00 2001
From: Adam Borowski <kilobyte@angband.pl>
Date: Sun, 3 Sep 2017 00:26:47 +0200
Subject: [PATCH 1/3] Default to C.UTF-8 on setlocale(..., "") if no env vars
 are set.

This doesn't affect programs that are not prepared to handle arbitrary
locales, as those either don't call setlocale() at all or use setlocale(...,
"C"); it affects only programs which would have used a proper locale had the
user set it up.

This provides a decent default when env var configuration is missing, in a
way that's more robust than mucking with login defs and daemon startup
scripts.

A default locale other than "C" is allowed by POSIX; also at least musl
uses an equivalent of C.UTF-8 already.
---
 locale/findlocale.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/locale/findlocale.c b/locale/findlocale.c
index 4cb9d5ea8a..2a12b4e808 100644
--- a/locale/findlocale.c
+++ b/locale/findlocale.c
@@ -123,8 +123,12 @@ _nl_find_locale (const char *locale_path, size_t locale_path_len,
 			    + _nl_category_name_idxs[category]);
       if (!name_present (cloc_name))
 	cloc_name = getenv ("LANG");
+      /* If no env vars are set, we're free to choose an
+         "implementation-defined default locale":
+         http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_02
+      */
       if (!name_present (cloc_name))
-	cloc_name = _nl_C_name;
+	cloc_name = "C.UTF-8";
     }
 
   /* We used to fall back to the C locale if the name contains a slash
-- 
2.14.1

From 612dc7f67f93882b7acb2f035b1cc200ceb2e153 Mon Sep 17 00:00:00 2001
From: Adam Borowski <kilobyte@angband.pl>
Date: Sun, 3 Sep 2017 03:43:10 +0200
Subject: [PATCH 2/3] Adjust the setlocale test suite for C.UTF-8 as default.

---
 localedata/bug-setlocale1.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/localedata/bug-setlocale1.c b/localedata/bug-setlocale1.c
index 546ea7beb8..2c86e2361d 100644
--- a/localedata/bug-setlocale1.c
+++ b/localedata/bug-setlocale1.c
@@ -39,9 +39,9 @@ do_test (void)
   if (d == NULL)
     return 1;
 
-  if (strcmp (d, "C") != 0)
+  if (strcmp (d, "C.UTF-8") != 0)
     {
-      puts ("*** LC_NUMERIC not C");
+      puts ("*** LC_NUMERIC not C.UTF-8");
       result = 1;
     }
 
-- 
2.14.1

From fb6cc4a418c6278dfc2dcf45bc1ea40e06ef9caf Mon Sep 17 00:00:00 2001
From: Adam Borowski <kilobyte@angband.pl>
Date: Sun, 3 Sep 2017 13:43:41 +0200
Subject: [PATCH 3/3] Change hard-coded value for "no defined vars" in
 /usr/bin/locale.

---
 locale/programs/locale.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/locale/programs/locale.c b/locale/programs/locale.c
index 9da3e5319f..131472766c 100644
--- a/locale/programs/locale.c
+++ b/locale/programs/locale.c
@@ -819,7 +819,7 @@ show_locale_vars (void)
 	  print_assignment (name,
 			    lcall[0] != '\0' ? lcall
 			    : lang[0] != '\0' ? lang
-			    : "POSIX",
+			    : "C.UTF-8",
 			    true);
 	else
 	  print_assignment (name, val, false);
-- 
2.14.1


--- End Message ---
--- Begin Message ---
Source: glibc
Source-Version: 2.37-8
Done: Aurelien Jarno <aurel32@debian.org>

We believe that the bug you reported is fixed in the latest version of
glibc, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 874160@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Aurelien Jarno <aurel32@debian.org> (supplier of updated glibc package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@ftp-master.debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Format: 1.8
Date: Fri, 08 Sep 2023 20:39:29 +0200
Source: glibc
Architecture: source
Version: 2.37-8
Distribution: unstable
Urgency: medium
Maintainer: GNU Libc Maintainers <debian-glibc@lists.debian.org>
Changed-By: Aurelien Jarno <aurel32@debian.org>
Closes: 874160 1050592
Changes:
 glibc (2.37-8) unstable; urgency=medium
 .
   [ Samuel Thibault ]
   * debian/libc0.3.symbols.hurd-i386: Update symbols.
   * debian/patches/hurd-i386/git-jemalloc.diff: Add support for static TSD
     data.
   * debian/patches/hurd-i386/git-jemalloc2.diff: Initialize ___pthread_self
     early.
   * debian/patches/hurd-i386/git-error_t.diff: Make error_t an int on C++.
   * debian/patches/hurd-i386/git-tls_dtors.diff: Fix TLS destructors.
   * debian/patches/hurd-i386/git-main_stack.diff: Fix stack information for main
     thread.
 .
   [ Aurelien Jarno ]
   * debian/patches/local-disable-tst-bz29951.diff: removed, obsolete.
   * debian/patches/any/git-c-utf-8-language.diff: backport support from
     upstream to treat C.<encoding> locale like C locale.  Closes: #874160.
   * debian/patches/git-updates.diff: update from upstream stable branch:
     - Fix the value of F_GETLK/F_SETLK/F_SETLKW with __USE_FILE_OFFSET64 on
       ppc64el.  Closes: #1050592.
     - debian/patches/hurd-i386/git-exception-long.diff: upstreamed.
Checksums-Sha1:
 281c5bcc99ab244917948931b1fcea2542cc713c 8959 glibc_2.37-8.dsc
 b6974d62862ea092732c0ff897265d2a5f6944bb 399620 glibc_2.37-8.debian.tar.xz
 91fdbff276dc463895ac7a2f1f579a0df384e412 9641 glibc_2.37-8_source.buildinfo
Checksums-Sha256:
 f5dcb3ed9d8a6a1bc207c5c1f5f4c64b0550fc5f5c5e0eac947e3c3eaea7b6e9 8959 glibc_2.37-8.dsc
 107e483c57ab96d13f2b705d10daf86efca8fd9585737af5413babbfa9a2e258 399620 glibc_2.37-8.debian.tar.xz
 950a0889d30edd24f7f1b4443c239af95ab4fe5d648d1740dafbc2e38a1f3b37 9641 glibc_2.37-8_source.buildinfo
Files:
 1608b13380b10b27932786bf713c1d35 8959 libs required glibc_2.37-8.dsc
 9a7049b772ed9a634aa49ddb20d891fd 399620 libs required glibc_2.37-8.debian.tar.xz
 09fa37476a29a4241b49c0c4f5896a2b 9641 libs required glibc_2.37-8_source.buildinfo

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEUryGlb40+QrX1Ay4E4jA+JnoM2sFAmT7hYgACgkQE4jA+Jno
M2seDQ//SkMioOndYvmLqsLBUddaKPquF4POn4LQVxWyY2Zz0TFs1X3C1axfEAQX
Dx9QAItx+/2TausxZYvMIBz8BIfr/LXIbQ7ORSjOi+olT5IAP6Gmh+8pCnv5IvES
9tFL+0KpedABBmCsKppMgg6sJZAv5dBVqrIYy7pAGfkUdIA5nMMHO4GYgX9kWDf7
F05yRT3eAioM6bjrz+WXlMmX0DY7D7wtERG+t0ky0OEtsaLonQqN/2sqKrtO66Gl
n72IsL826RXLKJwA0jBmwTssntiLrpM5egWOWQfXPmx9v4BoH7hx/fElw9PUHhEJ
pNMyfhfCo+SvlC/8acmlarswDgTC75PNzf5MrbWioMoVEALTsz2b43TjRzyFijWD
QwcuFbMJLBIiXED70w2a8r+9EB/+vXdq5gLDKZs9j7yf8t6LYrHvvOlJwbyQ7Ihl
VPhYd6OTgwCUF9i9q+XFlnfyl8Ayq/yw6YAwsxLpnAVCv1W3OU1vR1UAauQ5vDA1
1CLaSXR4V0goSuCssz1ChtBVXJjqFbjPv0q2WCRcpjiprJoSc+G9p9BeeCieZdkM
h+l2iubjmTVY94eW0xLXebC0O5jiHiDpwXiXgwyx3codPJoSzcgQ42QZWvElf/8L
JK4Oh5AeTKYVxTcHPPeItaSkEq+kyFzBIhbXzuCPYVTdvWpGYV0=
=XSxH
-----END PGP SIGNATURE-----

--- End Message ---
