
Bug#1037198: locales: please parallelise locale-gen



Package: locales
Version: 2.36-9
Severity: wishlist
Tags: patch

Dear Maintainer,

Posting as a bug per comment from Andrej; originally posted 2022-05-06 as
  https://salsa.debian.org/glibc-team/glibc/-/merge_requests/7

Patch based on current Salsa HEAD attached, incl. analysis.

Best,
наб

-- System Information:
Debian Release: 12.0
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: x32 (x86_64)
Foreign Architectures: amd64, i386

Kernel: Linux 6.1.0-2-amd64 (SMP w/2 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages locales depends on:
ii  debconf [debconf-2.0]  1.5.82
ii  libc-bin               2.36-9
ii  libc-l10n              2.36-9

locales recommends no packages.

locales suggests no packages.

-- debconf information:
* locales/locales_to_be_generated: en_GB.UTF-8 UTF-8
* locales/default_environment_locale: en_GB.UTF-8
From b6af0ad83f5517fd1987f9c7ac0493565bc0976d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=D0=BD=D0=B0=D0=B1?= <nabijaczleweli@nabijaczleweli.xyz>
Date: Fri, 6 May 2022 01:22:10 +0200
Subject: [PATCH] Parallelise locale-gen if possible
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Mutt-PGP: OS

Assuming a very generous 200M free per localedef (the max RSS I saw
with time(1) was 147M), this will attempt to keep all jobs saturated,
and usually succeed. There's little starvation, since the vast majority
of time is spent in gzip(1): 1:14 user vs 27:55 sys.
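
As a rough illustration of the cap described above, the following POSIX sh
sketch computes min(CPUs, MemFree / 200M). The function name job_cap and its
optional meminfo-path argument are hypothetical, added here only so the logic
can be exercised outside locale-gen; the patch itself inlines the same steps.

```shell
#!/bin/sh
# job_cap NPROC [MEMINFO]: print how many localedef jobs to run at once.
# Hypothetical helper; the 200 MB per-job budget comes from the commit
# message, the MEMINFO argument exists only to make the sketch testable.
job_cap() {
	cpus="$1"
	meminfo="${2:-/proc/meminfo}"
	free_kb=0
	if [ -r "$meminfo" ]; then
		# MemFree is reported in kB, e.g. "MemFree:  819200 kB"
		while read -r key value _; do
			if [ "$key" = "MemFree:" ]; then
				free_kb="$value"
				break
			fi
		done < "$meminfo"
	fi
	by_mem=$(( free_kb / 1024 / 200 ))   # kB -> MB -> 200 MB slots
	if [ "$by_mem" -lt 1 ]; then by_mem=1; fi
	if [ "$by_mem" -lt "$cpus" ]; then cpus="$by_mem"; fi
	echo "$cpus"
}

# Possible use: job_cap "$(nproc 2>/dev/null || echo 1)"
```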

At 2.2ish seconds per locale, even on a low-end system of today with
4 CPUs (and 800 free MB), we can generate up to 4 locales at once
for a 6.6s speed-up (2.2s instead of 8.8s). Assuming no
super-pathological cases, this globally scales in roughly
ceil(locales/ncpus)*2.2s chunks, which is a massive win.
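
The estimate can be worked through with integer shell arithmetic. This is
only a sketch of the ceil(locales/ncpus)*2.2s rule of thumb; the function
name estimate_seconds is hypothetical, and 2.2s is the approximate
per-locale figure quoted here, not a measured constant.

```shell
#!/bin/sh
# estimate_seconds LOCALES JOBS: print ceil(LOCALES/JOBS) * 2.2 seconds.
# Hypothetical helper; works in tenths of a second to avoid floating point.
estimate_seconds() {
	locales="$1"
	njobs="$2"
	rounds=$(( (locales + njobs - 1) / njobs ))   # integer ceiling
	tenths=$(( rounds * 22 ))                      # 2.2 s per round
	echo "$(( tenths / 10 )).$(( tenths % 10 ))"
}
```

With 4 locales this prints 8.8 for one job and 2.2 for four jobs, which is
where the 6.6s difference comes from.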

The only user-visible change is that, with nproc>1, the output is
  en_GB.UTF-8...
  <cursor here>
instead of
  en_GB.UTF-8... <cursor here, will print "done\n" when it's done>

MemFree: in /proc/meminfo is available on all supported Debian kernels,
and is, indeed, exactly what procps free(1) uses.
---
 debian/local/usr_sbin/locale-gen | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/debian/local/usr_sbin/locale-gen b/debian/local/usr_sbin/locale-gen
index 7fa3d772..f1632f4e 100755
--- a/debian/local/usr_sbin/locale-gen
+++ b/debian/local/usr_sbin/locale-gen
@@ -23,6 +23,18 @@ is_entry_ok() {
 	fi
 }
 
+nproc="$(nproc 2>/dev/null)" || nproc=1
+if [ "$nproc" -gt 1 ]; then
+	mem_free=0
+	while read -r k v _; do
+		[ "$k" = "MemFree:" ] && mem_free="$v" && break || :
+	done < /proc/meminfo || :
+	mem_free=$(( mem_free / 1024 / 200 ))
+	[ "$mem_free" -lt 1 ] && mem_free=1 || :
+	[ "$mem_free" -lt "$nproc" ] && nproc="$mem_free" || :
+	jobs=0; pids=
+fi 2>/dev/null
+
 echo "Generating locales (this might take a while)..."
 while read -r locale charset; do
 	if [ -z "$locale" ] || [ "${locale#\#}" != "$locale" ]; then continue; fi
@@ -35,6 +47,7 @@ while read -r locale charset; do
 	locale_at="${locale#*@}"
 	[ "$locale_at" = "$locale" ] && locale_at= || locale_at="@$locale_at"
 	printf "  %s.%s%s..." "$locale_base" "$charset" "$locale_at"
+	[ "$nproc" -gt 1 ] && echo || :
 
 	if [ -e "$USER_LOCALES/$locale" ]; then
 		input="$USER_LOCALES/$locale"
@@ -46,7 +59,20 @@ while read -r locale charset; do
 			input="$USER_LOCALES/$input"
 		fi
 	fi
-	localedef -i "$input" -c -f "$charset" -A /usr/share/locale/locale.alias "$locale" || :
-	echo " done"
+	localedef -i "$input" -c -f "$charset" -A /usr/share/locale/locale.alias "$locale" &
+	if [ "$nproc" -gt 1 ]; then
+		pids="$pids$! "
+		jobs=$(( jobs + 1 ))
+
+		if [ "$jobs" -ge "$nproc" ]; then
+			wait "${pids%% *}" || :
+			jobs=$(( jobs - 1 ))
+			pids="${pids#* }"
+		fi
+	else
+		wait
+		echo " done"
+	fi
 done < "$LOCALEGEN"
+wait
 echo "Generation complete."
-- 
2.30.2
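
Read in isolation, the queueing in the last hunk is a bounded job pool:
start each job in the background, and once the limit is reached, wait for
the oldest PID before starting another. A minimal sketch of that pattern
follows; run_bounded and its argument convention are hypothetical and not
part of the patch.

```shell
#!/bin/sh
# run_bounded MAX CMD...: hypothetical helper. Reads one argument per line
# from stdin and runs "CMD... ARG" in the background, at most MAX at once.
run_bounded() {
	max="$1"; shift
	running=0
	pids=
	while read -r arg; do
		"$@" "$arg" &
		pids="$pids$! "                  # space-separated FIFO of PIDs
		running=$(( running + 1 ))
		if [ "$running" -ge "$max" ]; then
			wait "${pids%% *}" || :      # block on the oldest job only
			pids="${pids#* }"            # and drop it from the queue
			running=$(( running - 1 ))
		fi
	done
	wait                                 # collect whatever is still running
}
```

Waiting on the oldest PID rather than on any finished job keeps the script
within POSIX sh; the cost is that a slow first job can leave a slot idle
for a while.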

Attachment: signature.asc
Description: PGP signature

