
Bug#265163: marked as done (locales: locale.alias aliases some names to unsupported locales)



Your message dated Wed, 11 Aug 2004 22:26:46 -0500
with message-id <20040812032646.GB28552@redwald.deadbeast.net>
and subject line closing
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere.  Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)

--------------------------------------
Received: (at submit) by bugs.debian.org; 12 Aug 2004 00:22:37 +0000
>From branden@redwald.deadbeast.net Wed Aug 11 17:22:37 2004
Return-path: <branden@redwald.deadbeast.net>
Received: from dhcp065-026-182-085.indy.rr.com (sisyphus.deadbeast.net) [65.26.182.85] 
	by spohr.debian.org with esmtp (Exim 3.35 1 (Debian))
	id 1Bv3Ma-0000dD-00; Wed, 11 Aug 2004 17:22:36 -0700
Received: by sisyphus.deadbeast.net (Postfix, from userid 1000)
	id 7762568C015; Wed, 11 Aug 2004 19:22:35 -0500 (EST)
Content-Type: multipart/mixed; boundary="===============1778653036=="
MIME-Version: 1.0
From: Branden Robinson <branden@debian.org>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: locales: locale.alias aliases some names to unsupported locales
X-Mailer: reportbug 2.64
Date: Wed, 11 Aug 2004 19:22:35 -0500
Message-Id: <20040812002235.7762568C015@sisyphus.deadbeast.net>
Delivered-To: submit@bugs.debian.org
X-Spam-Checker-Version: SpamAssassin 2.60-bugs.debian.org_2004_03_25 
	(1.212-2003-09-23-exp) on spohr.debian.org
X-Spam-Status: No, hits=-8.0 required=4.0 tests=BAYES_00,HAS_PACKAGE 
	autolearn=no version=2.60-bugs.debian.org_2004_03_25
X-Spam-Level: 

This is a multi-part MIME message sent by reportbug.

--===============1778653036==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Package: locales
Version: 2.3.2.ds1-15
Severity: normal
Tags: upstream

Some of the locale aliases in /etc/locale.alias map names to unsupported
locales.  Namely, "eucJP" and "eucKR" are not spelled the way
/usr/share/i18n/SUPPORTED spells them ("EUC-JP" and "EUC-KR"), and the
"SJIS" codeset isn't supported at all.

I'm attaching two files:

* A Python script I wrote that found this problem.
* A patch to correct the problem.  I corrected all but one problem; I had
  to drop the alias for "japanese.sjis", as adding support for the SJIS
  character set to glibc is beyond my ability, and I don't even know if
  that's a desirable solution.
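
The check itself can be sketched in a few lines.  This is a minimal
Python 3 illustration, not the attached script: it works on small
made-up excerpts instead of the real files, and the sample entries and
the helper names parse_pairs and unsupported_aliases are assumptions
chosen for the example.

```python
# Illustrative data only: a few entries written in the style of
# /usr/share/i18n/SUPPORTED ("locale codeset") and /etc/locale.alias
# ("alias target").  They are assumptions, not copies of the real files.
SUPPORTED_SAMPLE = """\
ja_JP.EUC-JP EUC-JP
ko_KR.EUC-KR EUC-KR
hu_HU ISO-8859-2
"""

ALIAS_SAMPLE = """\
# comment line
hungarian       hu_HU.ISO-8859-2
japanese        ja_JP.eucJP
japanese.sjis   ja_JP.SJIS
korean          ko_KR.eucKR
"""


def parse_pairs(text):
    """Return {first_field: second_field} for non-blank, non-comment lines."""
    pairs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        pairs[fields[0]] = fields[1]
    return pairs


def unsupported_aliases(supported, aliases):
    """Return the aliases whose target is not a supported locale name."""
    # Accept each supported name as listed, and also with its codeset
    # appended when the listed name does not already carry one.
    known = set(supported)
    for locale, codeset in supported.items():
        if "." not in locale:
            known.add(locale + "." + codeset)
    return {alias: target for alias, target in aliases.items()
            if target not in known}


bad = unsupported_aliases(parse_pairs(SUPPORTED_SAMPLE),
                          parse_pairs(ALIAS_SAMPLE))
for alias, target in sorted(bad.items()):
    print("%s -> %s" % (alias, target))
```

With this sample data the "hungarian" alias passes because its target
equals a supported name plus that name's codeset, while the three
remaining aliases are reported.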

Thanks for looking into this.

-- System Information:
Debian Release: 3.1
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: powerpc (ppc)
Kernel: Linux 2.4.25-powerpc-smp
Locale: LANG=C, LC_CTYPE=en_US.UTF-8

Versions of packages locales depends on:
ii  debconf                     1.4.30       Debian configuration management sy
ii  libc6 [glibc-2.3.2.ds1-15]  2.3.2.ds1-15 GNU C Library: Shared libraries an

-- debconf information:
* locales/default_environment_locale: None
* locales/locales_to_be_generated: en_US ISO-8859-1, en_US.ISO-8859-15 ISO-8859-15, en_US.UTF-8 UTF-8

--===============1778653036==
Content-Type: text/x-java; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="glibc_locale_audit"

#!/usr/bin/python

import os
import re

RUNTIME_DEBUG = True

# Build a dictionary of canonical locales according to the GNU C library.  The
# keys in this dictionary are the locale names, and the values are the character
# sets used by each locale name.

glibc_locales_canonical = { }
glibc_locale_file = open(os.path.join("/", "usr", "share", "i18n", "SUPPORTED"))

for line in glibc_locale_file.readlines():
    (left_side, right_side) = re.split(r'\s', line, 1)
    glibc_locales_canonical[(left_side.strip())] = right_side.strip()

glibc_locale_file.close()

if RUNTIME_DEBUG:
    print("Canonical glibc locales: %s" % (list(glibc_locales_canonical.keys()),))

glibc_locales_aliased = { }
glibc_alias_file = open(os.path.join("/", "etc", "locale.alias"))

for line in glibc_alias_file.readlines():
    # Ignore blank lines and lines beginning with a comment character ("#").
    if re.match(r'$', line) \
      or re.match(r'#', line):
        continue
    (left_side, right_side) = re.split(r'\s', line, 1)
    glibc_locales_aliased[(left_side.strip())] = right_side.strip()
    # glibc is a little weird; it aliases names to locale specifications
    # *including* the codeset, whereas it omits the codeset from the officially
    # supported list except when necessary for disambiguation purposes.
    # Consequently, if we don't find the alias's target in the canonical list,
    # we have to fall back to seeing if it is in the canonical list using the
    # same codeset that is explicitly stated.
    if right_side.strip() not in glibc_locales_canonical.keys():
        # Try harder to find it.
        goal_locale = right_side.strip()
        found = False
        for locale in glibc_locales_canonical.keys():
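            # Only append the codeset to names that do not already carry one.
            # (re.match anchors at the start of the string, so the old test
            # for a leading "." was true for every locale name.)
            if '.' not in locale: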
                locale_with_codeset = '.'.join([ locale,
                                               glibc_locales_canonical[locale] ])
                if goal_locale == locale_with_codeset:
                    found = True
                    break
        if not found:
            print("Warning: glibc bug: glibc locale %s is aliased to"
                  " non-canonical glibc locale %s"
                  % (left_side.strip(), right_side.strip()))

glibc_alias_file.close()

if RUNTIME_DEBUG:
    print("Aliased glibc locales: %s" % (list(glibc_locales_aliased.keys()),))

# vim:set ai et sts=4 sw=4 tw=80:

--===============1778653036==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="glibc_locale_alias.diff"

--- /etc/locale.alias.dpkg-dist	2004-08-11 19:15:44.000000000 -0500
+++ /etc/locale.alias	2004-08-11 19:17:57.000000000 -0500
@@ -49,14 +49,13 @@
 hungarian       hu_HU.ISO-8859-2
 icelandic       is_IS.ISO-8859-1
 italian         it_IT.ISO-8859-1
-japanese	ja_JP.eucJP
-japanese.euc	ja_JP.eucJP
-ja_JP		ja_JP.eucJP
-ja_JP.ujis	ja_JP.eucJP
-japanese.sjis	ja_JP.SJIS
-korean		ko_KR.eucKR
-korean.euc 	ko_KR.eucKR
-ko_KR		ko_KR.eucKR
+japanese	ja_JP.EUC-JP
+japanese.euc	ja_JP.EUC-JP
+ja_JP		ja_JP.EUC-JP
+ja_JP.ujis	ja_JP.EUC-JP
+korean		ko_KR.EUC-KR
+korean.euc 	ko_KR.EUC-KR
+ko_KR		ko_KR.EUC-KR
 lithuanian      lt_LT.ISO-8859-13
 norwegian       no_NO.ISO-8859-1
 nynorsk		nn_NO.ISO-8859-1

--===============1778653036==--

---------------------------------------
Received: (at 265163-done) by bugs.debian.org; 12 Aug 2004 03:26:47 +0000
>From branden@redwald.deadbeast.net Wed Aug 11 20:26:47 2004
Return-path: <branden@redwald.deadbeast.net>
Received: from dhcp065-026-182-085.indy.rr.com (sisyphus.deadbeast.net) [65.26.182.85] 
	by spohr.debian.org with esmtp (Exim 3.35 1 (Debian))
	id 1Bv6Ep-0003mi-00; Wed, 11 Aug 2004 20:26:47 -0700
Received: by sisyphus.deadbeast.net (Postfix, from userid 1000)
	id 5EFE168C015; Wed, 11 Aug 2004 22:26:46 -0500 (EST)
Date: Wed, 11 Aug 2004 22:26:46 -0500
From: Branden Robinson <branden@debian.org>
To: 265163-done@bugs.debian.org
Subject: closing
Message-ID: <20040812032646.GB28552@redwald.deadbeast.net>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="A6N2fC+uXW/VQSAv"
Content-Disposition: inline
User-Agent: Mutt/1.5.6+20040803i
Delivered-To: 265163-done@bugs.debian.org
X-Spam-Checker-Version: SpamAssassin 2.60-bugs.debian.org_2004_03_25 
	(1.212-2003-09-23-exp) on spohr.debian.org
X-Spam-Status: No, hits=-2.0 required=4.0 tests=BAYES_01 autolearn=no 
	version=2.60-bugs.debian.org_2004_03_25
X-Spam-Level: 


--A6N2fC+uXW/VQSAv
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Closing this bug because the maintainer A) didn't understand what I was
getting at and B) doesn't see a problem anyway.

*shrug*

-- 
G. Branden Robinson                |    It was a typical net.exercise -- a
Debian GNU/Linux                   |    screaming mob pounding on a greasy
branden@debian.org                 |    spot on the pavement, where used to
http://people.debian.org/~branden/ |    lie the carcass of a dead horse.

--A6N2fC+uXW/VQSAv
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)

iEYEARECAAYFAkEa43YACgkQ6kxmHytGonxWQQCfWqMoE5WYn+3Ya7aboQG1Mbyn
ex0AoJRftarzvqEZ10KDSxd+0XcObosr
=md+3
-----END PGP SIGNATURE-----

--A6N2fC+uXW/VQSAv--


