
Bug#854821: marked as done (iconv: behavior change with C.UTF-8)



Your message dated Thu, 8 Sep 2022 23:48:50 +0200
with message-id <YxpjQtWgIUETNCtl@aurel32.net>
and subject line Re: Transliteration in C.UTF-8 locales
has caused the Debian Bug report #854821,
regarding iconv: behavior change with C.UTF-8
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
854821: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=854821
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: libc-bin
Version: 2.24-9
Severity: normal

Dear Maintainer,

I am trying to track down the root cause of an FTBFS in phpmyadmin (e.g.,
https://launchpadlibrarian.net/301371947/buildlog_ubuntu-zesty-amd64.phpmyadmin_4%3A4.6.5.2-1_BUILDING.txt.gz),
which is due to a testcase failure at build time:

"iconv(): Detected an illegal character in input string"

The test in question is basically doing:

$ echo "This is the Euro symbol '€'" |iconv -f UTF-8 -t ISO-8859-1//TRANSLIT

Since the builders default to C.UTF-8, if one prefaces this with

$ export LC_ALL=C.UTF-8

in various environments, we get:

Yakkety (libc-bin == 2.24-3ubuntu2) produces:
This is the Euro symbol 'EUR'

Zesty (libc-bin == 2.24-7ubuntu2) produces:
This is the Euro symbol 'iconv: illegal input sequence at position 25

Stretch & Sid (libc-bin == 2.24-9) produce:
This is the Euro symbol 'iconv: illegal input sequence at position 25

Given that phpmyadmin did build in Sid (earlier), I'm guessing that on
the next rebuild of phpmyadmin, it will fail in the same way as Ubuntu.

If LC_ALL is set to POSIX, en_US.UTF-8, or C, the testcase passes in all
environments. I am not sure if this is due to the change back to
combining for transliteration in C.UTF-8, the update to Unicode 9, or a
combination of the two, but I think this behavior change was unintended.
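
To repeat the comparison across locales in one go, something like the
following minimal sketch could be used. It assumes an iconv binary on
the PATH and that the listed locales are available on the system; the
helper name try_translit is hypothetical and only serves this
illustration. It prints iconv's exit status and output per locale and
makes no claim about what a given glibc version will return.

```shell
#!/bin/sh
# Hypothetical helper: run the transliteration test under one locale and
# print the locale name, iconv's exit status, and everything it wrote
# (stdout and stderr combined).
try_translit() {
    loc=$1
    out=$(printf "This is the Euro symbol '€'\n" |
          LC_ALL=$loc iconv -f UTF-8 -t ISO-8859-1//TRANSLIT 2>&1)
    status=$?   # exit status of iconv, the last command in the pipeline
    printf '%s: exit=%s output=%s\n' "$loc" "$status" "$out"
}

# Locales mentioned in this report; adjust to whatever is installed.
for loc in C POSIX en_US.UTF-8 C.UTF-8; do
    try_translit "$loc"
done
```

A non-zero exit status for C.UTF-8 alongside zero for the other locales
would match the behavior described above.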

The following is from my reporting system (running Ubuntu), but I am
able to reproduce the issue in a Sid schroot, as mentioned.

-- System Information:
Debian Release: stretch/sid
  APT prefers yakkety-updates
  APT policy: (500, 'yakkety-updates'), (500, 'yakkety-security'), (500, 'yakkety'), (100, 'yakkety-backports')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.8.0-37-generic (SMP w/4 CPU cores)
Locale: LANG=en_CA.UTF-8, LC_CTYPE=en_CA.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages libc-bin depends on:
ii  libc6  2.24-3ubuntu2

libc-bin recommends no packages.

Versions of packages libc-bin suggests:
ii  manpages  4.07-1

-- no debconf information

-- 
Nishanth Aravamudan
Ubuntu Server
Canonical Ltd

--- End Message ---
--- Begin Message ---
Version: 2.31-5

Hi,

On 2017-03-31 21:05, Michal Čihař wrote:
> Hi
> 
> I was just forced to look at this again (see #859219) and I think the
> transliteration is not working as it should.
> 
> What is actually the reason to make it behave differently on C.UTF-8 than
> on other UTF-8 locales? Does it really have to be that either
> transliteration of "ç" is broken or transliteration of "€" is broken
> for this locale?
> 
> In most other UTF-8 locales (if not all, I've not tested this) both of
> them work just fine:
> 
> $ echo "ça va €" | LC_ALL=en_GB.UTF-8 iconv -f UTF-8 -t "ascii//TRANSLIT"
> ca va EUR
> $ echo "ça va €" | LC_ALL=de_DE.UTF-8 iconv -f UTF-8 -t "ascii//TRANSLIT"
> ca va EUR
> $ echo "ça va €" | LC_ALL=cs_CZ.UTF-8 iconv -f UTF-8 -t "ascii//TRANSLIT"
> ca va EUR
> $ echo "ça va €" | LC_ALL=C.UTF-8 iconv -f UTF-8 -t "ascii//TRANSLIT"
> ca va iconv: illegal input sequence at position 7

This has been fixed in glibc 2.31-5. Closing the bug accordingly.

Regards
Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net



--- End Message ---
