Bug#588990: libc-bin: iconv -l doesn't indicate aliases

To: 588990@bugs.debian.org
Subject: Bug#588990: libc-bin: iconv -l doesn't indicate aliases
From: Neil Mayhew <neil_mayhew@users.sourceforge.net>
Date: Fri, 30 Jul 2010 18:14:59 -0600
Message-id: <4C536B03.6090405@users.sourceforge.net>
Reply-to: Neil Mayhew <neil_mayhew@users.sourceforge.net>, 588990@bugs.debian.org
In-reply-to: <20100727031558.GA9338@volta.aurel32.net>
References: <20100714041833.11375.44402.reportbug@localhost> <20100714095356.GM18814@hall.aurel32.net> <20100727031558.GA9338@volta.aurel32.net>

On 2010-07-26 9:15 PM Aurelien Jarno wrote:

On Wed, Jul 14, 2010 at 11:53:56AM +0200, Aurelien Jarno wrote:
You have to be more specific about the problem; I don't see any change between the glibc-based version and the eglibc-based version besides a few more supported encodings.
glibc and eglibc don't differ on the iconv code.

I checked, and it seems I was getting confused between GNU libiconv <http://www.gnu.org/software/libiconv/> and the glibc/eglibc implementation of iconv.


GNU libiconv outputs the following from iconv -l, for example:

ISO-10646-UCS-2 UCS-2 CSUNICODE
UCS-2BE UNICODE-1-1 UNICODEBIG CSUNICODE11
UCS-2LE UNICODELITTLE
ISO-10646-UCS-4 UCS-4 CSUCS4

This makes it clear which names are equivalent. The glibc/eglibc iconv just outputs these on separate lines. If it were possible to provide the libiconv functionality, maybe using an additional option to iconv, that would be helpful.
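
As a rough illustration of why the grouped listing is useful, here is a minimal Python sketch that turns libiconv-style output (one line per encoding, names separated by spaces) into an alias lookup. The function name parse_alias_groups and the choice to treat the first name on each line as the canonical one are assumptions made for illustration only; the sample text is the excerpt quoted above.

```python
def parse_alias_groups(listing):
    """Map every name in a libiconv-style listing to the first name on its line."""
    canonical = {}
    for line in listing.splitlines():
        names = line.split()
        if not names:
            continue  # skip blank lines
        for name in names:
            # Assumption: the first name on a line stands for the whole group.
            canonical[name.upper()] = names[0]
    return canonical


SAMPLE = """\
ISO-10646-UCS-2 UCS-2 CSUNICODE
UCS-2BE UNICODE-1-1 UNICODEBIG CSUNICODE11
UCS-2LE UNICODELITTLE
ISO-10646-UCS-4 UCS-4 CSUCS4
"""

aliases = parse_alias_groups(SAMPLE)
print(aliases["CSUNICODE"])   # ISO-10646-UCS-2
print(aliases["UNICODEBIG"])  # UCS-2BE
```

The same lookup cannot be built from a listing that prints every name on its own line, because the grouping information is simply not there.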

The bigger issue, however, is that glibc's iconv doesn't document what the various encoding names mean, *anywhere*. Something like CP1149 can be Googled and found in places like Wikipedia, but a name like "UNICODE" is very ambiguous, and odd names like "CSUNICODE" don't return anything very obvious in Google searches. In fact, the best description I found was in the documentation for an entirely different library, recode <http://www.delorie.com/gnu/docs/recode/recode_30.html>. I think (e)glibc should do its own documentation and not rely on other sources.

GNU libiconv is slightly better, because the output from iconv -l explains what CSUNICODE means by showing that it's the same as a well-defined, unambiguous encoding (ISO-10646-UCS-2).

However, neither library explains byte order anywhere. I can get BE or LE by specifying it explicitly in the encoding name, but typically I need to get native and I don't want to have to do a runtime test for endianness and then add it to the encoding name. How was I supposed to know that UCS-2 means "native byte order" rather than some canonical ordering such as big-endian? Different iconv implementations actually differ on this. On Mac OS X on Intel, with either the system iconv or the MacPorts version of GNU libiconv, UCS-2 actually means big-endian:


$ echo -ne '\xe2\x80\xa2' | iconv -f utf-8 -t ucs-2 | xxd
0000000: 2022

Running the same on Linux returns:
0000000: 2220

So if it's interpreted differently by different libraries, even though they all implement the same standard, shouldn't the behaviour on Linux be documented somewhere?
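
To make the difference concrete: the two dumps above are the same code point, U+2022, serialised in opposite byte orders. The Python sketch below reproduces both byte sequences and shows the kind of runtime workaround mentioned earlier, namely picking an explicit UCS-2BE or UCS-2LE name from the host's byte order. The helper name native_ucs2_name is made up for illustration, and the code uses Python's struct module rather than iconv itself, so it says nothing about what any particular iconv does with plain "UCS-2".

```python
import struct
import sys

BULLET = 0x2022  # U+2022, which UTF-8 encodes as e2 80 a2

big = struct.pack(">H", BULLET)     # one 16-bit unit, big-endian
little = struct.pack("<H", BULLET)  # one 16-bit unit, little-endian
print(big.hex(), little.hex())      # 2022 2220


def native_ucs2_name():
    """Hypothetical helper: explicit encoding name matching the host byte order."""
    return "UCS-2LE" if sys.byteorder == "little" else "UCS-2BE"


print(native_ucs2_name())
```

Passing the explicit name to iconv -t sidesteps the ambiguity, at the cost of exactly the endianness check that a documented "native" name would make unnecessary.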

Any news about that?

Sorry for the delay. My email address forwards to Gmail, which put both of your messages in the spam folder :-( Normally, Gmail's spam detection is excellent, so I don't bother to check it very often.


--Neil
