
Bug#395817: iconv fails for non-ASCII characters in (seemingly all) ISO-8859-# charsets



Package: libc6
Version: 2.3.2.ds1-22
Severity: major

iconv fails to convert an apostrophe from ISO-8859-1 to UTF-8.  The
conversion works if the input encoding is specified as cp1251.

ISO-8859-1 is perhaps the most widespread single-byte encoding; iconv
_must_ work with it.  Therefore I set severity to `major'.

To check:

$ iconv --from iso-8859-1 --to utf-8 iso-8859-1-test
$ iconv --from cp1251     --to utf-8 iso-8859-1-test

Or:

$ iconv --from iso-8859-1 --to utf-8 iso-8859-1-test | wc -c
$ iconv --from cp1251     --to utf-8 iso-8859-1-test | wc -c

(I get 81 and 82 respectively.  With ISO-8859-1 the apostrophe is not
touched and remains as an invalid UTF-8 sequence.  With cp1251 it is
converted to a valid UTF-8 apostrophe, as expected.)
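
For comparison, here is a minimal sketch that repeats the two conversions
with Python's built-in codecs rather than glibc's iconv.  It rests on two
assumptions that are not confirmed by this report: that the apostrophe in
the attached file is the single byte 0x92, and that the file is the test
sentence followed by one newline.  SAMPLE and convert() are names made up
for this illustration.

```python
# Assumption: the apostrophe in the attached test file is the byte 0x92,
# and the file is the sentence below followed by a single newline.
SAMPLE = (b"Because he\x92s already safe in the corner, "
          b"Black can continue to build influence.\n")


def convert(data: bytes, source_charset: str) -> bytes:
    """Decode data from source_charset and re-encode it as UTF-8."""
    return data.decode(source_charset).encode("utf-8")


for charset in ("iso-8859-1", "cp1251"):
    out = convert(SAMPLE, charset)
    mapped = b"\x92".decode(charset)  # code point that byte 0x92 maps to
    print(f"{charset}: {len(out)} bytes, 0x92 -> U+{ord(mapped):04X}")

# Prints:
# iso-8859-1: 81 bytes, 0x92 -> U+0092
# cp1251: 82 bytes, 0x92 -> U+2019
```

Under those assumptions the byte counts match the 81 and 82 reported
above.  The one-byte difference comes from where 0x92 lands: Python's
ISO-8859-1 codec maps it to U+0092 (two bytes in UTF-8), while its cp1251
codec maps it to U+2019, the right single quotation mark (three bytes).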

The apostrophe is considered an illegal character in ISO-8859-1.  The
same behavior can be seen in gedit and Emacs.  Other ISO-8859-#
charsets behave the same way.

Kernel version: 2.6.8-3-k7

Test case is attached.

Because he?s already safe in the corner, Black can continue to build influence.
