Bug#525299: iconv accepts UTF-8-encoded UTF-16 surrogates



Package: libc6
Version: 2.9-7
Severity: normal

$ man utf-8 | grep -A 2 UTF-16 | sed -e 's/^ *//'
The UCS code values 0xd800–0xdfff (UTF-16 surrogates) as well as 0xfffe
and 0xffff (UCS non-characters) should not appear in  conforming  UTF-8
streams.

$ s='\xed\xa0\x88\xed\xbd\x85' # UTF-8-encoded UTF-16 surrogates 0xd808 + 0xdf45
$ for e in UTF-8 UTF-16 UTF-32 UCS-4
do
  printf "$e\t"
  printf "$s" | iconv -f UTF-8 -t "$e" > /dev/null && printf 'OK\n'
done
UTF-8	OK
UTF-16	iconv: illegal input sequence at position 0
UTF-32	iconv: illegal input sequence at position 0
UCS-4	OK
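
For reference, the two three-byte sequences in the test input can be
decoded by hand to confirm that they fall in the surrogate range
0xd800-0xdfff quoted from utf-8(7). This is only a minimal sketch in
POSIX shell arithmetic; decode3 and is_surrogate are hypothetical helper
names made up for illustration, not part of iconv or glibc.

```shell
# Hypothetical helpers for illustration only (not part of glibc or iconv).

# Decode one three-byte UTF-8 sequence, given as three hex bytes without
# the 0x prefix, and print the resulting code value.
decode3() {
  printf '0x%04x\n' $(( ((0x$1 & 0x0f) << 12) | ((0x$2 & 0x3f) << 6) | (0x$3 & 0x3f) ))
}

# Succeed if the given code value lies in the surrogate range 0xd800-0xdfff.
is_surrogate() {
  [ $(( $1 )) -ge $(( 0xd800 )) ] && [ $(( $1 )) -le $(( 0xdfff )) ]
}

for bytes in 'ed a0 88' 'ed bd 85'
do
  c=$(decode3 $bytes)   # unquoted on purpose: split into three arguments
  if is_surrogate "$c"; then
    printf '%s\tsurrogate\n' "$c"
  else
    printf '%s\tnot a surrogate\n' "$c"
  fi
done
```

Both sequences decode to the values noted next to the test string
(0xd808 and 0xdf45), and both are reported as surrogates, so a strict
UTF-8 decoder would be expected to reject the input whatever the target
encoding is.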

-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (900, 'unstable'), (500, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 2.6.26-1-686 (SMP w/2 CPU cores)
Locale: LANG=C, LC_CTYPE=pl_PL.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages libc6 depends on:
ii  libgcc1                       1:4.3.3-8  GCC support library

libc6 recommends no packages.

Versions of packages libc6 suggests:
ii  glibc-doc                     2.9-7      GNU C Library: Documentation
ii  libc6-i686                    2.9-7      GNU C Library: Shared libraries [i
ii  locales                       2.9-7      GNU C Library: National Language (

-- debconf information:
  glibc/upgrade: true
  glibc/disable-screensaver:
  glibc/restart-failed:
* glibc/restart-services:

--
Jakub Wilk


