
Bug#157086: marked as done (libc6: mbrtowc bug with incomplete wide characters)



Your message dated Sat, 28 Jul 2007 19:08:43 -0400
with message-id <20070728230843.GB26046@caradoc.them.org>
and subject line Bug#157086: status?
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere.  Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)

--- Begin Message ---
Package: libc6
Version: 2.2.5-13
Severity: normal
Tags: upstream

Some characters (in the Thai character set?) cannot be resumed if they are
partially parsed.  The problem can be reproduced as follows (with LC_ALL set to
en_US.UTF-8):

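#include <locale.h>
#include <string.h>
#include <wchar.h>

char *str4 = "\xe0\xb8\xb1";
int bar(char *str)
{
  mbstate_t ps;
  wchar_t wc;
  int j;
  memset (&ps, 0, sizeof(ps));
  /* glibc-internal field; 3584 == 0x0E00 */
  ps.__value.__wch = 3584;
  /* Feed the first byte alone, then the remaining two. */
  j = mbrtowc (&wc, str, 1, &ps);
  j = mbrtowc (&wc, str+1, 2, &ps);
  return j;
}
int main(int argc, char **argv, char **env)
{
  /* Pick up the locale from the environment (LC_ALL). */
  setlocale(LC_ALL, "");

  bar(str4);
}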

The character parses correctly from that shift state if the whole string is
given at once:

(gdb) p ps
$8 = {__count = 0, __value = {__wch = 3584, __wchb = "\0\016\0"}}
(gdb) p mbrtowc(&wc, str, 3, &ps)
$9 = 3
(gdb) p ps
$10 = {__count = 0, __value = {__wch = 3584, __wchb = "\0\016\0"}}
(gdb) p mbrtowc(&wc, str, 1, &ps)
$11 = -2
(gdb) p mbrtowc(&wc, str+1, 2, &ps)
$12 = -1


I don't know the sequence to reach that shift state, but it's in M. Kuhn's
UTF-8-demo.txt file, a standard UTF-8 test.
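
For comparison, here is a minimal sketch of the resumption that the C
standard describes for mbrtowc when the conversion state starts out zeroed
rather than hand-edited.  The helper names decode_in_two_chunks and
use_utf8_locale are hypothetical and not part of the report, and the locale
names tried are assumptions about what is installed.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode one multibyte character that arrives in two
   chunks, starting from a zeroed conversion state.  Returns the result of
   the second mbrtowc call, or the first result if the first chunk was not
   reported as incomplete. */
size_t decode_in_two_chunks(const char *s, size_t first, size_t rest,
                            wchar_t *wc)
{
  mbstate_t ps;
  size_t r;

  assert(s != NULL && wc != NULL);
  memset(&ps, 0, sizeof ps);          /* initial conversion state */
  r = mbrtowc(wc, s, first, &ps);
  if (r != (size_t)-2)
    return r;                         /* complete, invalid, or NUL */
  /* The partial character is held in ps; continue with the rest. */
  return mbrtowc(wc, s + first, rest, &ps);
}

/* Hypothetical helper: try a few common UTF-8 locale names.
   Returns 1 if one of them could be selected, 0 otherwise. */
int use_utf8_locale(void)
{
  return setlocale(LC_CTYPE, "C.UTF-8") != NULL
      || setlocale(LC_CTYPE, "en_US.UTF-8") != NULL;
}
```

In a UTF-8 locale, calling decode_in_two_chunks("\xe0\xb8\xb1", 1, 2, &wc)
should see (size_t)-2 from the first call and then complete the character
on the second, which is the behaviour the report found missing from the
state shown above.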

-- System Information:
Debian Release: testing/unstable
Architecture: i386
Kernel: Linux nevyn 2.4.19-pre10-ac2-drow #4 SMP Sun Jun 16 12:01:20 EDT 2002 i686
Locale: LANG=en_US, LC_CTYPE=

-- no debconf information



--- End Message ---
--- Begin Message ---
On Sat, Jul 28, 2007 at 07:21:02PM +0300, Touko Korpela wrote:
> Should this old glibc bug be closed, or is it still relevant?

Seems fixed.

-- 
Daniel Jacobowitz
CodeSourcery

--- End Message ---
