Bug#157086: marked as done (libc6: mbrtowc bug with incomplete wide characters)

To: Daniel Jacobowitz <drow@false.org>
Subject: Bug#157086: marked as done (libc6: mbrtowc bug with incomplete wide characters)
From: owner@bugs.debian.org (Debian Bug Tracking System)
Date: Sat, 28 Jul 2007 23:12:06 +0000
Message-id: <handler.157086.D157086.118566412626479.ackdone@bugs.debian.org>
References: <20070728230843.GB26046@caradoc.them.org> <E17gA9W-0002kl-00@nevyn.them.org>

Your message dated Sat, 28 Jul 2007 19:08:43 -0400
with message-id <20070728230843.GB26046@caradoc.them.org>
and subject line Bug#157086: status?
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere.  Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)

--- Begin Message ---

To: "Debian Bug Tracking System" <submit@bugs.debian.org>
Subject: libc6: mbrtowc bug with incomplete wide characters
From: "Daniel Jacobowitz" <dan@debian.org>
Date: Sat, 17 Aug 2002 16:26:29 -0400
Message-id: <E17gA9W-0002kl-00@nevyn.them.org>

Package: libc6
Version: 2.2.5-13
Severity: normal
Tags: upstream

Some characters (in the Thai character set?) cannot be resumed if they are
partially parsed.  The problem can be reproduced with the following program
(with LC_ALL set to en_US.UTF-8):

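#include <locale.h>
#include <string.h>
#include <wchar.h>

char *str4 = "\xe0\xb8\xb1";	/* UTF-8 encoding of U+0E31 (Thai) */

int bar(char *str)
{ 
  mbstate_t ps;
  wchar_t wc;
  int j;
  memset (&ps, 0, sizeof(ps));
  ps.__value.__wch = 3584;	/* glibc-internal field; the state seen in gdb */
  j = mbrtowc (&wc, str, 1, &ps);	/* first byte only: returns -2 (incomplete) */
  j = mbrtowc (&wc, str+1, 2, &ps);	/* remaining bytes: returns -1, expected 2 */
  return j;
}

int main(int argc, char **argv, char **env)
{ 
  setlocale(LC_ALL, "");

  bar(str4);
  return 0;
}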

The character parses correctly from that shift state if the whole string is
given at once:

(gdb) p ps
$8 = {__count = 0, __value = {__wch = 3584, __wchb = "\0\016\0"}}
(gdb) p mbrtowc(&wc, str, 3, &ps)
$9 = 3
(gdb) p ps
$10 = {__count = 0, __value = {__wch = 3584, __wchb = "\0\016\0"}}
(gdb) p mbrtowc(&wc, str, 1, &ps)
$11 = -2
(gdb) p mbrtowc(&wc, str+1, 2, &ps)
$12 = -1


I don't know the sequence to reach that shift state, but it's in M. Kuhn's
UTF-8-demo.txt file, a standard UTF-8 test.
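
For comparison, here is a minimal sketch of how a resumable decode is
expected to behave when the state starts out zeroed, rather than with the
__wch value shown in the gdb session.  The helper names decode_bytewise and
use_utf8_locale are made up for illustration, and the locale names tried
are assumptions about what is installed.  Per ISO C, mbrtowc should return
(size_t)-2 for each incomplete prefix and then the number of bytes that
completed the character.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: try a few UTF-8 locale names (assumed, may not be
   installed).  Returns 1 if one could be selected, 0 otherwise. */
static int use_utf8_locale(void)
{
  static const char *names[] = { "C.UTF-8", "en_US.UTF-8" };
  size_t i;

  for (i = 0; i < sizeof names / sizeof names[0]; i++)
    if (setlocale(LC_ALL, names[i]) != NULL)
      return 1;
  return 0;
}

/* Hypothetical helper: feed s to mbrtowc() one byte at a time.
   Returns the first decoded wide character, or -1 if the sequence is
   invalid or the input ends in the middle of a character. */
static long decode_bytewise(const char *s, size_t len)
{
  mbstate_t ps;
  wchar_t wc = 0;
  size_t i;

  assert(s != NULL);
  memset(&ps, 0, sizeof ps);	/* initial conversion state */
  for (i = 0; i < len; i++) {
    size_t r = mbrtowc(&wc, s + i, 1, &ps);
    if (r == (size_t)-2)
      continue;			/* incomplete: partial state kept in ps */
    if (r == (size_t)-1)
      return -1;		/* invalid sequence */
    return (long)wc;		/* character completed */
  }
  return -1;			/* ran out of input mid-character */
}
```

With a UTF-8 locale active, decode_bytewise("\xe0\xb8\xb1", 3) should yield
0x0E31 on a conforming implementation; the report above is that the split
call fails instead.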

-- System Information:
Debian Release: testing/unstable
Architecture: i386
Kernel: Linux nevyn 2.4.19-pre10-ac2-drow #4 SMP Sun Jun 16 12:01:20 EDT 2002 i686
Locale: LANG=en_US, LC_CTYPE=

-- no debconf information

--- End Message ---

--- Begin Message ---

To: Touko Korpela <tkorpela@phnet.fi>, 157086-done@bugs.debian.org
Subject: Re: Bug#157086: status?
From: Daniel Jacobowitz <drow@false.org>
Date: Sat, 28 Jul 2007 19:08:43 -0400
Message-id: <20070728230843.GB26046@caradoc.them.org>
In-reply-to: <20070728162102.GA5928@tiikeri.vuoristo.local>
References: <20070728162102.GA5928@tiikeri.vuoristo.local>

On Sat, Jul 28, 2007 at 07:21:02PM +0300, Touko Korpela wrote:
> Should this old glibc bug be closed, or is it still relevant?

Seems fixed.

-- 
Daniel Jacobowitz
CodeSourcery
--- End Message ---
