Your message dated Sun, 22 Apr 2007 16:53:22 +0200
with message-id <20070422145322.GA11465@volta.aurel32.net>
and subject line Bug#216512: workaround for libc crashes on incomplete multibyte chars
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere. Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)

--- Begin Message ---
- To: Eric Agnew <agnew@geekhive.net>, submit@bugs.debian.org
- Cc: bug-gnu-utils@gnu.org
- Subject: libc6: regex (re_exec) segfault in UTF-8 locale [Re: grep 2.5.1 segfault, and (more) color patch (again)
- From: Jim Meyering <jim@meyering.net>
- Date: Mon, 07 Apr 2003 14:58:20 +0200
- Message-id: <85u1dao45v.fsf@pi.meyering.net>
- In-reply-to: <20030407055622.GE23827@goku.geekhive.net> (Eric Agnew's message of "Sun, 6 Apr 2003 22:56:23 -0700")
- References: <20030407055622.GE23827@goku.geekhive.net>

Package: libc6
Version: 2.3.1-16
Severity: normal

Eric Agnew <agnew@geekhive.net> wrote:
> First, a bug report: I'm getting a segfault on grep 2.5.1 when grepping
> the edict file ( http://ftp.cc.monash.edu.au/pub/nihongo/edict.gz ):
>
> egrep '^(.)(.)(.)\1\2\3 ' edict
> or:
> grep '^\(.\)\(.\)\(.\)\1\2\3 ' edict
>
> both output 13 lines and the seg fault. strace didn't seem to tell me
> anything, and I've never been able to figure out gdb, so.. hopefully
> someone will be able to reproduce it.. For reference, I'm running
> Linux (debian/unstable) on x86.

Thanks for the report.
Note that to reproduce the failure you probably have to be using
a UTF-8 locale. The system I used happened to have fr_FR.UTF-8 installed,
so I used that, even though the data in that file is in Japanese.

On a system with x86 Linux debian/unstable (grep-2.5.1-4 and libc6-2.3.1-16),
I pared it down to this:

  $ printf pMik3KTIpNwK | recode /64 \
      | LC_ALL=fr_FR.UTF-8 /bin/grep -nE '^(.)(.)(.)\1\2\3 '
  Segmentation fault
  [Exit 139 (SIGSEGV)]

This also does it:

  $ grep totteringly edict|LC_ALL=fr_FR.UTF-8 /bin/grep -nE '^(.)(.)(.)\1\2\3 '
  Segmentation fault
  [Exit 139 (SIGSEGV)]

It looks like a problem in libc's re_exec function:

  $ LC_ALL=fr_FR.UTF-8 gdb /bin/grep
  (gdb) r -E '^(.)(.)(.)\1\2\3 ' k
  Starting program: /bin/grep -E '^(.)(.)(.)\1\2\3 ' k
  (no debugging symbols found)...(no debugging symbols found)...
  Program received signal SIGSEGV, Segmentation fault.
  0x400c9ad5 in re_exec () from /lib/libc.so.6
  (gdb)

But note that if you rebuild grep by running
`configure --with-included-regex' the resulting binary doesn't segfault.
It doesn't find any matches, either. The same thing happens if I link
grep with the very latest regex code from glibc's CVS repository.

Attachment: pgp_NvTJjakyX.pgp
Description: PGP signature
--- End Message ---
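
The pared-down reproducer in the message above pipes a short Base64 string through `recode /64`, which appears to strip the Base64 encoding, before handing the bytes to grep under a UTF-8 locale. The following is a minimal sketch, not part of the original report, that decodes the same string in Python and checks whether the result is well-formed UTF-8. The helper names `decode_payload` and `is_valid_utf8` are illustrative, and reading the bytes as EUC-JP is an assumption based on the report saying the data is Japanese.

```python
import base64

# Payload copied verbatim from the reproducer in the report.
PAYLOAD = "pMik3KTIpNwK"


def decode_payload(payload: str = PAYLOAD) -> bytes:
    """Return the raw bytes that the reproducer feeds to grep."""
    return base64.b64decode(payload)


def is_valid_utf8(data: bytes) -> bool:
    """Report whether the byte string is well-formed UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    raw = decode_payload()
    # Hex dump of the input, one byte per group.
    print(raw.hex(" "))
    # The first byte (0xa4) is a UTF-8 continuation byte with no lead
    # byte before it, so the input is not valid UTF-8.
    print("valid UTF-8:", is_valid_utf8(raw))
    # Assumption: the data is EUC-JP encoded Japanese text.
    print("as EUC-JP:", raw.decode("euc_jp").rstrip("\n"))
```

If that reading is right, the crash was triggered by input that is not valid UTF-8 being matched in a UTF-8 locale, which would fit the "incomplete multibyte chars" wording in the subject of the closing message.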
--- Begin Message ---
- To: 216512-done@bugs.debian.org
- Subject: Bug#216512: workaround for libc crashes on incomplete multibyte chars
- From: Aurelien Jarno <aurelien@aurel32.net>
- Date: Sun, 22 Apr 2007 16:53:22 +0200
- Message-id: <20070422145322.GA11465@volta.aurel32.net>
- In-reply-to: <20040407145450.GA25487@konishi>
- References: <20040407145450.GA25487@konishi>

Version: 2.3.6.ds1-13

I am able to reproduce the problem with sarge's glibc, but not with the
etch one. I think the bug is fixed, and I am closing it with this mail.

-- 
  .''`.  Aurelien Jarno             | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net
--- End Message ---