Package: libc6
Version: 2.3.1-16
Severity: normal

Eric Agnew <agnew@geekhive.net> wrote:
> First, a bug report: I'm getting a segfault on grep 2.5.1 when grepping
> the edict file ( http://ftp.cc.monash.edu.au/pub/nihongo/edict.gz ):
>
>   egrep '^(.)(.)(.)\1\2\3 ' edict
> or:
>   grep '^\(.\)\(.\)\(.\)\1\2\3 ' edict
>
> both output 13 lines and the seg fault.  strace didn't seem to tell me
> anything, and I've never been able to figure out gdb, so.. hopefully
> someone will be able to reproduce it..  For reference, I'm running
> Linux (debian/unstable) on x86.

Thanks for the report.
Note that to reproduce the failure you probably have to be using
a UTF-8 locale.  The system I used happened to have fr_FR.UTF-8
installed, so I used that, even though the data in that file is
in Japanese.

On a system with x86 Linux debian/unstable (grep-2.5.1-4 and
libc6-2.3.1-16), I pared it down to this:

  $ printf pMik3KTIpNwK | recode /64 \
      | LC_ALL=fr_FR.UTF-8 /bin/grep -nE '^(.)(.)(.)\1\2\3 '
  Segmentation fault
  [Exit 139 (SIGSEGV)]

This also does it:

  $ grep totteringly edict|LC_ALL=fr_FR.UTF-8 /bin/grep -nE '^(.)(.)(.)\1\2\3 '
  Segmentation fault
  [Exit 139 (SIGSEGV)]

It looks like a problem in libc's re_exec function:

  $ LC_ALL=fr_FR.UTF-8 gdb /bin/grep
  (gdb) r -E '^(.)(.)(.)\1\2\3 ' k
  Starting program: /bin/grep -E '^(.)(.)(.)\1\2\3 ' k
  (no debugging symbols found)...(no debugging symbols found)...
  Program received signal SIGSEGV, Segmentation fault.
  0x400c9ad5 in re_exec () from /lib/libc.so.6
  (gdb)

But note that if you rebuild grep by running
`configure --with-included-regex' the resulting binary doesn't segfault.
It doesn't find any matches, either.  The same thing happens if I link
grep with the very latest regex code from glibc's CVS repository.
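For anyone without recode installed, the base64 payload in the pared-down
test case above can be inspected with a few lines of Python.  This is only
a sketch for looking at the input bytes, not part of the reproduction
itself.  The names PAYLOAD_B64 and inspect_payload are made up for
illustration, and reading the bytes as EUC-JP is an assumption (they happen
to decode cleanly in that encoding).

```python
import base64

# Base64 payload copied from the pared-down test case in the report.
PAYLOAD_B64 = "pMik3KTIpNwK"


def inspect_payload(b64_text):
    """Decode the payload; return (raw bytes, hex dump, is it valid UTF-8?)."""
    raw = base64.b64decode(b64_text)
    hex_dump = " ".join("%02x" % b for b in raw)
    try:
        raw.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return raw, hex_dump, valid_utf8


raw, hex_dump, valid_utf8 = inspect_payload(PAYLOAD_B64)
print(hex_dump)                      # a4 c8 a4 dc a4 c8 a4 dc 0a
print("valid UTF-8:", valid_utf8)    # valid UTF-8: False

# Assumption: the bytes are EUC-JP text.  ascii() keeps the output
# printable even on a terminal that cannot display Japanese.
print(ascii(raw.decode("euc_jp")))
```

If that reading is right, the test line is a two-character sequence
repeated twice and is not valid UTF-8, so under a UTF-8 locale the matcher
is being handed bytes it cannot interpret as characters.  That would fit
with the locale mattering, though it does not by itself show where in
libc the fault lies.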