On August 18, 2004 at 2:57PM +0900,
miles (at lsi.nec.co.jp) wrote:

> Package: gawk
> Version: 1:3.1.4-1
>
> Executing the following line in a shell:
>
> echo -e '--- orig/lisp/ChangeLog\n+++ mod/lisp/ChangeLog' | LANG=ja_JP gawk '/[Cc]hangeLog/ { print }'
>
> yields not the expected two lines of output, but instead only the first one:
>
> --- orig/lisp/ChangeLog
>
>
> If the LANG-setting portion is changed to use C, then it works as
> expected (others such as "de" seem to work too):
>
> echo -e '--- orig/lisp/ChangeLog\n+++ mod/lisp/ChangeLog' | LANG=C gawk '/[Cc]hangeLog/ { print }'
>
> yields:
>
> --- orig/lisp/ChangeLog
> +++ mod/lisp/ChangeLog
>
>
> I'm not sure if the actual encoding has any impact -- ja_JP, ja_JP.utf8,
> and ja_JP.eucjp all exhibit the same problem.

ko_KR, zh_CN, and zh_TW also exhibit the same problem. On CJK
locales, this bug makes gawk scripts unusable.

Downgrading gawk to version 1:3.1.3-3 avoids the problem.

Could anyone fix this bug?

Thanks,
--
Tatsuya Kinoshita
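
For anyone who wants to check several locales at once, the test case above can be wrapped in a small loop. This is only a sketch: the `count_matches` helper and the `AWK` variable are names made up here, and `printf` replaces `echo -e` purely for portability. It assumes the listed locales are installed; an uninstalled locale normally falls back to C behaviour and so would not show the problem.

```shell
#!/bin/sh
# Sketch only: AWK and count_matches are hypothetical names, not part of gawk.
# Prefer gawk if present; set AWK explicitly to test a particular binary.
AWK="${AWK:-$(command -v gawk || command -v awk)}"

# Print how many of the two test lines match /[Cc]hangeLog/ under locale $1.
# Note: if LC_ALL is set in the environment, it overrides LANG.
count_matches() {
    printf '%s\n' '--- orig/lisp/ChangeLog' '+++ mod/lisp/ChangeLog' |
        LANG="$1" "$AWK" '/[Cc]hangeLog/ { n++ } END { print n + 0 }'
}

for loc in C ja_JP ja_JP.utf8 ja_JP.eucjp ko_KR zh_CN zh_TW; do
    echo "$loc: $(count_matches "$loc") of 2 lines matched"
done
```

A locale affected by the bug should report 1 of 2 lines matched instead of 2.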