On August 18, 2004 at 2:57PM +0900,
miles (at lsi.nec.co.jp) wrote:
> Package: gawk
> Version: 1:3.1.4-1
> Executing the following line in a shell:
>
> echo -e '--- orig/lisp/ChangeLog\n+++ mod/lisp/ChangeLog' | LANG=ja_JP gawk '/[Cc]hangeLog/ { print }'
>
> yields not the expected two lines of output, but instead only the first one:
>
> --- orig/lisp/ChangeLog
>
>
> If the LANG-setting portion is changed to use C, then it works as
> expected (others such as "de" seem to work too):
>
> echo -e '--- orig/lisp/ChangeLog\n+++ mod/lisp/ChangeLog' | LANG=C gawk '/[Cc]hangeLog/ { print }'
>
> yields:
>
> --- orig/lisp/ChangeLog
> +++ mod/lisp/ChangeLog
>
>
> I'm not sure if the actual encoding has any impact -- ja_JP, ja_JP.utf8,
> and ja_JP.eucjp all exhibit the same problem.
ko_KR, zh_CN, and zh_TW exhibit the same problem. In CJK
locales, this bug makes gawk scripts unusable.
Downgrading gawk to version 1:3.1.3-3 prevents the problem.
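For anyone who wants to check several locales in one go, a loop along
these lines should do. It is only a sketch: it assumes the listed locales
are installed (a missing locale typically falls back to C and will not
show the bug), and the AWK variable and count_matches function are names
made up here for illustration.

```shell
#!/bin/sh
# Sketch: count how many of the two test lines match under each locale.
# A correct awk should report 2; the report above sees only 1 line
# matched with gawk 1:3.1.4-1 in the CJK locales.
# AWK and count_matches are illustrative names, not part of gawk.
# Note: if LC_ALL is set in the environment, it overrides LANG.
AWK=${AWK:-gawk}

count_matches() {
    # $1 is the locale name to test
    printf '%s\n' '--- orig/lisp/ChangeLog' '+++ mod/lisp/ChangeLog' |
        LANG=$1 "$AWK" '/[Cc]hangeLog/ { n++ } END { print n + 0 }'
}

for loc in C ja_JP ko_KR zh_CN zh_TW; do
    echo "$loc: $(count_matches "$loc") of 2 lines matched"
done
```

Running it once with the 1:3.1.3-3 package and once with 1:3.1.4-1
should show whether the count differs between the two versions.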
Could anyone fix this bug?
Thanks,
--
Tatsuya Kinoshita