Re: match across line using grep
On 2010-08-03 09:57 -0700, Bob McGowan wrote:
> On 08/03/2010 05:39 AM, Andre Majorel wrote:
> > On 2010-08-03 19:37 +0800, Zhang Weiwu wrote:
> >> On 2010年08月03日 17:53, Andre Majorel wrote:
> >>>>> $ printf 'a\nb' | grep -zo a.*b
> >>>>>
> >>>>> (The above should output something /if/ -z would make egrep
> >>>>> not consider \n as string terminator. But it has produced no
> >>>>> output)
> >>>>
> >>> But grep -z does. This would seem to be an undocumented
> >>> limitation of -o.
> >>
> >> No it doesn't.
> >>
> >> $ printf 'a\nb' | grep -z 'a.*b'
> >> $
> >
> > You're welcome. What version of grep ?
>
> The -z "sort of" does/doesn't work for me. If I do this:
>
> $ perl -e 'print "a\nb\0"'| grep -z 'a.*b'
> $
$ printf 'a\nb\0'| grep -z 'a.*b'
a
b$ grep --version
GNU grep 2.5.3
Fun, eh? Maybe the answer is in there:
$ locale
LANG=
LC_CTYPE=en_US
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE=C
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
> There's no output. But change it like this:
>
> $ perl -e 'print "a\nb\0"'| grep -z 'a'
> a
> b$
>
> It found, and printed, the string containing the newline. I would suspect
> the regex engine is still honoring the '. (dot) does not match newline'
> convention but is OK with literals, if present.
My grep -z acts as if it used a regexp engine where "." matches
newline. Only when -o is in effect and the match contains a
newline is there no output. But the exit status is still good:
$ printf 'a\nb\0'| (grep -z 'a.*b' && printf 'st=%d chars=' $? >&2) | wc -c
st=0 chars=4
$ printf 'a\nb\0'| (grep -oz 'a.*b' && printf 'st=%d chars=' $? >&2) | wc -c
st=0 chars=0
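
For anyone who wants to repeat the comparison on another grep version,
here is a minimal sketch that prints the exit status and the output byte
count side by side. The helper name "probe" is made up for illustration,
and the sketch assumes a grep that accepts -z and -o (GNU grep does).
Results for the 'a.*b' patterns may well differ between versions, which
is the point of running it.

```shell
# probe: hypothetical helper, not part of grep.
# Feeds the NUL-terminated two-line record "a\nb" to grep with the given
# options and pattern, then reports grep's exit status and how many
# bytes it wrote to stdout.
probe() {
    st=0
    # First run: keep only the exit status.
    printf 'a\nb\0' | grep "$@" >/dev/null || st=$?
    # Second run: count the output bytes (tr strips the padding that
    # some wc implementations add).
    chars=$(printf 'a\nb\0' | grep "$@" | wc -c | tr -d ' ')
    printf 'st=%d chars=%d\n' "$st" "$chars"
}

probe -z 'a'               # literal only, no "." involved
probe -z 'a.*b'            # "." has to cross the newline
probe -oz 'a.*b'           # same pattern, with -o
probe -z 'a[[:space:]]b'   # newline matched by a bracket expression
```

If the last line matches while the 'a.*b' lines do not, that would
support the "dot does not match newline" theory. If only the -o line
comes back with chars=0, the problem is in -o.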
--
André Majorel <http://www.teaser.fr/~amajorel/>
No one ever sends you any email ? Report a bug in Debian !