
Re: match across line using grep



On 2010-08-03 09:57 -0700, Bob McGowan wrote:
> On 08/03/2010 05:39 AM, Andre Majorel wrote:
> > On 2010-08-03 19:37 +0800, Zhang Weiwu wrote:
> >> On 2010-08-03 17:53, Andre Majorel wrote:
> >>>>> $ printf 'a\nb' | grep -zo a.*b
> >>>>>
> >>>>> (The above should output something /if/ -z would make egrep
> >>>>> not consider \n as string terminator. But it has produced no
> >>>>> output)
> >>>>     
> >>> But grep -z does. This would seem to be an undocumented
> >>> limitation of -o.
> >>
> >> No it doesn't.
> >>
> >> $ printf 'a\nb' | grep -z 'a.*b'
> >> $
> > 
> > You're welcome. What version of grep ?
> 
> The -z "sort of" does/doesn't work for me.  If I do this:
> 
> $ perl -e 'print "a\nb\0"'| grep -z 'a.*b'
> $

  $ printf 'a\nb\0'| grep -z 'a.*b'
  a
  b$ grep --version
  GNU grep 2.5.3

Fun, eh ? Maybe the answer is in there :

  $ locale
  LANG=
  LC_CTYPE=en_US
  LC_NUMERIC="POSIX"
  LC_TIME="POSIX"
  LC_COLLATE=C
  LC_MONETARY="POSIX"
  LC_MESSAGES="POSIX"
  LC_PAPER="POSIX"
  LC_NAME="POSIX"
  LC_ADDRESS="POSIX"
  LC_TELEPHONE="POSIX"
  LC_MEASUREMENT="POSIX"
  LC_IDENTIFICATION="POSIX"
  LC_ALL=
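One way to check whether the locale matters is to run the same test under different LC_ALL settings. This is only a sketch: try_locales is a made-up helper name, and en_US.UTF-8 is an assumption about which locales are installed (if it is missing, grep should just fall back to the C locale).

```shell
# Hypothetical helper: run the same grep -z test under two locales and
# print how many bytes each run produced.
try_locales() {
    for loc in C en_US.UTF-8; do
        # Count the bytes grep writes; strip the padding some wc versions add.
        n=$(printf 'a\nb\0' | LC_ALL=$loc grep -z 'a.*b' | wc -c | tr -d ' ')
        printf '%s: %s bytes\n' "$loc" "$n"
    done
}

try_locales
```

If the two byte counts differ, the locale is a likely suspect. If they are the same, the difference is more likely down to the grep version.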

> There's no output.  But change it like this:
> 
> $ perl -e 'print "a\nb\0"'| grep -z 'a'
> a
> b$
> 
> It found, and printed, the newline-containing string.  I would suspect
> the regex engine is still honoring the '. (dot) does not match newline'
> convention but is OK with literals, if present.

My grep -z behaves as if it used a regexp engine where "." matches
newline. There is no output only when -o is in effect and the match
contains a newline. But the exit status is still good:

  $ printf 'a\nb\0'| (grep -z 'a.*b' && printf 'st=%d chars=' $? >&2) | wc -c
  st=0 chars=4
  $ printf 'a\nb\0'| (grep -oz 'a.*b' && printf 'st=%d chars=' $? >&2) | wc -c
  st=0 chars=0
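
For anyone who wants to compare on their own grep, here is a small sketch of the same test wrapped in a function. The name probe is made up for illustration. Unlike the commands above, it prints the status even when grep fails, because with && the printf only runs when the status is already 0.

```shell
# Hypothetical helper: feed the NUL-terminated record "a\nb" to grep with
# the given arguments, print grep's exit status on stderr and the number
# of bytes it wrote on stdout.
probe() {
    printf 'a\nb\0' |
        { grep "$@"; printf 'st=%d chars=' "$?" >&2; } |
        wc -c
}

probe -z 'a'        # literal match, no "." involved
probe -z 'a.*b'     # does "." match the newline?
probe -oz 'a.*b'    # same pattern, with -o
```

The results for the second and third calls are expected to vary between grep versions, which is the point of the comparison.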

-- 
André Majorel <http://www.teaser.fr/~amajorel/>
No one ever sends you any email ? Report a bug in Debian !

