
Re: which command I should use to extract the matching part out



lina wrote:
> Bob Proulx wrote:
> > lina wrote:
> >> aaa
> >> model 0
> >> bbb
> >> ddd
> >> model 1
> >> ccc
> >>
> >> I want to print out the parts which match the "model 0" and ends with
> >> match "model 1"
> >>
> >> for the final expected output is:
> >>
> >> model 0
> >> bbb
> >> ddd
> >
> > Try this:
> >
> >  sed -n '/^model 1/q;/^model 0/,$p'
> 
> Just realize the sed -n '/model 0/,/model 1/'p can also do that. (so
> newbie I was/am).

Not quite.  You said you wanted the output to include the model 0 line
but NOT the model 1 line.  Doing /model 0/,/model 1/p prints both of
the model lines including the model 1 line.  The way I did it won't.
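To see the difference concretely, here is a small sketch that runs both commands on the sample input from the question.  The sample function and the variable names are only for this illustration.

```shell
# Hypothetical helper: reproduces the sample input from the question.
sample() {
    printf 'aaa\nmodel 0\nbbb\nddd\nmodel 1\nccc\n'
}

# Range form: prints from "model 0" through "model 1", both included.
range_out=$(sample | sed -n '/model 0/,/model 1/p')

# Quit form: sed exits on "model 1" before that line can be printed.
quit_out=$(sample | sed -n '/^model 1/q;/^model 0/,$p')

printf 'range form:\n%s\n\nquit form:\n%s\n' "$range_out" "$quit_out"
```

The range form should end with the "model 1" line and the quit form should stop at "ddd".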

> just still don't understand above sentence. sed -n '/^model 1/q;/^model 0/,$p'

By default sed prints each line at the end of its processing cycle.
The -n option tells sed not to print lines by default.  Otherwise
every line would be printed.  Then for the lines you want to print
there is the explicit 'p' command to print that line.[1]
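A minimal sketch of how -n and 'p' work together, using made-up three-line input and an illustrative variable name:

```shell
# -n turns off the default printing; the address 2 with p prints line 2 only.
second=$(printf 'one\ntwo\nthree\n' | sed -n '2p')
echo "$second"
```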

Using /model 0/,/model 1/ means print the lines starting on the first
pattern and ending on the second pattern.  This prints the lines
between along with the starting and ending lines.

To avoid printing the ending pattern I did it differently.  Let's look
at the second part first.  /^model 0/,$p says to print from the model 0
line all of the way to the end of the file.  The '$' is used in place
of a line number and means the last line of the input.  I don't know
how long the file will be so I just print all lines starting with
/^model 0/.
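The '$' address can be tried on its own.  This is only a sketch on made-up input, and the variable names are for illustration:

```shell
# '$' alone addresses the last line of the input.
last=$(printf 'one\ntwo\nthree\n' | sed -n '$p')

# As the end of a range it means "from the match through the last line".
tail_part=$(printf 'one\ntwo\nthree\n' | sed -n '/two/,$p')

printf 'last line: %s\nfrom match to end:\n%s\n' "$last" "$tail_part"
```

The single quotes matter here.  Inside double quotes the shell would try to expand $p as a variable before sed ever saw it.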

Then to stop the printing where I want I use /^model 1/q which tells
sed that when it sees the model 1 line that it should quit (exit) and
when it exits it will of course no longer be printing.

The exit clause must come before the printing clause so that it won't
print the line that you didn't want printed.  If they were in the
other order then it would print the closing pattern line and then
execute the exit clause.
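The effect of the ordering can be checked by swapping the two clauses.  A sketch on the same sample input, with an illustrative variable name:

```shell
# Print clause first, quit clause second: the "model 1" line is still inside
# the range, so it gets printed before the q command is reached.
reversed=$(printf 'aaa\nmodel 0\nbbb\nddd\nmodel 1\nccc\n' |
    sed -n '/^model 0/,$p;/^model 1/q')
echo "$reversed"
```

With the clauses in this order the output should include the "model 1" line, which is what the original ordering avoids.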

Normally sed will re-arm a range after it ends.  If the input has
multiple matching sections then it will reset and print the next one
too.

  $ printf "a\nb\nCCC\nd\ne\na\nb\nCCC\nd\ne\n"
  a
  b
  CCC
  d
  e
  a
  b
  CCC
  d
  e

  $ printf "a\nb\nCCC\nd\ne\na\nb\nCCC\nd\ne\n" | sed -n '/b/,/d/p'
  b
  CCC
  d
  b
  CCC
  d

See how it reset above?  But if you only want to print the first set
then you would still want to 'q' to quit the program early.

  $ printf "a\nb\nCCC\nd\ne\na\nb\nCCC\nd\ne\n" | sed -n '/b/,/d/p;/d/q'
  b
  CCC
  d

Hope that helps,
Bob

[1] GNU sed differs slightly here from some classic Unix sed
programs.  In those Unix seds, if you use 'p' without -n the line is
only printed once.

On HP-UX (System V sed) for example:

  $ echo hello | sed p
  hello

GNU sed always prints by default, and if there is also a command to
print then it prints the line again.

  $ echo hello | sed p
  hello
  hello

For portable use never use p without also using -n.  That avoids
duplicated lines and behavior that would not be the same on a
traditional Unix system.
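As a sketch of the portable form (the variable name is only for illustration), pairing p with -n leaves exactly one copy of the line, because the default printing is switched off:

```shell
# With -n the only output comes from the explicit p command.
once=$(echo hello | sed -n p)
echo "$once"
```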
