
Re: delete lines that contain duplicated column items




On 4/4/07, Ken Irving <fnkci@uaf.edu> wrote:
On Tue, Apr 03, 2007 at 08:19:10PM +0800, Jeff Zhang wrote:
> I have a simple txt file, like:
> ...
> a a
> aa a
> b b
> ba b
> ...
>
> I want to keep only the first line for each value in column 2, and delete any
> later lines whose column 2 duplicates one already seen.
> Then it would look like:
> ...
> a a
> b b
> ...
>
> I've tried `uniq -f1`  but it didn't work.

awk's associative arrays (or perl's hashes) are good for this
sort of thing, e.g.,

  $ awk '!seen[$2]{print; seen[$2]=1}' < file

awk scans input line by line, checks for match condition(s), and
performs the associated actions.  Here, if the 2nd column value hasn't
been seen, print it and record it as seen.
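To see it run, here is a self-contained sketch using the sample data from the
thread, piped in directly rather than read from a file:

```shell
# Sample data from the original post, fed straight into awk
printf 'a a\naa a\nb b\nba b\n' |
awk '!seen[$2] { print; seen[$2] = 1 }'
# prints:
# a a
# b b
```

The shorter idiom `awk '!seen[$2]++'` is equivalent: the post-increment runs
after the test, so the condition is true (and the default print action fires)
only on the first occurrence of each column-2 value. Unlike `uniq`, this
works even when the duplicate lines are not adjacent.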

--
Ken Irving

Got it!
This is much simpler :)
