
Re: Uniq is not unique ?



"Chris Henry" <chrishenry.ni@gmail.com> writes:

> Uniq only filters consecutive repeated lines, e.g.
> 
> A
> A
> B
> A
> 
> will become
> 
> A
> B
> A
> 
> If you need it to filter such that only 1 unique line remains, you
> will need to sort first then pipe to uniq (not a good solution for
> really large files).
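
A minimal shell sketch of the behaviour described above, using the same
A/A/B/A input. The function names (sample, adjacent_only, sorted_unique)
are made up for illustration; only uniq and sort themselves come from
the discussion.

```shell
# Hypothetical helper: emits the sample input A, A, B, A, one per line.
sample() { printf 'A\nA\nB\nA\n'; }

# uniq on its own collapses only adjacent duplicates, so the last A survives.
adjacent_only() { sample | uniq; }

# Sorting first makes all duplicates adjacent, so one copy of each line
# remains, but the original line order is lost.
sorted_unique() { sample | sort | uniq; }

adjacent_only    # prints A, B, A
sorted_unique    # prints A, B
```

With POSIX sort, "sort -u" should give the same result as "sort | uniq"
in a single step.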

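I sometimes need to filter repeated lines that are not consecutive,
and I use the following simple Perl script for this purpose.  It keeps
the first occurrence of each line and preserves the input order.  It
runs reasonably fast even for large files (a few tens of MB), though it
holds every distinct line in a hash, so memory use grows with the number
of unique lines: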

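#!/usr/bin/perl
use strict;
use warnings;

my %seen;    # lines that have already been printed

# Read from the files named on the command line, or from stdin.
while (<>) {
    if (!$seen{$_}) {
        $seen{$_} = 1;    # remember the line the first time it appears
        print;
    }
}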
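
For comparison, the same first-occurrence filter can probably be written
as an awk one-liner.  This is not part of the script above, just an
equivalent sketch that assumes a POSIX awk is installed; the wrapper name
dedupe_keep_order is hypothetical.

```shell
# Hypothetical wrapper around the awk idiom.
# seen[$0]++ evaluates to 0 the first time a line appears, so !seen[$0]++
# is true only then, and awk's default action prints that line.
dedupe_keep_order() { awk '!seen[$0]++' "$@"; }

# Reads stdin when no file names are given.
printf 'A\nA\nB\nA\n' | dedupe_keep_order    # prints A, B
```

Like the Perl version, this keeps one array entry per distinct line, so
the memory trade-off is the same.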

HTH,
urs

