Ron Johnson wrote:
> On Sun, 2005-03-06 at 02:16 +0100, Olle Eriksson wrote:
> > Can anyone help me with how to find the common textual denominator of an
> > array of strings. I have been searching the web and the man pages of
> > grep, awk etc to no avail.
> >
> > Given the following list of directory names I want to have a script return
> > "Eric_Clapton".
> >
> > Eric_Clapton-Big_Boss_Man-2CD-Retail-2002-DGN/
> > Eric_Clapton-Higher_Ground-(CDS)-2003-RNS/
> > Eric_Clapton_-_Me_and_Mr_Johnson-(PROPER)-CD-2004-TN/
> > Eric_Clapton-One_More_Car_One_More_Rider-2CD-2002-RNS/
> > Eric_Clapton - Pilgrim/
>
> You want a generic algorithm?

Assuming word splitting is ok and you want to avoid O(N^2) methods:

joey@dragon:~>cat foo
foo by Clapton, Eric
Eric_Clapton-Big_Boss_Man-2CD-Retail-2002-DGN/
Eric_Clapton-Higher_Ground-(CDS)-2003-RNS/
Eric_Clapton_-_Me_and_Mr_Johnson-(PROPER)-CD-2004-TN/
Eric_Clapton-One_More_Car_One_More_Rider-2CD-2002-RNS/
Eric_Clapton - Pilgrim/
joey@dragon:~>perl -e 'while (<>) { my %seen;
    foreach my $w (split /[^a-zA-Z0-9]/) { next unless length $w;
        $count{$w}++ unless $seen{$w}; $seen{$w}=1 } };
    foreach (keys %count) { $max=$count{$_} if $max < $count{$_} };
    foreach (keys %count) { print "$_\n" if $count{$_} == $max }' < foo
Clapton
Eric

-- 
see shy jo