Re: Bash, sed: extracting regex subexpressions
> On Tue, May 27, 2008 at 4:49 PM, John O'Hagan <johnmohagan@gmail.com> wrote:
> > Hi,
> >
> > I've been looking for a command I can use in bash scripts that will do
> > something like this:
> >
> > $COMMAND(n[,m...]) (REGEX-1)(REGEX-2)[...] <($FILE)
> >
> > (MATCH-n)[(MATCH-m)...]
>
Thanks for the tips; they all work.
I timed each approach on a time-intensive task: finding palindromes within
words in a dictionary file $DICT, using the same regex in each case.
Below are the commands used and the times they took to execute:
while read i ; do
    [[ $i =~ '(.*((.)(.?)((.)\6?)\4\3).*)' ]] && echo $BASH_REMATCH ${BASH_REMATCH[2]}
done < $DICT
#real 1m41.239s
#user 1m17.383s
#sys 0m0.474s
--------
sed -nr 's/(.*((.)(.?)((.)\6?)\4\3).*)/\1 \2/p' $DICT
#real 1m6.151s
#user 0m46.763s
#sys 0m0.151s
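
To make the output format concrete, here is a minimal sketch that runs the same
sed expression on a three-word sample instead of $DICT. The function name
find_palindromes and the sample words are made up for illustration. It assumes
GNU sed, where -r enables extended regexes and back-references such as \6 are
accepted in them.

```shell
# Hypothetical wrapper around the sed command above; reads words on stdin.
# For each word containing a 3-to-6 character palindrome, prints the word
# (group 1) followed by the palindrome the regex captured (group 2).
find_palindromes() {
    sed -nr 's/(.*((.)(.?)((.)\6?)\4\3).*)/\1 \2/p'
}

# "cat" contains no palindrome, so nothing is printed for it.
printf '%s\n' cat noon puppy | find_palindromes
# prints:
# noon noon
# puppy pup
```

Because the leading .* is greedy, a word holding several palindromes reports
only one of them, so the second column is a match, not every match.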
--------
perl -ne '$_ =~ /(.*((.)(.?)((.)\6?)\4\3).*)/; print "$1, $2\n"' < $DICT
#real 0m16.381s
#user 0m4.660s
#sys 0m0.482s
--------
So I guess Perl is the clear winner, unless the above comparison is somehow
unfair?
Regards,
John