
Useless use of shell pipelines [was: Useful use of dd]



On 7/2/21 12:49 AM, tomas@tuxteam.de wrote:

> Now to another pet peeve of mine: useless use of grep:
>
>    grep xxxx <myfile> | sed -e 's/bla/foo/' ...


Perhaps the underlying issue is useless use of shell pipelines ("The Unix Way"):

2021-07-02 12:44:43 dpchrist@dipsy ~/sandbox/perl
$ cat useless-use-of-grep
This is the first line: bla bla bla.
This is line xxxx: bla bla bla.
This is the last line: bla bla bla.

2021-07-02 12:45:38 dpchrist@dipsy ~/sandbox/perl
$ grep xxxx useless-use-of-grep | sed -e 's/bla/foo/'
This is line xxxx: foo bla bla.


Versus "all-in-one":

2021-07-02 12:45:39 dpchrist@dipsy ~/sandbox/perl
$ perl -ne 's/bla/foo/ && print if /xxxx/' useless-use-of-grep
This is line xxxx: foo bla bla.


The former is better from a "golfing" standpoint (i.e. fewer characters to type).


Interestingly, the benchmark results defy the common perceptions that "userland tools are fast" and "Perl is slow":

2021-07-02 12:59:55 dpchrist@dipsy ~/sandbox/perl
$ time for n in {1..10000}; do grep xxxx useless-use-of-grep | sed -e 's/bla/foo/' > /dev/null ; done

real	0m42.162s
user	0m32.925s
sys	0m45.787s

2021-07-02 13:00:45 dpchrist@dipsy ~/sandbox/perl
$ time for n in {1..10000}; do perl -ne 's/bla/foo/ && print if /xxxx/' useless-use-of-grep > /dev/null; done

real	0m23.624s
user	0m14.308s
sys	0m10.826s


Perhaps this is because the processing is trivial, so process startup dominates the run time, and a pipeline of two commands has twice the process creation cost of a single command (?).
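
One way to probe that guess would be to fold the grep into sed, so the same job runs as a single process with the same tool family. The following is only a sketch under assumptions: it recreates the three-line sample file in a temporary directory (the variable names are made up for illustration), and it has not been benchmarked here, so no timings are claimed.

```shell
# Recreate the sample file from above in a temporary directory.
tmpdir=$(mktemp -d)
sample="$tmpdir/useless-use-of-grep"
printf '%s\n' \
    'This is the first line: bla bla bla.' \
    'This is line xxxx: bla bla bla.' \
    'This is the last line: bla bla bla.' > "$sample"

# Two processes: grep selects the line, sed edits it.
pipeline_out=$(grep xxxx "$sample" | sed -e 's/bla/foo/')

# One process: the /xxxx/ address selects the line and the "p" flag
# prints it only when the substitution succeeded (same as the Perl
# one-liner).
sed_subst_only=$(sed -n '/xxxx/s/bla/foo/p' "$sample")

# One process: prints every selected line, substituted or not (same
# as the grep | sed pipeline).
sed_all_matches=$(sed -n '/xxxx/{s/bla/foo/;p;}' "$sample")

echo "$pipeline_out"
echo "$sed_subst_only"
echo "$sed_all_matches"

rm -r "$tmpdir"
```

On this sample all three print the same line. The two sed forms differ only for a line that contains xxxx but not bla: the pipeline and the second sed form would still print it, while the Perl one-liner and the first sed form would not. Wrapping either sed command in the same "time for n in {1..10000}" loop would show how much of the gap is due to the second process rather than to the tools themselves.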


David

