
Re: cat and pipelines, mostly (was Re: Delete all after a pattern)



On Sat, Aug 31, 2019 at 10:39:00AM -0400, The Wanderer wrote:
> On 2019-08-31 at 10:07, Roberto C. Sánchez wrote:
> 
> I actually think this is good behavior. The only obvious places to put
> the cursor when doing command history are at the beginning of the line
> and at the end, and for convenience-of-editing purposes, the end seems
> obviously preferable to the beginning.
> 
> What possible shell behavior would you suggest that might be better in
> this regard?
> 
When viewing the immediately prior command in history, the cursor could
be placed at the same position on the line where it was when the
command was executed (most likely because the user pressed 'enter').

> >> The further from the rightmost position the part you want to edit
> >> is, the less convenient it is to do that editing, especially when
> >> doing multiple trial-and-error passes to figure out what syntax
> >> will actually produce the desired result.
> > 
> > Again, it sounds like a better shell is in order.
> 
> Again, better how?
> 
Better in that it behaves in the way you think it should.

> > Here is an alternative that places the interesting commands as far
> > to the right as possible:
> > 
> > $  i=~/test.txt; o=~/other_test.txt; (sed 's/config=.*$/config=/g' |
> > tr -d '=') <${i} >${o}
> 
> <snip>
> 
> > That minimizes the distance between the end of the processing
> > command pipeline and the end of the line.  You can add spaces before
> > and/or after ';', '(', and ')' to create more explicit word
> > boundaries if you like.
> 
> That's not a word boundary for the purposes of the keybindings with
> which I'm familiar (and which seem to come enabled by default, at least
> with bash; I think they actually come from readline). For that purpose,
> only alphanumerics count as part of a word, and only the boundary
> between a block of alphanumerics and whitespace counts as the edge of a
> word.
> 
> That means falling back to one-character-at-a-time cursor positioning,
> with the arrow keys, rather than being able to jump around in larger
> blocks. Doable, but less convenient than not needing to.
> 
That particular problem still exists in your preferred approach.

> (That syntax is also again more complicated to type than the simple form
> I tend to use.)
> 
But earlier you said that you type it once, then just hit the up arrow,
tweak a few characters, and execute again until you obtain the desired
result by trial and error.  In that case you are only typing the command
once.  Optimizing for simplicity in the one-time typing while passing up
other optimizations that yield a benefit on every edit seems rather
confusing to me.

> >> But having to jump back several stages along the command line, and
> >> not even to a point which is at the edge of a 'word' according to
> >> what (at least) the keybindings I'm familiar with recognize, is IMO
> >> not worth the tradeoff vs. saving a single process per invocation.
> > 
> > You'll notice that it's not about saving a process.  The better way 
> > involves a subshell '()' which will create a new process.
> 
> You're right, I had missed that.
> 
> In that case, I fail to see the benefit. As far as I can tell, saving a
> process is the entire sum-total benefit of avoiding the use of cat in
> this type of context.
> 
Then I am going to guess you have not had to maintain that many shell
scripts.  As with any programming language, lowering the cognitive
burden on the maintainer of a shell script should be a goal.  The more
time I have to spend separating functional logic (like the sed
transformation) from non-functional logic (like here is where the input
originates and here is where the output goes) the more likely I am to
make a logic error and also the longer tasks will take in general.

In any event, the benefit of saving the extra process can be realized by
Teemu's suggestion elsewhere in this thread of placing the input
redirection at the start of the command.
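
For illustration, here is a minimal sketch of what that form might look
like.  The temporary files and their sample contents are assumptions
made up for this example, and the exact command Teemu posted is not
reproduced here; only the sed and tr stages come from the command
quoted earlier.

```shell
# Hypothetical sample data; the file names and contents are assumptions.
i=$(mktemp)
o=$(mktemp)
printf 'Test config=a\nTest config=b\nTest config=c\n' >"$i"

# The input redirection comes first, so the line still reads
# input -> processing -> output, without cat and without a subshell.
<"$i" sed 's/config=.*$/config=/g' | tr -d '=' >"$o"

# Show the result, then clean up the temporary files.
result=$(cat "$o")
printf '%s\n' "$result"
rm -f "$i" "$o"
```

The shell accepts a redirection before the command name, so sed reads
the file directly on its standard input and no extra process is needed.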

> >>> $ cat ~/other_test.txt 
> >>> Test config
> >>> Test config
> >>> Test config
> >>> 
> >>> Now I can add pipe stages within the sub-shell to my heart's
> >>> content and I can even do other things like replace "<~/test.txt"
> >>> with "<$(some other command that queries a database)" so that the
> >>> input does not even need to come from a real file.
> >> 
> >> Isn't it just as easy to replace 'cat test.txt |' with 'some other 
> >> command that queries a database |' ?
> > 
> > It is just as easy. However, 'cat' is not a value-added part of the 
> > processing pipeline. So, why have it at all?
> 
> Because of the value which is lost by avoiding it, which including it
> preserves.
> 
> That value consists of both the convenience of editing from the end of
> the line, and the intuitiveness of having the input come in at the start
> and the output go out at the end.
> 
You're repeating yourself here.

If proximity to the end of the command line is a priority for the
functional logic, then do you use a variable to contain the name of the
output file?

o=~/out.txt; ..... >${o}

Even for short file names that saves you several characters.  For long
file names the savings are even better.

It seems like your argument is meant to justify the specific constructs
you like, even when, by your own criteria, there are better constructs
available.

That's fine, if you like your constructs, then keep using them.
However, it is somewhat disingenuous to argue a point (like you want
your functional command as close to the end of the command line as you
can get it) then ignore a better approach that actually gets closer to
your ideal than your current approach.

> You seem to disagree that the latter is beneficial, or even necessarily
> intuitive, so of course you prefer different constructs. There's nothing
> wrong with that.
> 

Agreed.

> I just get tired of having my own preferred constructs labelled
> incorrect and inferior, rather than simply suited for different
> interpretations of intuitiveness.
> 
Except that you are not simply arguing, "I like it this way because it
is my preference."  You are arguing, "this way is better for these
reasons."  Then when you are shown alternatives that better accomplish
the things you claim are better about your approach, you repeat your
arguments.

> >> And that approach preserves the intuitiveness of having the input
> >> be specified at the start of the command line, and the output at
> >> the end, instead of the input and the output both being specified
> >> at the end.
> > 
> > What is intuitive is not always right or best.  It is better to
> > properly learn the features and functions of the shell or other
> > environment so that proper separation can be made between business
> > logic and supporting structures.
> 
> What makes it "proper"?
> 
> As far as I can tell, it's convenience and intuitiveness which define
> "proper" in this context.
> 
Perhaps.  However, it seems that we have differing views on convenience
and intuitiveness, which is fine.

> > If the command being worked on ends up in a script it is much easier
> > to make the better form I suggested readable and maintainable than it
> > is with the more 'intuitive' version.
> 
> Actually, some parts of my position on this come from work in writing
> scripts.
> 
> I started out one particular script using a form which avoids the use of
> cat, with a multi-stage pipeline split onto multiple lines, so that I
> could move lines around conveniently with kill-and-yank (and have the
> result show up readably in diffs).
> 
> Then I discovered that when I needed to insert a new command at the
> beginning of the pipeline, I couldn't just kill-and-yank, I had to also
> edit the syntax of the previous first line.
> 
> So I inserted a cat invocation as that first line, and thereafter I
> could move lines around just as intended, without needing to touch the
> first line again.
> 
The same effect could have been achieved by making the first line a '('
and the last as ')' for the subshell enclosure.  However, your original
description mentioned nothing regarding working in a multi-line context.
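
As a rough sketch of that multi-line layout (the temporary files and
sample contents are assumptions for the example; the sed and tr stages
are the ones from the command quoted earlier):

```shell
# Hypothetical sample data; the file names and contents are assumptions.
i=$(mktemp)
o=$(mktemp)
printf 'Test config=a\nTest config=b\nTest config=c\n' >"$i"

# '(' opens the subshell on its own line and ')' closes it, carrying
# both redirections.  Each pipeline stage sits on its own line.
(
  sed 's/config=.*$/config=/g' |
  tr -d '='
) <"$i" >"$o"

# Show the result, then clean up the temporary files.
result=$(cat "$o")
printf '%s\n' "$result"
rm -f "$i" "$o"
```

A new stage can be inserted as the first line inside the parentheses
without touching either redirection.  The final stage still has to omit
the trailing '|', which is equally true of the form that starts with
'cat'.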

Don't misunderstand me.  I sometimes start interactive command lines
with 'cat' followed by a pipe.  When I do that it is out of convenience
more than anything else.  However, if I am writing something intended
for a script and/or trying to optimize for the criteria you earlier
described, then starting with 'cat' is just unnecessary clutter that
makes it harder to quickly discern what is really happening.

Regards,

-Roberto

-- 
Roberto C. Sánchez

