
Re: Shell game? was Re: pmount could perhaps be of greater utility?



On 5/9/2019 12:34 AM, David Wright wrote:
On Wed 08 May 2019 at 14:08:03 (+0800), KHMan wrote:
On Tue 07 May 2019 at 10:12:10 (+1000), David wrote:
On Mon, 6 May 2019 at 23:53, Erik Christiansen wrote:
On 06.05.19 09:03, Greg Wooledge wrote:
On Sat, May 04, 2019 at 01:48:01PM +0200, Jonas Smedegaard wrote:
[snipped all]
Hi Erik

Maybe you would enjoy answering this question then?
https://lists.gnu.org/archive/html/help-bash/2019-05/msg00000.html

Running the result of a command execution and allowing the result to
control delimiters, dropping out of the string? Now that gives me the
jeebies, security-wise. :-)

I think you can heave a sigh of relief as I think I can show that's
not happening after all. The trick is to add   set -x   to the top
of the script (and I've set -v as well). It does appear (I think)
that the contents of the backquotes are interpreted earlier than
my working showed:
[snip]

Good tip, set -x is useful. I only know simple bash scripting. I am actually doing this to fix shell syntax highlighting in the Scintilla edit control (used by Geany, Notepad++, SciTE, etc.), and in this case I want to get to the bottom of the behaviour rather than implement it without fully understanding _why_.

[snip]
But I'm not sure how to distinguish the order of the interpretation
of \\ and \" in the above.

You need 5 backslashes to get \\x; I was trying such snippets earlier in the week:

$ echo "[` echo \" \\\\\x \" `]"
++ echo ' \\x '
+ echo '[ \\x ]'
[ \\x ]

Going by my theory below, the inner `` string would be read as:
 echo " \\\x "
where \\ -> \ and \" -> ". Then it is executed, and there is another \\ -> \ due to the "", so when the "" string is translated into a literal string, it becomes:
 echo ' \\x '
and the rest follows.
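
A minimal sketch for checking that reading outside an interactive session, assuming bash is on the PATH. The variable names are mine, not from the thread. Each command is kept in a single-quoted string so the calling shell leaves the backslashes alone, then handed to bash -c. The interim form is run on its own and wrapped in [ ] for comparison:

```shell
# The original one-line test, exactly as typed at the prompt above.
one_shot='echo "[` echo \" \\\\\x \" `]"'

# The interim command the theory says the backquotes reduce to.
interim='echo " \\\x "'

full=$(bash -c "$one_shot")   # bash does both passes itself
inner=$(bash -c "$interim")   # only the second pass is left to do

# If the theory holds, the two lines printed here are identical.
printf '%s\n' "$full" "[${inner}]"
```

Going by the set -x trace above, both lines should read [ \\x ].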

For double quotes:

$ echo "[` echo \" \\" \" `]"
++ echo ' " '
+ echo '[ " ]'
[ " ]

The inner `` string is first read as:
 echo " \" "
because \\ -> \, and the " in \\" is just an ordinary character, since it is not an ending delimiter for the inner `` string. When executed, the " \" " string would be equivalent to the ' " ' literal string. The result follows.

If we try \\\":

$ echo "[` echo \" \\\" \" `]"
++ echo ' " '
+ echo '[ " ]'
[ " ]

Here, the inner `` string is first read as:
 echo " \" "
where \\ -> \ and \" -> ". When executed the " \" " string would again be equivalent to the ' " ' literal string. Final result is the same.

This would however cause an error:

$ echo "[` echo \" \\\\" \" `]"

The inner `` string is first read as:
 echo " \\" "
because of two \\ -> \ escapes. Then the " \\" " becomes ' \' plus an extra, unterminated ".

Five backslashes will fail too: the fifth pairs with the " as \" -> ", so the inner string is still:
 echo " \\" "

Six backslashes work, giving the interim string:
 echo " \\\" "
where " \\\" " ends up as ' \" ' and the result is as predicted.
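
The cases above can be swept in one go. This is only a sketch, assuming bash is on the PATH; try and its variables are hypothetical names of mine. It builds the test line with N backslashes in front of the bare ", runs it through bash -c, and treats anything on stderr as a failure of the inner command:

```shell
# Hypothetical helper: run the test line with $1 backslashes before the bare ".
try() {
  # Build a run of $1 backslashes.
  bs=$(printf '%*s' "$1" '' | tr ' ' '\\')
  # Single-quoted pieces keep the calling shell from touching the escapes.
  script='echo "[` echo \" '"$bs"'" \" `]"'
  out=$(bash -c "$script" 2>/dev/null)      # what the line prints
  err=$(bash -c "$script" 2>&1 >/dev/null)  # any complaint from bash
  if [ -n "$err" ]; then verdict=error; else verdict=ok; fi
  printf '%s backslashes: %s %s\n' "$1" "$verdict" "$out"
}

for n in 2 3 4 5 6; do
  try "$n"
done
```

If the reasoning above is right, 2, 3 and 6 come back ok and 4 and 5 come back as errors.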


I have since been studying the bash sources, and posted another query
yesterday, see:

http://lists.gnu.org/archive/html/help-bash/2019-05/msg00006.html

To summarize, consider our usual examples:
echo "[` echo \" \\" \" `]" A            # [ " ] A
echo "[` echo \" \\x \" `]" J            # [ \x ] J

Here's a theory: Inside the inner backquotes, \" gets escaped into "
because token processing sees the current delimiter as ". (But matched
pair processing sees the inner delimiters as ``.) The \\" becomes \"
and the \\x becomes \x. The inner commands are then run as:

echo " \" "
echo " \x "

I follow that. Unfortunately, set -x appears not to show the raw line
in that state, but interprets those outer double quotes and then
reports the line in its own single quotes.

giving the expected result. When entering the matched pair processing
function for the inner ``, the delimiter stack was not updated, so the
token function still sees the current delimiter as the outer one,
which is ".
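
Since set -x does not show that intermediate state, the first pass in the theory can be modelled roughly with sed. This is a sketch of the observed behaviour only, not of how bash implements it. first_pass and the variable names are hypothetical, and it assumes bash and a POSIX sed are available. The model drops a backslash when it precedes \, ", $ or `, then hands the result to bash -c as the second pass:

```shell
# Hypothetical model of the first pass: remove a backslash that precedes
# one of \ " $ ` and leave every other backslash in place.
first_pass() {
  printf '%s\n' "$1" | sed 's/\\\([\\"$`]\)/\1/g'
}

# Text between the backquotes in examples A and J.
inner_a=' echo \" \\" \" '
inner_j=' echo \" \\x \" '

stage1_a=$(first_pass "$inner_a")
stage1_j=$(first_pass "$inner_j")
printf 'A runs as:%s\n' "$stage1_a"
printf 'J runs as:%s\n' "$stage1_j"

# Second pass: execute the modelled command and wrap it like the outer echo.
model_a="[$(bash -c "$stage1_a")]"
model_j="[$(bash -c "$stage1_j")]"

# What bash itself prints for the complete lines.
real_a=$(bash -c 'echo "[` echo \" \\" \" `]"')
real_j=$(bash -c 'echo "[` echo \" \\x \" `]"')

printf '%s | %s\n' "$model_a" "$real_a"
printf '%s | %s\n' "$model_j" "$real_j"
```

If the model and bash agree on each line, that supports the reading that the inner text is unescaped under "" rules before it is run.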

So again it appears to involve the order of interpretation.

If you study the sources, bash does make string parsing calls recursively as expected for this kind of thing. The anomaly I see is at parse.y[3734] for bash-5.0. A call is made to parse the inner `` string while currently parsing a "" string, so it is nesting, _but_ the delimiter stack is not updated.

The read_token_word function uses the delimiter stack to determine escaping (see parse.y[4942] and parse.y[4961]) so inside that inner `` string, it is running the escape behaviour for "" strings. So I am trying to find out if it is intentional. Is the inner `` supposed to be semantically part of the outer "" string? If so, the additional level of escaping due to execution of the `` inner string serves to confuse matters a lot.


This is based on what I have studied in the sources, and it doesn't
make any sense to me from a syntax point-of-view, so I hope I can
eventually get a useful and definitive answer from the bash
maintainers.

Is backquote deprecated yet? :)

It doesn't matter: editor users will hit this scenario, the downstream editors will point their fingers at Scintilla, and bug tickets will be filed with the Scintilla project. So I am still planning to get some kind of answer from the bash folks.


I saw Greg's followup to your new post; it seems mainly aimed at
outlawing overlapping strings and allowing only nested ones.
I guess, then, that that does prevent delimiting a string by a
quote from one level paired with one "dropping out of" (returned by)
the inner command.

--
Cheers,
Kein-Hong Man (esq.)
Selangor, Malaysia

