[checks/scripts] [PATCH 2/4] heredoc detection



Hi,

This is the second in a series of patches that update the bashism
checks in checks/scripts with changes that have been made to
checkbashisms over the past few months.

This patch assumes (due to the context included in the diff) that the
previous patch has already been applied.

Hopefully all the patches are self-explanatory, but please let me know
if there are any questions or issues.

Regards,

Adam

heredoc.diff
----------------------

1) Expands heredoc detection to handle

        cat << \EOT
          foo
          bar
        EOT

I'm not sure /why/ you'd want to do the above, but it doesn't appear to
be uncommon (particularly in various parts of git). I can't see anything
in POSIX that disallows it, and both dash and posh support it.

2) Allows the use of heredoc delimiters containing non-word characters
so long as they're quoted. (Seen "in the wild" in a script using "<<
';'").

3) Matches heredocs using <<-

4) Requires the end of a heredoc to be a line containing only the
delimiter; POSIX specifies that the line "foobar" does not end a heredoc
delimited by "foo".

5) Allows heredoc delimiters to contain regex meta-characters; the
delimiter is now quoted with \Q...\E when matching the closing line.

6) Adds tests for all of the above and includes the fix I mentioned
in <1213304569.14324.50.camel@kaa.jungle.aubergine.my-net-space.net>
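
As a rough illustration of cases 1) to 4), here is a small POSIX sh sketch
of the heredoc forms the check needs to recognise. It is not part of the
patch, and the variable names (name, literal, semi, stripped, exact) are
made up for the example. The <<- case is fed to a child sh via printf so
that the leading tabs survive mail clients that mangle whitespace.

```shell
#!/bin/sh
# Hypothetical variable, used only to show whether expansion happens.
name=world

# 1) Backslash-quoted delimiter: the body is taken literally, so $name
#    is NOT expanded.
literal=$(cat << \EOT
hello $name
EOT
)

# 2) Quoted delimiter consisting of a non-word character.
semi=$(cat <<';'
hello $name
;
)

# 3) <<- strips leading tabs from the body and from the delimiter line.
#    printf generates the tabs (\t) and a child sh runs the snippet.
stripped=$(printf 'cat <<-EOF\n\thello\n\tEOF\n' | sh)

# 4) A line that merely starts with the delimiter ("foobar") does not end
#    the heredoc; only a line consisting of exactly "foo" does.
exact=$(cat <<foo
hello $name
foobar
foo
)

printf '%s\n' "$literal" "$semi" "$stripped" "$exact"
```

In cases 1) and 2) the quoting also suppresses expansion of the body, which
is one plausible reason for scripts to use those forms.
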
--- checks/scripts.orig	2008-06-13 11:11:31.000000000 +0100
+++ checks/scripts	2008-06-13 11:17:31.000000000 +0100
@@ -530,7 +530,7 @@
 	}
 
 	if ($shellscript) {
-	    if ($cat_string ne "" and m/^$cat_string/) {
+	    if ($cat_string ne "" and m/^\Q$cat_string\E$/) {
 		$cat_string = "";
 	    }
 	    my $within_another_shell = 0;
@@ -595,6 +595,22 @@
 		# Ignore anything inside single quotes; it could be an
 		# argument to grep or the like.
 		my $line = $_;
+
+		# $cat_line contains the version of the line we'll check
+		# for heredoc delimiters later. Initially, remove any
+		# spaces between << and the delimiter to make the following
+		# updates to $cat_line easier.
+		my $cat_line = $line;
+		$cat_line =~ s/(<\<-?)\s+/$1/g;
+
+		# Remove single quoted strings, with the exception that we
+		# don't remove the string
+		# if the quote is immediately preceded by a < or a -, so we
+		# can match "foo <<-?'xyz'" as a heredoc later
+		# The check is a little more greedy than we'd like, but the
+		# heredoc test itself will weed out any false positives
+		$cat_line =~ s/(^|[^<\\\"-](?:\\\\)*)\'(?:\\.|[^\\\'])+\'/$1''/g;
+
 		unless ($found) {
 		    # Remove "quoted quotes". They're likely to be inside
 		    # another pair of quotes; we're not interested in
@@ -615,6 +631,7 @@
 
 		# We've checked for all the things we still want to notice in
 		# double-quoted strings, so now remove those strings as well.
+		$cat_line =~ s/(^|[^<\\'\\-](?:\\\\)*)\"(?:\\.|[^\\\"])+\"/$1""/g;
 		unless ($found) {
 		    $line =~ s/(^|[^\\\'](?:\\\\)*)\"(?:\\.|[^\\\"])+\"/$1""/g;
 		    for my $re (@bashism_regexs) {
@@ -632,8 +649,9 @@
 
 		# Only look for the beginning of a heredoc here, after we've
 		# stripped out quoted material, to avoid false positives.
-		if (m/(?:^|[^<])\<\<\s*[\'\"]?(\w+)[\'\"]?/) {
+		if ($cat_line =~ m/(?:^|[^<])\<\<\-?\s*(?:[\\]?(\w+)|[\'\"](.*?)[\'\"])/) {
 		    $cat_string = $1;
+		    $cat_string = $2 if not defined $cat_string;
 		}
 	    }
 	    if (!$cat_string) {
--- testset/maintainer-scripts/debian/postinst.orig	2008-06-13 10:38:49.000000000 +0100
+++ testset/maintainer-scripts/debian/postinst	2008-06-13 11:18:33.000000000 +0100
@@ -104,10 +104,37 @@
 EOF
 
 # But this isn't.
-cat '>>EOF'
+cat '<<EOF'
 echo "All of the array is: ${H[@]}"
 EOF
 
+# This is a heredoc
+cat <<-EOF
+echo "All of the array is ${H[@]}"
+EOF
+
+# As is this
+cat <<';'
+echo "All of the array is ${H[@]}"
+;
+
+# and this
+cat <<foo
+echo "All of the array is ${H[@]}"
+foobar
+echo $HOSTNAME
+foo
+
+# and again
+cat <<\bar
+echo "All of the array is ${H[@]}"
+bar
+
+# yet another
+cat <<"x++"
+echo "All of the array is ${H[@]}"
+x++
+
 # Recognize single quotes even if they start at the beginning of a line.
 echo not a bashism \
 '/{ptex,tex}/{amstex,plain,generic,}'
--- testset/tags.maintainer-scripts.orig	2008-06-13 10:39:25.000000000 +0100
+++ testset/tags.maintainer-scripts	2008-06-13 11:19:19.000000000 +0100
@@ -37,7 +37,7 @@
 W: maintainer-scripts: ancient-dpkg-multi-conrep-check preinst:10
 W: maintainer-scripts: ancient-dpkg-predepends-check preinst:7
 W: maintainer-scripts: config-does-not-load-confmodule
-W: maintainer-scripts: deprecated-chown-usage postinst:138 'chown -R root.root'
+W: maintainer-scripts: deprecated-chown-usage postinst:165 'chown -R root.root'
 W: maintainer-scripts: deprecated-chown-usage postinst:33 'chown root.root'
 W: maintainer-scripts: gconftool-used-in-maintainer-script postinst:68
 W: maintainer-scripts: init.d-script-not-marked-as-conffile /etc/init.d/foo
@@ -50,16 +50,16 @@
 W: maintainer-scripts: missing-debconf-dependency
 W: maintainer-scripts: no-debconf-templates
 W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:108 '${H[@]}'
-W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:116 'echo -e'
-W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:117 '${!foo}'
-W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:119 'select foo'
-W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:122 '    exec -l'
-W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:123 '    exec -c'
-W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:124 '    exec -a'
-W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:126 'let '
-W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:127 'test -a'
-W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:128 '$RANDOM'
+W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:143 'echo -e'
+W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:144 '${!foo}'
+W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:146 'select foo'
+W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:149 '    exec -l'
 W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:15 '. /usr/share/lintian/shell foo'
+W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:150 '    exec -c'
+W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:151 '    exec -a'
+W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:153 'let '
+W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:154 'test -a'
+W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:155 '$RANDOM'
 W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:18 'read'
 W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:20 'H[0]='
 W: maintainer-scripts: possible-bashism-in-maintainer-script postinst:21 '${H[0]}'
@@ -79,5 +79,5 @@
 W: maintainer-scripts: postrm-does-not-purge-debconf
 W: maintainer-scripts: postrm-has-useless-call-to-ldconfig
 W: maintainer-scripts: read-in-maintainer-script postinst:18
-W: maintainer-scripts: start-stop-daemon-in-maintainer-script postinst:132
+W: maintainer-scripts: start-stop-daemon-in-maintainer-script postinst:159
 W: maintainer-scripts: update-alternatives-remove-called-in-postrm
