
[checks/scripts] script_is_evil_and_wrong() and bashisms update



Hi,

I've attached a few patches against current lintian SVN which update
checks/scripts to incorporate some recent changes to checkbashisms.
I've left out a few changes, such as ignoring multi-line quoted text and
checking for invalid function names, as I'd like to see how they stand
up to an archive-wide test first.

I've included what I hope is sufficient description of each to allow you
to judge whether you'd be happy to apply them or run screaming in the
opposite direction. :-)

heredoc-backslash.diff
----------------------

Expands heredoc detection to handle

        cat << \EOT
          foo
          bar
        EOT

I'm not sure /why/ you'd want to do the above, but it doesn't appear to
be uncommon (particularly in various parts of git). I can't see anything
in POSIX that disallows it, and both dash and posh support it.
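
For what it's worth, one plausible reason for the construct: in POSIX sh,
quoting any part of the heredoc delimiter (with a backslash or with
quotes) turns off expansion in the body. A minimal sketch of the
difference, not taken from git or lintian; the variable names are made up
for illustration:

```sh
#!/bin/sh
name=world

# Unquoted delimiter: the body undergoes parameter expansion.
expanded=$(cat << EOT
hello $name
EOT
)

# Backslash-quoted delimiter: the body is taken literally, as with 'EOT'.
literal=$(cat << \EOT
hello $name
EOT
)

# prints "hello world" and then "hello $name"
printf '%s\n' "$expanded" "$literal"
```

So `<< \EOT` behaves like `<< 'EOT'`, and the check needs to pick up EOT
as the terminator in both cases.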

quoted-quotes.diff
------------------

Removes "quoted quotes" (i.e. "'" and '"') from strings before removing
single-quoted strings. This was added to handle this from libtool's
ltmain.sh (specifically /usr/bin/freehdl-libtool from the freehdl
package):

        && $echo "X$libobj" | grep '[]~#^*{};<>?"'"'"'        &()|`$[]' \

After applying the patch, the above is reduced to

         && $echo "X$libobj" | grep '[]~#^*{};<>?        &()|`$[]' \

and thus to

         && $echo "X$libobj" | grep '' \

once single-quoted strings have been removed.
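
For reference, the grep argument above is three adjacent quoted pieces
which the shell joins into a single word, so the "'" in the middle is
just a literal single quote. A small sketch showing what grep actually
receives (the variable name is only for illustration):

```sh
#!/bin/sh
# Piece 1: '[]~#^*{};<>?"'   single-quoted, ends in a literal double quote
# Piece 2: "'"               double-quoted, a literal single quote
# Piece 3: '        &()|`$[]' single-quoted
pattern='[]~#^*{};<>?"'"'"'        &()|`$[]'

# Prints the joined word with both quote characters in the middle.
printf '%s\n' "$pattern"
```

Nothing in the joined word is a bashism, which is why dropping the
quoted quotes before stripping single-quoted strings loses nothing the
checks care about.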

scripts_are_more_evil_and_wrong.diff
------------------------------------

The archive-wide checkbashisms runs have revealed more methods of
disguising something as a shell script than script_is_evil_and_wrong()
currently catches.

This diff includes the regex fix from #471333 together with

- Allow arguments to eval to be double- as well as single-quoted
- Increase the number of lines scanned

Both of the above were added to match line 52 of
bastille-firewall-schedule:

        eval "exec ${PERL} -x $0 $*"

- Match lines execing $var if $var has previously been assigned the value
of $0. Again, I don't know why, but /usr/bin/git-citool uses this
construct:

        #!/bin/sh
        # Tcl ignores the next line -*- tcl -*- \
         if test "z$*" = zversion \
         || test "z$*" = z--version; \
         then \
        	echo 'git-gui version 0.9.3.1.g21623'; \
        	exit; \
         fi; \
         argv0=$0; \
         exec '/usr/bin/wish8.5' "$argv0" -- "$@"
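
The scanning logic can be sketched roughly as follows. This is a
deliberately simplified stand-in written in sh and awk, not the Perl in
the patch: the function name is hypothetical, and the matching is looser
than the real regex (it only asks for "exec", the remembered variable
and $@ or $* on the same line).

```sh
#!/bin/sh
# Hypothetical helper for illustration only; lintian's real check is the
# Perl function script_is_evil_and_wrong() in checks/scripts.
is_wrapped_script() {
    awk -v var=0 '
        /^#/ || /^$/ { next }        # comments and blank lines do not count
        ++n > 55     { exit }        # only the head of the file is scanned
        /^[ \t]*[A-Za-z0-9_]+=\$0;/ {
            # "name=$0;" seen: from now on look for exec of $name
            var = $0
            sub(/^[ \t]*/, "", var)
            sub(/=.*/, "", var)
            next
        }
        /exec[ \t]/ && index($0, "$" var) && /\$[@*]/ { found = 1; exit }
        END { exit !found }
    ' "$1"
}
```

It returns success when the file looks like a wrapper that re-executes
itself under another interpreter, e.g. `is_wrapped_script
/usr/bin/git-citool && echo "not really a shell script"`.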
        
bashisms.diff
-------------

- Update $FOO checks to also match ${FOO}
- Allow a space between >& and the file descriptor
- Move the herestring (<<<) check to string bashisms to allow matching

        bar="$(cut '-d|' -f2 <<< "$foo")"
        
- Enhance the "read without variable" test to also catch attempts to
pass options other than -r
- Add checks for $SECONDS and $BASH_*
- Add checks for suspend, caller, complete, compgen, declare, typeset,
disown, builtin, set -[BHT], alias -p, unalias -a, local with options or
an assigned value (i.e. "local -foo" or "local foo=bar") and VAR+=foo
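
For a few of these, the portable spelling is straightforward. A short
sketch of sh replacements, assuming nothing beyond POSIX utilities; the
variable names and values are invented for the example:

```sh
#!/bin/sh
foo='a|b|c'

# Instead of: bar="$(cut '-d|' -f2 <<< "$foo")"
bar="$(printf '%s\n' "$foo" | cut '-d|' -f2)"

# Instead of: path+=":/opt/bin"
path=/usr/bin
path="${path}:/opt/bin"

# Instead of: $EUID
uid=$(id -u)

# Instead of: read with no variable, or with options other than -r
printf 'line one\n' | { read -r line; printf '%s\n' "$line"; }

printf '%s\n' "$bar" "$path"
```

With the values above, bar ends up as "b" and path as
"/usr/bin:/opt/bin".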

Regards,

Adam
--- scripts.orig	2008-03-25 19:04:56.000000000 +0000
+++ scripts	2008-03-25 20:20:44.000000000 +0000
@@ -627,7 +627,7 @@
 
 		# Only look for the beginning of a heredoc here, after we've
 		# stripped out quoted material, to avoid false positives.
-		if (m/(?:^|[^<])\<\<\s*[\'\"]?(\w+)[\'\"]?/) {
+		if (m/(?:^|[^<])\<\<\s*[\'\"\\]?(\w+)[\'\"]?/) {
 		    $cat_string = $1;
 		}
 	    }
--- scripts.orig	2008-03-25 19:04:56.000000000 +0000
+++ scripts	2008-03-25 22:00:47.000000000 +0000
@@ -598,6 +598,13 @@
 		# argument to grep or the like.
 		my $line = $_;
 		unless ($found) {
+		    # Remove "quoted quotes". They're likely to be inside
+		    # another pair of quotes; we're not interested in
+		    # them for their own sake and removing them makes finding
+		    # the limits of the outer pair far easier.
+		    $line =~ s/(^|[^\\\'\"])\"\'\"/$1/g;
+		    $line =~ s/(^|[^\\\'\"])\'\"\'/$1/g;
+
 		    $line =~ s/(^|[^\\](?:\\\\)*)\'(?:\\.|[^\\\'])+\'/$1''/g;
 		    for my $re (@bashism_string_regexs) {
 			if ($line =~ m/($re)/) {
--- scripts.orig	2008-03-25 19:04:56.000000000 +0000
+++ scripts	2008-03-25 19:13:19.000000000 +0000
@@ -736,18 +736,22 @@
     my $ret = 0;
     open (IN, '<', $filename) or fail("cannot open $filename: $!");
     my $i = 0;
+    my $var = "0";
     local $_;
     while (<IN>) {
         chomp;
 	next if /^#/o;
 	next if /^$/o;
-        last if (++$i > 20);
-        if (/(^\s*|\beval\s*\'|;\s*)exec\s*.+\s*.?\$0.?\s*(--\s*)?(\${1:?\+)?.?\$(\@|\*)/o) {
-            $ret = 1;
-            last;
-        }
+	last if (++$i > 55);
+	if (/(^\s*|\beval\s*[\'\"]|;\s*)exec\s*.+\s*.?\$$var.?\s*(--\s*)?.?(\${1:?\+)?\$(\@|\*)/) {
+	    $ret = 1;
+	    last;
+	} elsif (/^\s*(\w+)=\$0;/) {
+	    $var = $1;
+	}
     }
     close IN;
+
     return $ret;
 }
 
--- scripts.orig	2008-03-25 19:04:56.000000000 +0000
+++ scripts	2008-03-25 22:54:52.000000000 +0000
@@ -551,11 +551,14 @@
 		  '\$\{!\w+[\@*]\}',	       # ${!prefix[*|@]}
 		  '\$\{!\w+\}',		       # ${!name}
 		  '(\$\(|\`)\s*\<\s*\S+\s*(\)|\`)', # $(\< foo) should be $(cat foo)
-		  '\$RANDOM\b',		       # $RANDOM
-		  '\$(OS|MACH)TYPE\b',         # $(OS|MACH)TYPE
-		  '\$HOST(TYPE|NAME)\b',       # $HOST(TYPE|NAME)
-		  '\$DIRSTACK\b',              # $DIRSTACK
-		  '\$EUID\b',                  # $EUID should be "id -u"
+		  '\$\{?RANDOM\}?\b',	       # $RANDOM
+		  '\$\{?(OS|MACH)TYPE\}?\b',   # $(OS|MACH)TYPE
+		  '\$\{?HOST(TYPE|NAME)\}?\b', # $HOST(TYPE|NAME)
+		  '\$\{?DIRSTACK\}?\b',        # $DIRSTACK
+		  '\$\{?EUID\}?\b',             # $EUID should be "id -u"
+		  '\$\{?SECONDS\}?\b',         # $SECONDS
+		  '\$\{?BASH_[A-Z]+\}?\b',     # $BASH_SOMETHING
+		  '<<<',                       # <<< here string
 		);
 		my @bashism_regexs = (
 		  'function \w+\(\s*\)',       # function is useless
@@ -565,11 +568,11 @@
 		  '\s(\|\&)',		       # pipelining is not POSIX
 		  '[^\\\]\{([^\s]+?,)+[^\\\}\s]+\}', # brace expansion
 		  '(?:^|\s+)\w+\[\d+\]=',      # bash arrays, H[0]
-		  '(?:^|\s+)read\s*(?:;|$)',   # read without variable
+		  '(?:^|\s+)(read\s*(-[^r])?(?:;|$))', # should be read [-r] variable
 		  '(?:^|\s+)kill\s+-[^sl]\w*', # kill -[0-9] or -[A-Z]
 		  '(?:^|\s+)trap\s+["\']?.*["\']?\s+.*[1-9]', # trap with signal numbers
 		  '\&>',		       # cshism
-		  '(<\&|>\&)\s*((-|\d+)[^\s;|\)\`&]|[^-\d])', # should be >word 2>&1
+		  '(<\&|>\&)\s*((-|\d+)[^\s;|\)\`&]|[^-\d\s])', # should be >word 2>&1
 		  '\[\[(?!:)',		       # alternative test command
 		  '(?:^|\s+)select\s+\w+',     # 'select' is not POSIX
 		  '\$\(\([A-Za-z]',	       # cnt=$((cnt + 1)) does not work in dash
@@ -578,7 +581,20 @@
 		  '(?:^|\s+)let\s',	       # let ...
 		  '(?<![\$\(])\(\(.*\)\)',     # '((' should be '$(('
 		  '(\[|test)\s+-a',	       # test with unary -a (should be -e)
-		  '<<<',                       # <<< here string
+		  '(?:^|\s+)suspend\s',        # suspend
+		  '(?:^|\s+)caller\s',         # caller
+		  '(?:^|\s+)complete\s',       # complete
+		  '(?:^|\s+)compgen\s',        # compgen
+		  '(?:^|\s+)declare\s',        # declare
+		  '(?:^|\s+)typeset\s',        # typeset
+		  '(?:^|\s+)disown\s',         # disown
+		  '(?:^|\s+)builtin\s',        # builtin
+		  '(?:^|\s+)set\s+-[BHT]+',    # set -[BHT]
+		  '(?:^|\s+)alias\s+-p',       # alias -p
+		  '(?:^|\s+)unalias\s+-a',     # unalias -a
+		  '(?:^|\s+)local\s+-[a-zA-Z]+', # local -opt
+		  '(?:^|\s+)local\s+\w+=',     # local foo=bar
+		  '(?:^|\s+)\w+\+=',           # should be VAR="${VAR}foo"
 		);
 
 		# since this test is ugly, I have to do it by itself
