
Re: find'ing files containing certain words (all of them) ...



On 21 September 2013 19:22, Albretch Mueller <lbrtchx@gmail.com> wrote:

> the short bash script below you can use to find text files
> containing one word, but my attempts at trying to make it find more
> than one word within the same file haven't been successful

Your question is not at all specific to Debian, so really it is off-topic
here. OK, maybe you are using Debian, but the question is not *about*
the Debian distribution you happen to be using. Your question is
about bash scripts and grep, which run in many other places than
Debian. The big benefit for you in understanding this point is that
you will reach a more suitable audience and be more likely to get
the help you want if you find a forum about bash scripting, or grep,
and ask there.

The script you provided has an ugly style that I find hard to read, so I
spent 5 seconds looking at it and then decided it was not going to be
fun, and stopped. I mention this not to criticise you, but to help you
understand our conversation.

Also I find your question unclear. It took too much effort for me to
figure out exactly what you are asking, so I gave up and had to guess.
I guessed like this:

> You can find all files containing either "import" or
> "BufferedReader", but not both words in the same file.

I gather from this sentence that you want a script that can search text
files to find only files that contain all words in a set of words, and that
those words can occur in any order in the file.

> I want to do just one search per file

I think 'grep' cannot do this in only one invocation, because I
think it has no way to specify all words without giving them an order.
So I wrote the bash script below that might help you. It runs grep once
per word, narrowing the list of files each time, and prints only the
filenames that contain all words in wordlist.

It generates 3 example files tbm.txt, tb.txt, t.txt and searches for the
only one that contains all 3 words: "three" "blind" "mice".

#!/bin/bash

# require bash version 4 or later (mapfile was added in bash 4)
if (( BASH_VERSINFO[0] < 4 )) ; then
    printf "This script requires Bash version 4 or later\n" >&2
    exit 1
fi

# require nullglob set
shopt -s nullglob

# create some demo files
echo "three blind mice" >tbm.txt
echo "three blind" >tb.txt
echo "three" >t.txt

# files to search
files=( *.txt )

# words to search for
wordlist=( three blind mice )

# search files for each word
for word in "${wordlist[@]}" ; do
    if [ -n "${files[*]}" ] ; then
        # keep only files that contain current word
        # ("--" stops a word or filename starting with "-" being read as an option)
        mapfile -t files < <(grep -l -- "${word}" "${files[@]}")
    fi
done

# print remaining files (print nothing if no file matched)
if (( ${#files[@]} )) ; then
    printf -- "%s\n" "${files[@]}"
fi
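
If you want to reuse the idea, the same narrowing loop could be wrapped in
a function along these lines. This is only a sketch: the function name
files_with_all_words and the "--" separator between words and files are my
own invention, and the -F and -w options (literal, whole-word matching) are
an assumption about what you mean by "word". The script above matches
substrings instead. Like the script above, it assumes filenames do not
contain newlines.

```shell
#!/bin/bash

# Hypothetical helper (not part of the script above): print the files
# that contain every given word.
# Usage: files_with_all_words WORD... -- FILE...
files_with_all_words() {
    local -a words=() files=()
    local word

    # collect words up to the "--" separator
    while (( $# )) && [[ $1 != -- ]] ; do
        words+=( "$1" )
        shift
    done
    (( $# )) && shift    # drop the "--"; everything left is a file
    files=( "$@" )

    for word in "${words[@]}" ; do
        # give up as soon as no candidate files remain
        (( ${#files[@]} )) || return 1
        # -l: print names only, -F: literal string, -w: whole words only
        mapfile -t files < <(grep -lFw -- "${word}" "${files[@]}")
    done

    (( ${#files[@]} )) || return 1
    printf -- '%s\n' "${files[@]}"
}
```

With the three demo files from the script above in the current directory,
"files_with_all_words three blind mice -- *.txt" should print only tbm.txt.
It returns non-zero when no file contains all the words.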

