
All true assertions in a bash find one liner? ...



 I need to find all files whose names match a pattern and whose content
contains a certain string, and then print some metadata for those files
with -printf, a la:

 find "${_SDIR}" -type f -iregex ".*${_X}" -printf '"%TD %TT",%Ts,%s,"%P"\n' > "${_TMPFL}" 2>&1

 I am trying to do all the steps in one go, which I think should be
possible, but it is not working. That line prints both all the files
matching the pattern, in the format given to -printf, and (repeatedly)
the files whose content matches the search, as bare paths (no -printf).

 I know I am being silly, since that statement in fact involves two
searches, one on the metadata and one through the content of the files
matching the pattern, but you can see what I mean. There should be a way
to do it in one fell swoop. Or perhaps there is another utility for that.
The thing is that I work on corpora research, and very often you need
to search large numbers of text files in no time.
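
 A possible single-pass version, as a sketch only: it assumes GNU find
and grep, and the function name find_matching is made up for
illustration. find evaluates its expression left to right with an
implicit "and", and -exec ... \; is true only when the command exits
with status 0, so grep -q can act as one more test placed before
-printf:

```shell
# find_matching DIR REGEX STRING  (hypothetical helper, not part of the original script)
# Prints one CSV line per regular file under DIR whose path matches REGEX
# (case-insensitively) and whose content contains the literal STRING.
find_matching() {
  local sdir=$1 regex=$2 needle=$3
  # -exec ... \; succeeds only if grep finds the string, so -printf
  # runs just for files that pass both the name test and the content test.
  find "${sdir}" -type f -iregex ".*${regex}" \
    -exec grep -qF -- "${needle}" {} \; \
    -printf '"%TD %TT",%Ts,%s,"%P"\n'
}
```

 With the variables used in the script further down, it would be called
as find_matching "${_SDIR}" "${_X}" "${_W}" > "${_LOG_FL}". Note that
this starts one grep process per candidate file, which may matter on
very large trees.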

 Also, instead of the lines produced by the first search (by extension)
looking like:

"12/15/18 12:14:16.0000000000",1544872456,2542,"OK/OK00/OK00Test.java"
"12/15/18 11:28:49.0000000000",1544869729,85,"OK/OK00/OK00Test_main_cli_UTF-8.properties.txt"
"12/15/18 11:30:45.0000000000",1544869845,296,"OK/logs/OK00Test_20181215053045.0413_err.properties.txt"
"12/15/18 11:35:23.0000000000",1544870123,296,"OK/logs/OK00Test_20181215053523.0420_err.properties.txt"

 I want them formatted a la yyyy-mm-dd hh:mm:ss (or dd.mm.yyyy
hh:mm:ss, or whatever other way):

"2018-12-15 12:14:16",1544872456,2542,"OK/OK00/OK00Test.java"
"2018-12-15 11:28:49",1544869729,85,"OK/OK00/OK00Test_main_cli_UTF-8.properties.txt"
"2018-12-15 11:30:45",1544869845,296,"OK/logs/OK00Test_20181215053045.0413_err.properties.txt"
"2018-12-15 11:35:23",1544870123,296,"OK/logs/OK00Test_20181215053523.0420_err.properties.txt"

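 For the date format, one option (a sketch assuming GNU find; the
function name list_with_iso_dates is hypothetical) is to build the
timestamp from the individual %T fields instead of %TD and %TT. %TS
prints the seconds with a fractional part, and a precision of .2 on that
directive keeps only the first two characters:

```shell
# list_with_iso_dates DIR  (hypothetical helper, for illustration only)
# Prints "yyyy-mm-dd hh:mm:ss",epoch,size,"relative path" for each regular file.
list_with_iso_dates() {
  local sdir=$1
  # %TY-%Tm-%Td -> year-month-day; %TH:%TM -> hour:minute;
  # %.2TS -> seconds truncated to two characters (drops the fraction).
  find "${sdir}" -type f \
    -printf '"%TY-%Tm-%Td %TH:%TM:%.2TS",%Ts,%s,"%P"\n'
}
```

 The same format string can replace the one in the find command of the
script; the epoch in the second field stays available for sorting.
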
# extensions (the dot is escaped so that it matches a literal '.')
_X="\.\(java\|txt\)"

# search directory
_SDIR="/home/$(whoami)/java"

# search string
_W="java.io.UnsupportedEncodingException;"

# start time
_TM_START=$(date +%s);
_DT=$(date +%Y%m%d%H%M%S)

# log file
_LOG_FL="find_${_DT}.log"
echo "// __ \$_LOG_FL: |${_LOG_FL}|"

# let mktemp create the file under its final name (no stray .XXXXXX file left behind)
_TMPFL=$(mktemp "${_LOG_FL%.*}_XXXXXX")
echo "// __ \$_TMPFL: |${_TMPFL}|"

find "${_SDIR}" -type f -iregex ".*${_X}" -printf '"%TD %TT",%Ts,%s,"%P"\n' > "${_TMPFL}" 2>&1

ls -l "${_TMPFL}"
wc -l "${_TMPFL}"

_TMPFL02=$(mktemp "${_LOG_FL%.*}_02.XXXXXX")
echo "// __ \$_TMPFL02: |${_TMPFL02}|"

_LNS=$(wc -l < "${_TMPFL}")
echo "// __ \$_LNS: |${_LNS}|"

_FND_CNT=0
_IX=0

while read -r _L; do
 _PTH=$(echo "${_L}" | awk  -F '"' '{print $4}')

# echo "// __ [$_IX/$_LNS): |${_L}|${_PTH}|"

 _IFL="${_SDIR}/${_PTH}"

 if [ -s "${_IFL}" ]; then

# -F: treat the search string literally (it contains dots)
  _FND_W=$(grep -F -- "${_W}" "${_IFL}")

#  echo "// __ [$_IX/$_LNS): |${_L}|${_PTH}|${_FND_W}|"

# not empty string
  if [[ -n ${_FND_W} ]]; then
#   echo "// __ \$_IFL: |${_IFL}|"
   _FND_CNT=$(( _FND_CNT+1 ))

   echo "${_L}" >> "${_TMPFL02}"
  fi

 else
  echo "// __ File missing or empty! \$_IFL: |$_IFL|"
 fi

 _IX=$(( _IX+1 ))
done < "${_TMPFL}"

rm -fv "${_TMPFL}"

#
_TM_END=$(date +%s);
_TM_DIFF=$((_TM_END - _TM_START))

echo "// __ |${_W}| found in |${_FND_CNT}| of |${_LNS}| files in ${_TM_DIFF} seconds"

ls -l "${_TMPFL02}"
wc -l "${_TMPFL02}"

# sort the csv lines from last/most recently modified down (reverse),
# keyed on the epoch in field 2 so it does not depend on the date format
sort -t ',' -k 2,2nr "${_TMPFL02}" > "${_LOG_FL}"

rm -fv "${_TMPFL02}"

ls -l "${_LOG_FL}"
wc -l "${_LOG_FL}"

