
Re: one liner, how do you know which match happened ...



On Sat 20 Jun 2020 at 12:20:41 (+0200), Albretch Mueller wrote:
> _X=".\(html\|txt\)"
> _SDIR="$(pwd)"
> 
> _AR_TERMS=(
> Kant
> "Gilbert Ryle"
> Hegel
> )
> 
> for iZ in ${!_AR_TERMS[@]}; do
>  find "${_SDIR}" -type f -iregex ".*${_X}" -exec grep -il \
>    "${_AR_TERMS[$iZ]}" {} \;
> done # iZ: terms search/grep'ped inside text files;  echo "~";
> 
> 
> # this would be much faster
> 
> find "${_SDIR}" -type f -iregex ".*${_X}" -exec grep -il \
>   "Kant\|Gilbert Ryle\|Hegel" {} \;
> 
> but how do I know which match happened in order to save it into separate files?
> 
>  grep doesn't do replacements:
> 
>  https://stackoverflow.com/questions/16197406/grep-regex-replace-specific-find-in-text-file
> 
>  but at least (in my way to understand reality, since it must try such
> searches sequentially) it should give  you the index of the match and
> if grep doesn't do that I am sure some other batch utility would (I
> have never used sed in my code)

This script apparently needs to open up to one output file for each
search string in _AR_TERMS. So why does the Subject line start with
"one liner"? Are you competing in some sort of obfuscation contest?

There's no point in opening these output files on the fly: their
number is bounded only by the number of search strings. Instead,
keep their future contents (the matched filenames) in lists, and
hold those lists in an array indexed by search string. Choose a
language to implement this in (there are plenty) rather than trying
to beat grep into submission.
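
As a rough illustration of the "array of lists, indexed by search
string" idea, here is a minimal sketch in bash (bash 4+ for associative
arrays; GNU find, grep and sort assumed). The names collect_matches,
write_lists and MATCHES, and the ".list" output naming, are made up for
this example. It matches the terms as fixed strings rather than
regexes, and swaps the -iregex test for two -iname tests. It still runs
one grep per term per file; the point here is only where the results
are kept.

```shell
#!/usr/bin/env bash
# Sketch only: keep the matched filenames in memory, keyed by search
# term, and write each list to disk once at the end.

# MATCHES maps a search term to a newline-separated list of filenames.
declare -A MATCHES

# collect_matches DIR TERM...
# Appends every .html/.txt file under DIR that contains TERM
# (case-insensitive, fixed string) to MATCHES[TERM].
collect_matches() {
  local dir=$1; shift
  local term file
  while IFS= read -r -d '' file; do
    for term in "$@"; do
      # -q: quiet, stop at first hit; -i: ignore case; -F: fixed string
      if grep -qiF -e "$term" -- "$file"; then
        MATCHES["$term"]+="$file"$'\n'
      fi
    done
  done < <(find "$dir" -type f \( -iname '*.html' -o -iname '*.txt' \) \
             -print0 | sort -z)
}

# write_lists OUTDIR
# Writes one file per term; spaces in the term become underscores
# (a hypothetical naming scheme, pick your own).
write_lists() {
  local outdir=$1 term
  mkdir -p "$outdir"
  for term in "${!MATCHES[@]}"; do
    printf '%s' "${MATCHES[$term]}" > "$outdir/${term// /_}.list"
  done
}
```

Something like

  collect_matches "$(pwd)" Kant "Gilbert Ryle" Hegel
  write_lists ./out

would then leave one list of filenames per term in ./out, with each
output file opened exactly once.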

One might assume that the sensible solution will involve searching
each file for multiple strings, rather than rescanning all the files
for each search string. If the latter were necessary, it would make
sense to read the files into memory structures (lists/hashes etc),
and search them there. It makes more sense to me to consider these
factors before playing about with fragments of shell code.
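
For what it's worth, the "search each file once for all the strings"
approach can be sketched in bash roughly as follows. This is only an
illustration under assumptions: which_terms is a made-up helper name,
the terms are treated as fixed strings (grep -F), GNU grep is assumed
for -o, and bash 4+ for the ${term,,} lower-casing.

```shell
#!/usr/bin/env bash
# which_terms FILE TERM...
# Prints each TERM (spelled as given) that occurs in FILE, one per
# line, with a single grep pass over FILE.
which_terms() {
  local file=$1; shift
  local -a args=()
  local term hits
  for term in "$@"; do
    args+=(-e "$term")
  done
  # -o: print only the matched text; -i: ignore case; -F: fixed strings.
  # Lower-case and de-duplicate so each term shows up at most once.
  hits=$(grep -oiF "${args[@]}" -- "$file" \
           | tr '[:upper:]' '[:lower:]' | sort -u)
  # Map the lower-cased hits back to the caller's spelling and order.
  for term in "$@"; do
    if grep -qxF -e "${term,,}" <<< "$hits"; then
      printf '%s\n' "$term"
    fi
  done
}
```

Called as

  which_terms some.txt Kant "Gilbert Ryle" Hegel

it prints the subset of terms found in some.txt, which is the "which
match happened" information the combined pattern alone doesn't give
you. One caveat: when one term is a substring of another, grep -o
reports only the longer match at that position, so the shorter term
can go unreported.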

Cheers,
David.

