On Sat, Jun 20, 2020 at 12:20:41PM +0200, Albretch Mueller wrote:
>  _X=".\(html\|txt\)"
>  _SDIR="$(pwd)"
>
>  _AR_TERMS=(
>   Kant
>   "Gilbert Ryle"
>   Hegel
>  )
>
>  for iZ in ${!_AR_TERMS[@]}; do
>   find "${_SDIR}" -type f -iregex .*"${_X}" -exec grep -il "${_AR_TERMS[$iZ]}" {} \;
>  done # iZ: terms search/grep'ped inside text files; echo "~";
>
>  # this would be much faster
>
>  find "${_SDIR}" -type f -iregex .*"${_X}" -exec grep -il "Kant\|Gilbert Ryle\|Hegel" {} \;
>
>  but how do I know which match happened in order to save it into separate files?

Hm. The first approach goes through your files three times, once for each
term. The second goes through them once, with a combined regular expression,
so no wonder the second approach is faster.

But to actually attack the problem you should be aware that the second
method is doing *something different* from the first one: "grep -l" stops
at the first hit, so even if you could ask grep which one of the
alternatives it found, it would miss Hegel in a file where Kant appears
first. Is that what you want?

Once you have answered that question, you'll be able to proceed. One
possibility is to postprocess your output: grep prints each matching line,
and you can match that against the individual terms; you'd have to drop
the "-l" for that, making things somewhat slower. Another possibility is
to keep the "-l" and to re-grep the files it finds against each of the
individual patterns.

Cheers
-- t
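[Editor's sketch of the "re-grep" approach suggested above: one find pass
with the combined pattern, then one "grep -l" per individual term over
only the files that survived. The function name search_terms and the
results_*.txt output files are my own invention for illustration, not
from the original post.]

```shell
#!/usr/bin/env bash
# Re-grep sketch: pass 1 narrows the file set with the fast combined
# pattern; pass 2 greps only those files, once per term, and writes
# each term's matching file names to its own list (results_*.txt --
# hypothetical naming, adjust to taste).
search_terms() {
  local _SDIR="$1"; shift
  local _AR_TERMS=( "$@" )
  local _X=".\(html\|txt\)"

  # Build the combined alternation, e.g.  Kant\|Gilbert Ryle\|Hegel
  local _COMBINED
  _COMBINED="$(printf '%s\\|' "${_AR_TERMS[@]}")"
  _COMBINED="${_COMBINED%\\|}"          # drop the trailing \|

  # Pass 1: one walk through the tree; keep files matching ANY term.
  local _HITS=()
  mapfile -t _HITS < <(
    find "${_SDIR}" -type f -iregex ".*${_X}" \
         -exec grep -il "${_COMBINED}" {} +
  )
  [ "${#_HITS[@]}" -gt 0 ] || return 0

  # Pass 2: re-grep only the hit files, once per individual term.
  local _TERM _OUT
  for _TERM in "${_AR_TERMS[@]}"; do
    _OUT="results_${_TERM// /_}.txt"    # spaces -> underscores
    grep -il "${_TERM}" "${_HITS[@]}" > "${_OUT}"
  done
}
```

Because pass 2 runs against the (presumably small) hit list rather than
the whole tree, a file containing both Kant and Hegel ends up in both
result lists, which the single combined grep could not tell you.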