
Re: All true assertions in a bash find one liner? ...

On Thu, Feb 27, 2020 at 01:40:22PM +0100, Albretch Mueller wrote:
>  I need to find all files which names satisfy a pattern and contain a
> certain string, then from those files I need to printf some metadata,
> a la:
>  find "${_SDIR}" -type f -iregex .*"${_X}" -printf '"%TD
> %TT",%Ts,%s,"%P"\n' > "${_TMPFL}" 2>&1

The quoting is wrong for the regex.  And you probably don't even want
to use a regex.  It would help if we had some clue what's in the _X
variable, and why you're trying to use a regex instead of a standard
glob.

Did you simply want all the files whose names end with the contents
of that variable?  If so:

find "$sdir" -type f -name "*$x" -printf '...'

(or -iname if you want it to be case-insensitive).

The way you've got it quoted now, the .*$x bit will be expanded by
the shell against the contents of the CURRENT directory, which is
absolutely NOT what you want.

The glob (or regex) needs to be quoted so that the shell WON'T
expand it, but find WILL.
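
To make the difference concrete, here is a small sketch.  The scratch
directory, the file names, and the value of x are all made up for the
demonstration (I still don't know what's really in _X), so treat it as
an illustration rather than a drop-in answer.

```shell
# Everything here is hypothetical: a scratch tree with made-up file names.
demo=$(mktemp -d)
mkdir "$demo/tree"
touch "$demo/.notes.txt" "$demo/tree/a.txt" "$demo/tree/b.TXT" "$demo/tree/c.log"
x='.txt'   # stand-in for the unknown contents of _X

# Unquoted .* is a glob: the shell expands it against the current
# directory before find would ever see it.
expanded=$(cd "$demo" && printf '%s\n' .*"$x")

# Quoted glob: the shell leaves it alone, and find matches it against
# the name of every file it visits.
by_name=$(cd "$demo/tree" && find . -type f -name "*$x" | LC_ALL=C sort)

# Same thing, ignoring case.
by_iname=$(cd "$demo/tree" && find . -type f -iname "*$x" | LC_ALL=C sort)

printf 'shell expanded .*$x to: %s\n' "$expanded"
printf -- '-name matched:\n%s\n' "$by_name"
printf -- '-iname matched:\n%s\n' "$by_iname"
rm -rf "$demo"
```

In the first case find would have been handed the literal name
.notes.txt instead of a pattern, which is the failure described above.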

Generally speaking, always use a glob if a glob will do the job.  Don't
use a regex unless it's absolutely necessary.

>  I am trying to do all steps in one go,

So, start by saying what all of the steps ARE.

Earlier, you mentioned "contain a certain string", but that's really

>  I know I am being silly, since on that statement there are in fact
> two searches one on the metadata and one through the content of the
> patterned files, but you can see what I mean.

We can't really "see what you mean" until you show us.  Why don't you
just tell us the actual problem?  It can't be THAT embarrassing.

"I want to find all of the files whose names end with .txt and which
contain the string penis."

find . -type f -name '*.txt' -exec grep -l penis {} +
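
If the goal is to print metadata only for the files that pass both
tests, one possible approach (a sketch, not necessarily what you need)
is to use -exec with \; as a test: it is true only when the command
exits 0, so a following -printf runs only for files where grep found
the string.  This assumes GNU find, since -printf is a GNU extension.
The directory, file names, and search string below are invented for the
example, and the format is a shortened version of yours.

```shell
# Hypothetical scratch data: two .txt files and one .log file.
demo=$(mktemp -d)
printf 'needle here\n' > "$demo/hit.txt"
printf 'nothing\n'     > "$demo/miss.txt"
printf 'needle here\n' > "$demo/hit.log"

# Name test, then content test, then the metadata line.
# grep -q prints nothing; only its exit status matters to find.
result=$(find "$demo" -type f -name '*.txt' \
           -exec grep -q -F needle {} \; \
           -printf '%s,"%P"\n')

printf '%s\n' "$result"
rm -rf "$demo"
```

The trade-off is that \; runs one grep per file, which is slower than
the {} + form, but the + form can't be used as a per-file test.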

> There should be a way to
> do it in one swoop. Or probably there is another utility to do that.
> The thing is that I work on corpora research and very often you need
> to search large amounts of text files in no time.

So wait, there's a *third* part of the problem?  It has to handle
extremely large inputs, and be very fast?

It sounds like you need highly specialized tools that perform
indexing of your content.  Like Xapian or something.  It's not really
my area of expertise, so someone else may be able to suggest a more
suitable tool set.
