
Re: codesearch across lines



On 11/12/2017 08:23 AM, Curt wrote:
On 2017-11-12, The Wanderer <wanderer@fastmail.fm> wrote:

(?m)(\W|^)panda.*str(\W|$)

That would be expected to find only documents containing 'panda'
followed by 'str'. To also find ones which contain 'str' followed by
'pandas' (and add the missing 's' back in), you'd probably want:

(?m)(\W|^)(pandas.*str|str.*pandas)(\W|$)

I have not tested this, but I use similar '(a.*b|b.*a)' regexes on a
semi-regular basis for searching one of my own text archives.
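As a rough illustration of the '(a.*b|b.*a)' idea, here is a minimal sketch using Python's `re` module. It assumes the search engine's regex dialect treats `(?m)`, `\W`, `^` and `$` the same way Python does, which has not been verified here. The sample lines and the names `PATTERN`, `SAMPLE_LINES` and `matched` are made up for the example.

```python
import re

# The order-independent pattern from the message above: 'pandas' followed
# by 'str', or 'str' followed by 'pandas', on the same line.
PATTERN = re.compile(r"(?m)(\W|^)(pandas.*str|str.*pandas)(\W|$)")

# Made-up sample lines, not taken from any real search result.
SAMPLE_LINES = [
    "import pandas; x = str(y)",  # 'pandas' before 'str'
    "str(df) # uses pandas",      # 'str' before 'pandas'
    "import pandas",              # 'pandas' only
    "x = str(y)",                 # 'str' only
]

# Keep the lines where the pattern matches somewhere in the line.
matched = [line for line in SAMPLE_LINES if PATTERN.search(line)]

for line in matched:
    print(line)
```

Only the first two sample lines contain both terms, so only those are kept. Note that '.' does not cross a newline here, so both terms have to sit on the same line.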

I tried that, actually, following the same logic, or thought I did
(there might have been a typo somewhere), yet it produced *fewer* results,
but trying the formula again now it seems to "work" (although the regex
is pretty useless because it matches reams and reams of stuff because
'anda' 'panda' 'pandas' 'expandable' 'str' 'struct' 'instruction' 'castration',
etc. are all matched).

This produces two hits (from the same file) only:

 (?m)(\W|^)\bpanda\b.*\bstr\b|\bstr\b.*\bpanda\b(\W|$)

I'm not certain how you're supposed to construct the formula to only match
the literal strings (if literal is indeed the term) "panda" and "str".
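One possible way to match only the whole words "panda" and "str" is to group the alternation so that the word boundaries apply to both orderings. In the formula above, '|' binds more loosely than everything else, so the leading '(\W|^)' belongs only to the first alternative and the trailing '(\W|$)' only to the second. The sketch below uses Python's `re` module and assumes the search engine handles `\b` and `(?:...)` the same way; the pattern, the helper `hits` and the sample lines are hypothetical.

```python
import re

# Hypothetical whole-word variant: the non-capturing group (?:...) keeps the
# outer \b anchors attached to both orderings of the two words.
WHOLE_WORDS = re.compile(r"(?m)\b(?:panda\b.*\bstr|str\b.*\bpanda)\b")


def hits(lines):
    """Return the lines containing both 'panda' and 'str' as whole words."""
    return [line for line in lines if WHOLE_WORDS.search(line)]


# Made-up sample lines for illustration only.
SAMPLE = [
    "panda = str(x)",           # whole words, 'panda' first
    "str(panda)",               # whole words, 'str' first
    "expandable struct",        # substrings only, should not match
    "pandas.instruction(str)",  # 'pandas' is not the word 'panda'
]

for line in hits(SAMPLE):
    print(line)
```

With `\b` on both sides of each word, the '\W' groups become unnecessary, and 'expandable', 'pandas', 'struct' and 'instruction' no longer count as hits. Bear in mind that `\b` treats '_' as a word character, so 'panda_str' would not match either.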

I also have no idea what the expected results might look like. To add
insult to injury, I know nothing about regexes either.

;-)


(Also, I'm not sure the '\W' bits are needed, but I don't know the field
of what's-being-searched-for well enough to be certain about why those
may have been added.)




To paraphrase, "A code fragment is worth a thousand bytes of descriptive text" ;/

Post a half dozen lines of code followed by the desired output.
The posted lines should be < 30 characters to prevent confusion caused by line-wrap problems when displayed. The example need not be valid code in the language used -- only character sequences are of interest.



