On Wed, Nov 08, 2006 at 02:51:20AM +1100, John O'Hagan wrote:
>
> I tried this, and found that replacing the newlines with spaces stops the grep
> from working because it puts spaces in the middle of any occurrences
> of "Processor", but I see what you mean about the edge case. I think this
> version takes care of it, plus it is hyphen-agnostic:
>
> tr -d '\n' <IN | sed s/P-*r-*o-*c-*e-*s-*s-*o-*r/' Processor'/g |
> tr -s ' ' '\n' | grep -B1 'Processor' | grep -v 'Processor\|--'
>
> removing newlines, replacing all cases of (non-)hyphenated "Processor" with a
> space followed by "Processor", then doing the grep. And here's a Python
> version using the re module to deal with the hyphens ( the edge case takes
> care of itself here):
>
> import re
>
> for i in re.split('P-?r-?o-?c-?e-?s-?s-?o-?r',
>                   open('IN').read().replace('\n', ''))[0:-1]:
>     print i.split()[-1]

Huh, I'm not sure. I played with it a little and here's another
problem:

    here is some testing
    data Processor

will return 'testingdata' because the newlines get stripped out,
leaving no space between the words. So...

First, replace every '-\n' with '' so we dehyphenate any words split
across a newline. Some words that should keep their hyphen will lose
it, but I think that's a pretty rare case, and this leaves mid-line
hyphenated words alone. It also makes the grep easier, since we can
ignore the hyphens in "Processor". Next, replace every '\n' with ' '
so that we avoid the above problem. Then squeeze each run of one or
more ' ' into a single '\n' to split the words onto separate lines,
and finally grep away. (tr only works on single characters, so sed
handles the two-character '-\n'.)

sed -e :a -e '/-$/N; s/-\n//; ta' <IN | tr '\n' ' ' | tr -s ' ' '\n' |
grep -B1 'Processor' | grep -v 'Processor\|--'

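For comparison, here is a rough Python 3 sketch of the same steps. The function name words_before and the sample text are made up for illustration; they are not from the original IN file, and this is only one way to do it.

```python
def words_before(text, target="Processor"):
    """Return the word preceding each word that contains `target`."""
    # Step 1: rejoin words hyphenated across a line break ("Pro-\ncessor").
    text = text.replace("-\n", "")
    # Steps 2 and 3: str.split() treats newlines and runs of spaces alike,
    # so it covers both tr stages of the shell pipeline.
    words = text.split()
    # Step 4: keep the word just before each match, skipping words that
    # themselves contain the target (this mirrors the final grep -v).
    return [
        words[i - 1]
        for i in range(1, len(words))
        if target in words[i] and target not in words[i - 1]
    ]


# Hypothetical sample input, including the 'testing\ndata' case from above.
sample = "here is some testing\ndata Processor\nand a fast Pro-\ncessor too\n"
print(words_before(sample))  # ['data', 'fast']
```

Because split() already separates on any whitespace, the 'testingdata' problem never comes up here.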
>
> Have we done this to death yet? :)
there must be more. I haven't seen any perl junkies provide us with
some permutation of ($*&#^&*%^^@@Processor%^&^$%^%#$&^$%*&^% that
spits the answer right out. ^^^^----- that's not perl code BTW, just
random shifted number-row. but it looks like perl eh? hehe
A