
Re: quick scripting question - finding occurrence in many lines



On Wed, 2006-11-29 at 09:36 -0900, Ken Irving wrote:
> On Wed, Nov 29, 2006 at 02:32:37PM +0000, michael wrote:
> > I guess a complete rephrase is best. 
> > 
> > What I want is "how many processors does each WAITING job in lsf queues
> > require?". From 'bhist' I get outputs such as below (see whitespace
> > anywhere in "num Processors") and cannot determine a sure way of always
> > parsing it...
> 
> In the brute force perl solution previously shown, just add whitespace
> to the character class, [\s\n-], which is inserted between every target
> character in the regular expression.  This would be similar in awk, sed, 
> grep, or other tool using regular expressions.
> 
>     #!/usr/bin/perl -w
>     use strict;   
>     my $source = join '', <>;  # get all the data into a string
>     my $t = '[\s\n-]';         # define a regexp character class
>     print "$1\n" while         #   to be between each character
>         $source =~ m/(\d+)\s+P$t*r$t*o$t*c$t*e$t*s$t*s$t*o$t*r/msg;
> 
> Other schemes previously shown would probably work with trivial changes,
> e.g., using tr to delete (-d) or squeeze (-s) runs of spaces or newlines,
> etc.
> 
> Unless this is a one-off task (which it seems like it isn't), I'd
> suggest looking into fixing whatever is generating the screwed-up output
> in the first place. Failing that, use tr/sed/python/perl/ruby/BASIC
> whatever to filter the output to something more sensible, i.e., normalize
> it, and don't try to do it in one step.
> 
> Ken

Getting rid of all whitespace brings its own problems: 'bla 12
Processors' becomes a single string with no whitespace at all.
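
To illustrate the problem, here is a minimal sketch. The sample text is made up for the demonstration and is not real bhist output:

```shell
# Made-up sample: the word "Processors" is split across a wrapped line.
sample='bla 12 Proce
   ssors Requested'

# Deleting every space and newline also deletes the separators around
# the number, so the count can no longer be picked out as its own word.
all_gone=$(printf '%s\n' "$sample" | tr -d ' \n')
echo "$all_gone"
```

The result is one unbroken token, which is why only the wrap (newline plus indent) should be removed, not the spaces between words.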

It is a sort of one-off, but I can't fix the LSF queuing system. However,
here's a fix that works, given that I know how much leading whitespace
there is on the second and later lines:

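function getWAITinfo() {
  # Strip the continuation-line indent, join all lines, then put one word
  # per line so that grep -B1 also returns the word before "Processors".
  echo "$jobNum:" $(bhist -l "$jobNum" | sed 's/                     //g' | sed 's/ ^//' \
    | tr -d '\n' | tr ' ' '\n' | grep -B1 Processors | tr '\n' ' ')
}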

where `bhist -l` is what generates the info to be parsed.
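
The same idea can be tried without an LSF installation. The following is only a sketch under stated assumptions: the sample text is invented, the 21-space continuation indent is taken from the sed command above, `procs_before_keyword` is a hypothetical helper name, and awk stands in for `grep -B1` so that only the number is printed.

```shell
# Assumed continuation-line indent: 21 spaces.
indent=$(printf '%21s' '')

# Invented text imitating wrapped `bhist -l` output (not real LSF output).
sample="Job <101>, User <michael>, Status <PEND>, 4 Proce
${indent}ssors Requested, Command <run.sh>"

# Hypothetical helper: strip the indent at the start of each line, join the
# lines, split into one word per line, and print the word that comes
# immediately before "Processors".
procs_before_keyword() {
  sed "s/^${indent}//" | tr -d '\n' | tr ' ' '\n' |
    awk '/^Processors/ { print prev } { prev = $0 }'
}

count=$(printf '%s\n' "$sample" | procs_before_keyword)
echo "job 101 needs $count processors"
```

Anchoring the sed pattern at the start of the line is a small deviation from the function above; it avoids touching runs of spaces elsewhere in a line.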

Thanks to all.

Michael


