Re: quick scripting question - finding occurrence in many lines
On Wed, Nov 29, 2006 at 02:32:37PM +0000, michael wrote:
> I guess a complete rephrase is best.
>
> What I want is "how many processors does each WAITING job in lsf queues
> require?". From 'bhist' I get outputs such as below (see whitespace
> anywhere in "num Processors") and cannot determine a sure way of always
> parsing it...
In the brute-force perl solution previously shown, just add whitespace
to the character class, [\s\n-], which is inserted between every target
character in the regular expression. (\s already matches newlines, so
the \n is redundant but harmless.) This would be similar in awk, sed,
grep, or any other tool using regular expressions.
#!/usr/bin/perl -w
use strict;
my $source = join '', <>;  # get all the data into a string
my $t = '[\s\n-]';         # character class allowed between each character
# print the number in front of every "Processor", however it is split up
print "$1\n" while
    $source =~ m/(\d+)\s+P$t*r$t*o$t*c$t*e$t*s$t*s$t*o$t*r/msg;
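
Typing $t* between every letter by hand is error-prone, so here is a sketch that builds the same kind of pattern from the target word. The helper name spread_pattern and the sample string are made up for illustration; the character class is the one used above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Filler tolerated between letters: whitespace, newlines, or a hyphen.
my $gap = '[\s\n-]*';

# Hypothetical helper: turn a word into a regex that matches it even when
# filler characters have been inserted between any two letters.
sub spread_pattern {
    my ($word) = @_;
    my $body = join $gap, map { quotemeta($_) } split //, $word;
    return qr/$body/;
}

my $re = spread_pattern('Processor');

# Made-up sample, wrapped in the middle of the word as in the bhist output.
my $sample = "Output File <%J.out>, 128\n  Processo\n  rs Requested";
print "$1\n" while $sample =~ /(\d+)\s+$re/g;   # prints 128
```

This is the same brute-force idea, only generated instead of typed, so changing the target word or the filler class is a one-line change.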
Other schemes previously shown would probably work with trivial changes,
e.g., using tr to delete (-d) or squeeze (-s) runs of spaces or newlines.
Unless this is a one-off task (which it seems like it isn't), I'd
suggest looking into fixing whatever is generating the screwed-up output
in the first place. Failing that, use tr, sed, python, perl, ruby, BASIC,
or whatever to filter the output into something more sensible, i.e.,
normalize it first and then parse it, rather than trying to do both in
one step.
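
A minimal sketch of that two-step approach in perl, assuming the text after the count always reads "Processors Requested" once the wrapping is undone (as in the quoted example). The sub names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Step 1: normalize. Delete every whitespace character so that line
# wrapping can no longer split the number or the words after it.
sub normalize {
    my ($text) = @_;
    $text =~ s/\s+//g;
    return $text;
}

# Step 2: match the now-contiguous text. Returns every count found.
sub processors_requested {
    my ($text) = @_;
    my $flat = normalize($text);
    my @counts = $flat =~ /(\d+)ProcessorsRequested/g;
    return @counts;
}

# Sample taken from the quoted bhist output in this thread.
my $sample = <<'END';
HOME/scratch/3d_newgc>, Output File <%J.out>, 128
Processo
rs Requested, Dependency Condition <ended(10417)>;
END

print "$_\n" for processors_requested($sample);   # prints 128
```

Note the trade-off: removing all whitespace also glues the count to whatever precedes it, so this relies on the count following a comma or other non-digit, as it does in the example. It also leaves hyphens alone, unlike the character class above.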
Ken
>
> Thanks, Michael
>
> EXAMPLES:
>
>
>
> ~/bin$ bhist -l 10418;bhist -l 10587;bhist -l 10601
>
> Job <10418>, Job Name <3d>, User <mbexddg5>, Project <default>, Command
> <#BSUB
> -n 128;#BSUB -W 6:00;#BSUB -J 3d;#BSUB -o %
> J.out;#BSUB -w
> 'ended(10417)';./cont>
> Tue Nov 28 21:35:48: Submitted from host <horace3>, to Queue <parallel>,
> CWD <$
> HOME/scratch/3d_newgc>, Output File <%J.out>, 128
> Processo
> rs Requested, Dependency Condition <ended(10417)>;
>
> RUNLIMIT
> ...
--
Ken Irving, fnkci@uaf.edu, 907-474-6152
Water and Environmental Research Center
Institute of Northern Engineering
University of Alaska, Fairbanks