
Re: help parsing output



On Tue, Aug 03, 2010 at 12:12:26PM -0700, Dino Vliet wrote:
>  Dear debian people,
> 
>  Can you help me with this task I have? I have a lot of files in a subdirectory
>  containing the following text:

You should use awk.

- cut -

>  I need to parse this file to get in a csv file the following information:
> 
>  Correctly Classified Instances, Kappa statistic, Total Number of Instances,
>  Precision {1}, Recall {1}, F-Measure {1},Precision {2}, Recall {2}, F-Measure
>  {2},Precision {3}, Recall {3}, F-Measure {3},a,b,c,a,b,c,a,b,c
>  56.6808, 0.2443, 5324760, 0.681,0.618,0.648,0.617,0.519,0.564,
>  0.056,0.296,0.094,1784321,684983,416649,787342,1190428,314537,49255,53877,43368
> 
>  Does anyone have an idea how this could be accomplished?
>  I'm not that great at programming, so writing a Ruby or shell script to do this
>  would take me weeks :-(

A starting point in Awk for processing a single file would be:

BEGIN {
  n_equals = 0;
}

n_equals == 0 && /Correctly Classified/ {
  CCI = $(NF - 2);
}

n_equals == 0 && /Incorrectly Classified/ {
  ICI = $(NF - 2);
}

n_equals == 0 && /Kappa statistic/ {
  KS = $NF
}

…

/ ===/ { n_equals = n_equals + 1 }

n_equals == 1 && /TP Rate/ {
  next;
}

# More complicated processing

END {
  printf "%d,", CCI
  printf "%d,", ICI
  printf "%f", KS
  …
}
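
As a rough, runnable sketch of the section-counter idea, here is a small shell wrapper around an Awk program that pulls out the three summary values. The input layout is an assumption on my part, since the file contents were cut from this thread; the two instance counts in the sample are derived from the confusion-matrix numbers quoted above. The name summary_csv is hypothetical. Note that it takes $(NF - 1), on the assumption that the percentage (56.6808) is the wanted value, and prints with %s so nothing gets truncated.

```shell
#!/bin/sh
# summary_csv is a hypothetical helper name, not something from the thread.
# It reads one result file on stdin and prints "percent,kappa,total".
summary_csv() {
  awk '
    BEGIN { n_equals = 0 }
    # Every "=== ... ===" header bumps the section counter.
    / ===/ { n_equals++ }
    # Summary lines only count before the first header.
    n_equals == 0 && /^Correctly Classified/     { pct   = $(NF - 1) }
    n_equals == 0 && /Kappa statistic/           { ks    = $NF }
    n_equals == 0 && /Total Number of Instances/ { total = $NF }
    # %s keeps the numbers exactly as they appear in the file.
    END { printf "%s,%s,%s\n", pct, ks, total }
  '
}

# Assumed input layout; the real file contents were cut from this thread.
sample_text='Correctly Classified Instances     3018117     56.6808 %
Incorrectly Classified Instances   2306643     43.3192 %
Kappa statistic                    0.2443
Total Number of Instances          5324760

=== Detailed Accuracy By Class ==='

printf '%s\n' "$sample_text" | summary_csv   # prints 56.6808,0.2443,5324760
```

If your files are laid out differently, only the patterns and the field positions should need adjusting.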

You ought to read the Awk manual, and then it would be a matter of a
couple of hours of thought at most.
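
Since you have many files, one possible way to handle them in a single Awk run is to reset the state whenever FNR == 1 (the first line of each input file) and print one CSV row per file. This is only a sketch: summaries_to_csv and flush_row are names I made up, the file layout is assumed as before, and the second demo file holds made-up placeholder values.

```shell
#!/bin/sh
# summaries_to_csv is a hypothetical helper: one CSV row per file argument.
summaries_to_csv() {
  awk '
    # Print the row collected for the previous file, if there was one.
    function flush_row() {
      if (seen) printf "%s,%s,%s\n", pct, ks, total
    }
    # FNR restarts at 1 for each input file, so reset the state here.
    FNR == 1 {
      flush_row()
      n_equals = 0; pct = ks = total = ""; seen = 1
    }
    / ===/ { n_equals++ }
    n_equals == 0 && /^Correctly Classified/     { pct   = $(NF - 1) }
    n_equals == 0 && /Kappa statistic/           { ks    = $NF }
    n_equals == 0 && /Total Number of Instances/ { total = $NF }
    # The last file has no following FNR == 1, so flush it at the end.
    END { flush_row() }
  ' "$@"
}

# Demo on two temporary files (assumed layout, second file is placeholder data).
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cat > "$dir/run1.txt" <<'EOF'
Correctly Classified Instances     3018117     56.6808 %
Kappa statistic                    0.2443
Total Number of Instances          5324760
=== Detailed Accuracy By Class ===
EOF
cat > "$dir/run2.txt" <<'EOF'
Correctly Classified Instances     50     50 %
Kappa statistic                    0
Total Number of Instances          100
EOF

summaries_to_csv "$dir"/*.txt
# prints:
# 56.6808,0.2443,5324760
# 50,0,100
```

Without the reset, the header line at the end of the first file would leave n_equals at 1 and the second file would yield an empty row.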

HTH.

Kumar
-- 
"Even more amazing was the realization that God has Internet access.  I
wonder if He has a full newsfeed?"
(By Matt Welsh)

