Re: FOSS tool to do general stats from text indata

To: debian-user@lists.debian.org
Subject: Re: FOSS tool to do general stats from text indata
From: dvalin@internode.on.net
Date: Wed, 28 Jun 2023 22:05:46 +0930
Message-id: <75f5948136610850f520a91f5170608866efa4b8@webmail.internode.on.net>
Reply-to: dvalin@internode.on.net

Emanuel Berg wrote:

> Is there a CLI and FOSS tool that creates stats from text
> indata - e.g.,
>
> $ txt2stats path/to/indata/*.txt
>
> I mean a general tool, but with options to tweak the report
> included, of course.

As "stats" is a grab bag larger on the inside than the Tardis, I suspect
that only on that other ship, the one with the infinite improbability
drive, is a stats babelfish interpreter to be found.

For the last 30+ years, I've just thrown together a few lines of Awk to
generate the initially required stats, then tweaked the C-like code and
regexes to add the inevitable nice-to-haves. Some result is immediate,
and dissatisfaction with its completeness drives the
tweak/temporary-satisfaction cycle. Options are limitless, as is needed
for an undefined task.

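There is no need for looping code, since Awk reads the input line by
line itself; a list of

/pattern/ { action }

statements is sufficient, plus two optional special patterns:

BEGIN { action }   # Runs first, before any input is read.

END { action }     # Runs last; this is where you postprocess and print.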
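
Put together, that layout might be sketched as below. This is only an
illustration under a few assumptions: the function name line_stats, the
variables hits and lines, and the sample input are all made up here,
and only POSIX awk features are used.

```shell
# line_stats: hypothetical helper wrapping a three-rule awk program.
line_stats() {
    awk '
    BEGIN      { hits = 0 }   # runs once, before any input is read
    /elephant/ { hits++ }     # runs for each line matching the pattern
               { lines++ }    # no pattern: runs for every line
    END        { print lines " lines, " hits " matching /elephant/" }
    ' "$@"
}

# Made-up sample input, fed on stdin; file names could be passed instead.
printf '%s\n' 'an elephant walks' 'a zebra runs' 'elephant and zebra' | line_stats
```

On that sample input it prints "3 lines, 2 matching /elephant/".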

Awk's associative arrays take string subscripts, so

/elephant/ { animals["elephant"]++ }

accumulates that stat. (Note the quotes: a bare elephant would be an
uninitialised variable, i.e. an empty subscript.) If you have prefilled
the array animal_list with the names of all animals of interest as its
subscripts, then in an action,

for (i = 1; i <= NF; i++)   # Iterate over the line's fields.

    if ($i in animal_list) animals[$i]++

should accumulate a frequency histogram of 'em all.

Job done. In essentially one line of script.
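
For a complete version of the histogram idea, here is one possible
sketch. The function name animal_histogram, the three animal names and
the helper array names are assumptions for illustration; prefilling via
split() in BEGIN is just one way to load animal_list.

```shell
# animal_histogram: hypothetical helper counting listed words in the input.
animal_histogram() {
    awk '
    BEGIN {
        # split() fills names[1..n]; copy the names into animal_list as
        # subscripts so that the "in" operator can test membership.
        n = split("elephant zebra lion", names, " ")
        for (i = 1; i <= n; i++) animal_list[names[i]] = 1
    }
    {
        for (i = 1; i <= NF; i++)    # iterate over the fields of the line
            if ($i in animal_list) animals[$i]++
    }
    END {
        # Walk names[] rather than "for (k in animals)": the order is then
        # fixed, and animals never seen are reported with a count of 0.
        for (i = 1; i <= n; i++) print names[i], animals[names[i]] + 0
    }
    ' "$@"
}

# Made-up sample input; "gnu" is not in the list and is ignored.
printf '%s\n' 'elephant zebra elephant' 'gnu zebra' | animal_histogram
```

On that input it prints "elephant 2", "zebra 2" and "lion 0", one per
line.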

A quick search for "GAWK: Effective AWK Programming" should snarf more
know-how than most folk desire. And if you'd like to run it as a daemon,
crunching data coming from a coprocess or a network socket, there's
gawkinet ("TCP/IP Internetworking With gawk").

It does not seem worthwhile to wade into a swamp after alligators, shod
only in ill-fitting boots made for someone else. Go for a pair with
steel toecaps and the Swiss army knife in the heel.

Good luck.

Erik

P.S. It's 15 years since I did this stuff for money, so it's worth
checking the syntax of the old wetware dredgings, above.
