Re: Please help parsing file [sed, awk, fortran, bash]

To: debian-user@lists.debian.org
Subject: Re: Please help parsing file [sed, awk, fortran, bash]
From: Gaël DONVAL <gael.donval@cnrs-imn.fr>
Date: Fri, 31 Aug 2012 15:29:13 +0200
Message-id: <1346419753.4533.20.camel@p76-nom-gd.cnrs-imn.fr>
In-reply-to: <20120831084656.GC17667@debian>
References: <CAKUhbgH=Wrfmf1xFti3Gmd8j1zaGOYrch+cWgD-M3EYwgR92Cw@mail.gmail.com> <504006C1.4020608@cloud85.net> <20120831013129.GC3207@gospelbroadcasting.org> <k1p6t6$6cn$1@ger.gmane.org> <20120831084656.GC17667@debian>

On Friday, 31 August 2012 at 09:46 +0100, Jon Dowland wrote:
> On Fri, Aug 31, 2012 at 02:18:15AM +0000, Mark Blakeney wrote:
> > On Fri, 31 Aug 2012 01:31:29 +0000, Russell L. Harris wrote:
> > > This exercise provides the impetus to learn to use a very useful tool,
> > > namely Perl.
> > 
> > I would suggest python is a much better choice to a young person
> > just starting out.
> 
> Seconded. I wrote some Perl yesterday, for the first time in a while.
> I didn't miss it.

I second that too.

One possibility (in Python) would have been:
data = [0] * 1024  # one slot per index, zero by default
with open("your_file") as infile:
    infile.readline()  # skip the two header lines
    infile.readline()
    for line in infile:
        sline = line.split()
        # a line with no second column counts as 1
        data[int(sline[0])] = int(sline[1]) if len(sline) > 1 else 1

If your input file were more like:
2883
452
0  7
1  6
2  1
4  1
6  1
10  7
Then:

import numpy as np
data = np.zeros(1024, dtype=int)
# skip the two header lines; each remaining row is "index count"
infile = np.genfromtxt("your_file", skip_header=2, dtype=int)
data[infile[:, 0]] = infile[:, 1]

Then just access "data" directly using the index:
>>> print(data[10])
7

The second example also gives you a NumPy array, so it is ready for
calculations.

I guess a function already exists out there that correctly handles
missing data in a space-separated format, which would allow a 4-line
parser like the one given above for your "compressed" data.
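
Failing that, such a helper is short to write by hand. A minimal sketch,
assuming (as in the first example) two header lines and that a line with
no second column means a count of 1. The name parse_counts, its
parameters and the sample data are made up for illustration; this is not
an existing library function:

```python
import io


def parse_counts(lines, size=1024, header_lines=2, default=1):
    """Fill a list of `size` zeros from 'index [count]' lines.

    A line holding only an index gets `default` as its count.
    """
    data = [0] * size
    for lineno, line in enumerate(lines):
        fields = line.split()
        if lineno < header_lines or not fields:
            continue  # skip the header lines and any blank line
        data[int(fields[0])] = int(fields[1]) if len(fields) > 1 else default
    return data


# Hypothetical sample in the "compressed" layout: two header lines, then data.
sample = io.StringIO("2883\n452\n0  7\n1  6\n2\n4\n6\n10  7\n")
data = parse_counts(sample)
print(data[10], data[2], data[3])
```

Any iterable of lines works, so an open file can be passed in place of
the StringIO object.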

Actually, if you really, really want to compress your data, just drop
the whole ASCII thing. Either use a known binary format (I use HDF5, for
instance) and/or compress your data with a compression program such as
zip/xz/7zip... (or use the compression built into HDF5).
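
To give an idea of the binary-plus-compression route without any extra
dependency, here is a rough sketch using only the standard library
(array for fixed-width binary, lzma for xz compression). This is a
stand-in for illustration, not HDF5; the counts are made up, and the raw
bytes depend on the platform's int size and byte order:

```python
import lzma
from array import array

# Hypothetical data: 1024 counts, mostly zeros.
counts = array("i", [0] * 1024)
counts[0], counts[1], counts[10] = 7, 6, 7

raw = counts.tobytes()        # fixed-width binary, no ASCII
packed = lzma.compress(raw)   # xz-compressed bytes

# Round trip: decompress and rebuild the array.
restored = array("i")
restored.frombytes(lzma.decompress(packed))
print(len(raw), len(packed), restored == counts)
```

For data that has to move between machines, a self-describing format
such as HDF5 is the safer choice, since it records the type and layout
along with the values.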
