Re: Please help parsing file [sed, awk, fortran, bash]
On Friday 31 August 2012 at 09:46 +0100, Jon Dowland wrote:
> On Fri, Aug 31, 2012 at 02:18:15AM +0000, Mark Blakeney wrote:
> > On Fri, 31 Aug 2012 01:31:29 +0000, Russell L. Harris wrote:
> > > This exercise provides the impetus to learn to use a very useful tool,
> > > namely Perl.
> >
> > I would suggest python is a much better choice to a young person
> > just starting out.
>
> Seconded. I wrote some Perl yesterday, for the first time in a while.
> I didn't miss it.
I second that too.
One possibility (in Python) would have been:
data = [0] * 1024
with open("your_file") as infile:
    # Skip the two header lines.
    infile.readline()
    infile.readline()
    for line in infile:
        sline = line.split()
        # A line without a second column gets the value 1.
        data[int(sline[0])] = int(sline[1]) if len(sline) > 1 else 1
If your input file were more like this (every line carrying both columns):
2883
452
0 7
1 6
2 1
4 1
6 1
10 7
Then:
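import numpy as np
data = np.zeros(1024, dtype=int)
# skip_header is the current genfromtxt argument (formerly skiprows).
infile = np.genfromtxt("your_file", skip_header=2, dtype=int)
# Column 0 holds the indices, column 1 the values.
data[infile[:, 0]] = infile[:, 1]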
Then just access "data" directly using the index:
>>> print(data[10])
7
The second example also gives you a NumPy array, so it is ready for
calculations.
I guess a function already exists out there that correctly handles
missing data in space-separated format, which would allow a 4-line
parser like the one given above for your "compressed" data.
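Failing that, a hand-rolled helper stays short. The following is only a
minimal sketch, assuming the "compressed" layout implied by the first
example: two header lines, then an index optionally followed by a value,
with a missing value meaning 1. The names parse_compressed, size,
default and header_lines are made up for illustration.

```python
import io


def parse_compressed(lines, size=1024, default=1, header_lines=2):
    """Return a list of `size` ints filled from 'index [value]' lines.

    Hypothetical helper: the layout (two header lines, a missing value
    meaning `default`) is an assumption taken from the first example.
    """
    data = [0] * size
    for lineno, line in enumerate(lines):
        fields = line.split()
        # Skip the header lines and any blank lines.
        if lineno < header_lines or not fields:
            continue
        data[int(fields[0])] = int(fields[1]) if len(fields) > 1 else default
    return data


# Small in-memory sample standing in for a real file.
sample = io.StringIO("2883\n452\n0 7\n1 6\n2\n4\n6\n10 7\n")
data = parse_compressed(sample)
print(data[10], data[2], data[3])
```

Since the function accepts any iterable of lines, an open file object
can be passed in place of the in-memory sample.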
Actually, if you really, really want to compress the data, just drop the
whole ASCII thing. Use a known binary format (I use HDF5, for instance)
and/or compress your data with a compression program such as
zip/xz/7zip... (or use the compression built into HDF5).
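As a rough illustration of the "drop ASCII and compress" idea, here is a
sketch that uses only the Python standard library rather than HDF5: the
values are packed as raw C ints and written through xz (the lzma
module). The file name data.bin.xz and the choice of the "i" type code
are assumptions made for the example, and the bytes are in the machine's
native byte order, so this is not a portable file format.

```python
import lzma
import os
import tempfile
from array import array

# Example values: 1024 counters, mostly zero.
data = [0] * 1024
data[0], data[1], data[10] = 7, 6, 7

# Hypothetical output path, placed in a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), "data.bin.xz")

# Write: pack as signed C ints and xz-compress on the fly.
with lzma.open(path, "wb") as out:
    out.write(array("i", data).tobytes())

# Read back: decompress and unpack into a list of ints.
restored = array("i")
with lzma.open(path, "rb") as src:
    restored.frombytes(src.read())
restored = restored.tolist()

print(restored == data, os.path.getsize(path), "bytes on disk")
```

A self-describing format such as HDF5 remains the better choice when the
file has to be shared between machines or tools.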