Re: [OT] splitting files based on keyword



On Tue, Jun 21, 2005 at 09:57:07PM -0400, kamaraju kusumanchi wrote:
> I have a large text file (humongous.txt) with the following structure

One more Perl script to add to the mix, below. It's perhaps
paranoid and verbose in its error reporting, but I hope it
helps you to write your own scripts. You run it as

scriptname.pl humongous.txt

Let me know if it's unclear.

#!/usr/bin/perl
use strict;
use warnings;

my @thisFileLines = ();
my $filename = '';
my $thisFilename = '';
my $suffix = '.txt';	#suffix to append to filenames.
while(<>) {
	if( /^date\s+(\d+-\d+-\d+)$/ ) {
		$thisFilename = $1;
		if( @thisFileLines ) {	# are lines from another file waiting to be written?
			if( $filename eq '' ) {
				# These lines came before the first 'date' line,
				# so there is no filename to save them under.
				print STDERR "Discarding lines before the first 'date' line\n";
			}
			elsif( !saveFile( $filename, @thisFileLines ) ) {
				print STDERR "Failed to save $filename\n";
			}
		}
		# Start a new buffer whether or not the save succeeded, so that
		# a failed save cannot leak old lines into the next file.
		@thisFileLines = ($_);
		$filename = $thisFilename . $suffix;
	}
	else {	# Not a line beginning with 'date'
		push @thisFileLines, $_;
	}
}
# Handle any leftover file to be saved -- the last one, which
# isn't followed by a '^date' line. Skip it if the input was
# empty or contained no 'date' line at all.
if( @thisFileLines && $filename ne '' ) {
	saveFile( $filename, @thisFileLines )
		or print STDERR "Failed to save $filename\n";
}

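sub saveFile {
	my $filename = shift;
	# Three-argument open with a lexical filehandle, so that odd
	# characters in the filename can't change the open mode.
	my $out;
	if( !open( $out, '>', $filename ) ) {
		print STDERR "Could not open $filename for writing: $!\n";
		return 0;
	}
	# Both print and close can fail (a full disk, for example),
	# so check both before reporting success.
	my $printed = print {$out} @_;
	my $closed = close $out;
	return ( $printed && $closed ) ? 1 : 0;
}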
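
Since the file is large, a variant that never holds a whole
section in memory may also be useful. What follows is only a
minimal sketch, not a replacement for the script above. It
assumes the same 'date NNNN-NN-NN' marker lines. The splitByDate
helper, its $dir argument, and the choice to skip anything
before the first marker are my own inventions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy each 'date' section read from $in
# straight into "$dir/<date>$suffix", one line at a time.
# Returns the number of output files successfully opened.
sub splitByDate {
	my ( $in, $dir, $suffix ) = @_;
	my ( $out, $filename, $count ) = ( undef, '', 0 );
	while( my $line = <$in> ) {
		if( $line =~ /^date\s+(\d+-\d+-\d+)$/ ) {
			# Close the previous section before starting the next.
			if( $out && !close $out ) {
				print STDERR "Failed to save $filename: $!\n";
			}
			$filename = "$dir/$1$suffix";
			if( open( $out, '>', $filename ) ) {
				$count++;
			}
			else {
				print STDERR "Could not open $filename for writing: $!\n";
				undef $out;
			}
		}
		# Assumption: lines before the first 'date' line, or after
		# a failed open, are skipped rather than saved.
		if( $out ) {
			print {$out} $line
				or print STDERR "Write to $filename failed: $!\n";
		}
	}
	if( $out && !close $out ) {
		print STDERR "Failed to save $filename: $!\n";
	}
	return $count;
}

# Run only when invoked directly with a filename argument.
if( !caller && @ARGV ) {
	open( my $in, '<', $ARGV[0] )
		or die "Could not open $ARGV[0] for reading: $!\n";
	splitByDate( $in, '.', '.txt' );
}
```

You would run it the same way (scriptname.pl humongous.txt),
and the output files land in the current directory. The
trade-off is that a file is created as soon as its 'date' line
is seen, so a failure partway through leaves a partial file
behind.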

-- 
Stephen R. Laniel
steve@laniels.org
+(617) 308-5571
http://laniels.org/
PGP key: http://laniels.org/slaniel.key
