
Re: filtering of text files



Hi Richard,

On Thu, May 06, 1999 at 12:12:37PM +0000, Richard Harran wrote:
> I need to filter some text files to convert all the letters to the same
> case, and to remove punctuation.  I guess I should use 'sed', but the
> man page is disfunct, and I'm struggling with the info.  Could someone
> give me a hint (particularly for the case change thing).

You may want to consider one of the awk dialects ({n,m,g}awk).
awk has two functions, tolower(<argument>) and toupper(<argument>),
for case changes.
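
As a quick illustration of those two functions, here is a minimal sketch.
The function name both_cases is made up for this example; only tolower()
and toupper() come from awk itself. It prints each input line twice, once
in lowercase and once in uppercase:

```shell
# both_cases is a hypothetical helper name, not a standard command.
# It reads lines on stdin and prints each one lowercased, then uppercased.
both_cases() {
    awk '{ print tolower($0); print toupper($0) }'
}

# Example call: feed one line through the helper.
printf 'Mixed Case\n' | both_cases
```

Very old awk implementations may lack these functions; nawk, mawk and gawk
should all have them.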

To convert everything to lowercase and remove the punctuation, one
could use the following awk command and pipe the result to sed
(in awk, $0 contains the entire current line of text):

awk '{ print tolower($0) }' inputfile | sed -e 's/[.,:;]//g' > outputfile

The sed command replaces all occurrences of any character listed in the
brackets with nothing. The set above covers only . , : and ; so you
may need to add more punctuation characters to it.
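
If listing every punctuation character by hand gets tedious, a different
approach (tr instead of awk and sed) is to use the POSIX character classes.
This is only a sketch: the function name clean_text is invented for the
example, and it assumes the input is plain ASCII text, where [:punct:]
covers the usual punctuation marks:

```shell
# clean_text is a hypothetical helper name, not a standard command.
# Step 1: map every uppercase letter to its lowercase counterpart.
# Step 2: delete every character in the [:punct:] class.
clean_text() {
    tr '[:upper:]' '[:lower:]' | tr -d '[:punct:]'
}

# Example call on a single line of text.
printf 'Hello, World! It is: fine.\n' | clean_text
```

For whole files this would be used as: clean_text < inputfile > outputfile.
Note that [:punct:] also removes apostrophes and hyphens, which may or may
not be what you want.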

So long -- Stephan
-- 
Stephan Engelke                                    engelke@math.uni-hamburg.de
                        *** Life is not fair. But the root password helps. ***

