
Re: changing hard returns to soft ones



On Mon, Sep 29, 2003 at 02:09:45PM -0400, Emma Jane Hogbin wrote:

> I'm interested in printing a Gutenberg Project text (it's ok, I'm a
> bookbinder--printing is typical behaviour for me). The problem is the line
> breaks in the .txt files.
> 
> Does anyone know how I could convert single hard returns into a white
> space? It must be some variation of:
> 	
> 	# mac file to Unix file:
> 	tr '\015' '\012' < old.txt > new.txt
> 
> ...but I'm not sure what the octal value (?) is for a hard return.

In a text file, there is no such thing as a "hard return"; on Unix, line
endings are coded with a linefeed character (octal 012). You could translate
all linefeeds to spaces:

   tr '\012' ' ' < old.txt > new.txt

However, this would also convert legitimate paragraph breaks into
spaces, and the result would be one really big line of text. If you knew
that paragraphs are always separated by a blank line and that there were
no instances of double spaces anywhere else, then you could refilter the
text to turn all double spaces back into linefeeds.
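
To see why the blank line matters, here is a small sketch using a made-up
two-paragraph sample (the text and the variable names are only for
illustration). After the translation, the paragraph break survives only as
a run of two spaces:

```shell
# Hypothetical sample: two paragraphs separated by one blank line.
sample=$(printf 'first line\nsecond line\n\nnew paragraph\n')

# Every linefeed becomes a space, so the blank line turns into two spaces
# and the final linefeed turns into a trailing space.
joined=$(printf '%s\n' "$sample" | tr '\012' ' ')

# Brackets make the trailing space visible.
printf '[%s]\n' "$joined"
```

This should print `[first line second line  new paragraph ]`, all on one line.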

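The following might work for you, if you don't care about losing legitimate
extra spaces from the text. It relies on GNU sed, which treats \n in the
replacement as a linefeed, and a line that ends in a space will be mistaken
for a paragraph break: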

  sed 's/  */ /g' < old.txt | tr '\012' ' ' | sed 's/   */\n/g' > new.txt

  ^^^^^^^^^^^^^^              ^^^^^^^^^^^^^   ^^^^^^^^^^^^^^^^^
  Change multi-              Convert line-    Convert multi-
  spaces into                feed to space.   space back into
  single spaces.                              linefeeds.
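
If the sed pipeline gives you trouble, another way to get the same kind of
result is awk's paragraph mode. This is only a sketch, not something tested
on a Gutenberg file: the function name `unwrap` is made up, and it assumes
Unix line endings and paragraphs separated by truly empty lines. It keeps
extra spaces inside a line and tolerates trailing spaces:

```shell
# Hypothetical helper: reads text on stdin, writes one line per paragraph.
unwrap() {
    awk 'BEGIN { RS = "" }   # paragraph mode: blank lines separate records
         {
             # Join the wrapped lines of this paragraph with single spaces,
             # dropping any spaces or tabs next to the line break.
             gsub(/[ \t]*\n[ \t]*/, " ")
             # Trim whitespace at the start and end of the paragraph.
             sub(/^[ \t]+/, "")
             sub(/[ \t]+$/, "")
             print
         }'
}
```

Usage would be the same as before: `unwrap < old.txt > new.txt`.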

-- 
Dave Carrigan
Seattle, WA, USA
dave@rudedog.org | http://www.rudedog.org/ | ICQ:161669680
UNIX-Apache-Perl-Linux-Firewalls-LDAP-C-C++-DNS-PalmOS-PostgreSQL-MySQL

Dave is currently listening to Blue Aeroplanes - Angelwords (Beatsongs)
