
Re: Removing line breaks



On Sun, Mar 31, 2002 at 02:41:31AM +0800, csj wrote:
| What's the fastest way to reformat a mutt-friendly text file into
| something a WYSIWYG word processor would love?
| 
| Let's take this email as an example. When I use the linewrap command in
| Sylpheed, the text of my email is broken into shorter lines. But when I
| cut and paste these selfsame lines into Abiword or KWord, they appear as
| a series of short individual paragraphs (as indicated by the reverse "P"
| sign in AbiWord or the Enter key symbol in KWord).

Yeah, mailers put linefeeds in so that they won't exceed the 1000
character SMTP limit on line length (which counts the trailing CRLF).
(I recently saw a web-based mailer fail to do that, and each
"paragraph" that was too long was truncated.)

One of the nice things about LaTeX is that it understands paragraph
breaks.  That is, a newline character doesn't create a new paragraph.
A blank line separates paragraphs.  (Just like in emails.)

| What I need is a simple tool (a sed script will do fine) to purge all
| new lines from a given text file except those immediately preceded by
| blank lines. In such case, the blank line's newline will be retained as
| the paragraph break. (Example above: the emptiness between "in KWord)."
| and "What I need")
 
I always have trouble mucking with linefeeds in line-based tools like
sed.

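#!/usr/bin/env python3

# Reads text on stdin and writes each paragraph as one long line on stdout.
import sys

p = []  # stripped lines of the paragraph collected so far
for line in sys.stdin:
    l = line.strip()  # strip all leading and trailing whitespace
    if l:
        p.append(l)
    else:
        # a blank line ends the paragraph: emit it as a single line
        print(" ".join(p))
        # uncomment the next line if you want a blank line in between paragraphs.
        #print()
        p = []

# flush the last paragraph if the input does not end with a blank line
if p:
    print(" ".join(p))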


Untested, but I think it works :-).
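
If you'd rather check the behaviour on a small string before running it
over a real file, the same idea can be wrapped in a function.  This is
only a sketch: the name join_paragraphs and the sample text are made up
for illustration, and unlike the script above it silently collapses
runs of several blank lines instead of printing an empty line for each.

```python
def join_paragraphs(text):
    """Return a list with one string per paragraph.

    A paragraph is a run of non-blank lines; blank (or whitespace-only)
    lines separate paragraphs.  join_paragraphs is a hypothetical helper
    name, not part of any library.
    """
    paragraphs = []
    current = []  # stripped lines of the paragraph being collected
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            # blank line after some text: the paragraph is complete
            paragraphs.append(" ".join(current))
            current = []
    if current:
        # input ended without a trailing blank line
        paragraphs.append(" ".join(current))
    return paragraphs


# made-up sample: two paragraphs, each wrapped over two lines
sample = "first line\nsecond line\n\nnext paragraph\nwraps here\n"
for paragraph in join_paragraphs(sample):
    print(paragraph)
```

Printing the returned strings one per line should give a word processor
one paragraph per line, which is what the cut and paste needs.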

-D

-- 

"Don't use C;  In my opinion,  C is a library programming language
 not an app programming language."  - Owen Taylor (GTK+ developer)

