
[OT]: stripping html attachments



Hi,

I followed the thread about stripping HTML attachments and wrote the
following recipe. It seems to work, but let me know if you have any
improvements.

procmail recipe:

---<snip>---
##
## HTML
## strip all html off the email
##
:0 HB:
* ^Content-Type: text/html
{
    :0 bfW:
    | (echo [HTML stripped]; /home/timo/bin/strip_html)
}
---<snap>---

strip_html:

---<snip>---
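#!/bin/sh
# Read a mail body on stdin and write a plain-text rendering to stdout.

TMPFILE=$(mktemp /tmp/strip_html.XXXXXX) || exit 1
# Remove the temporary file on exit, even if w3m fails.
trap 'rm -f "$TMPFILE"' EXIT

# Wrap everything outside the HTML part in <PRE> so that w3m renders it
# as preformatted text, and relabel the part as text/plain.
{
    echo "<PRE>"
    sed -e 's/Content-Type: text\/html/<\/PRE>Content-Type: text\/plain/' \
        -e 's/<\/[Hh][Tt][Mm][Ll]>/<\/HTML><PRE>/'
    echo "</PRE>"
} > "$TMPFILE"

w3m -dump -T text/html "$TMPFILE"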
---<snap>---                                                                                            
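
To see what the sed step hands to w3m, the wrapping can be tried on its own with a made-up message body. This is only a rough sketch: the function name wrap_html and the sample text are hypothetical and not part of the script above, and w3m is left out so that it runs without any extra tools.

```shell
#!/bin/sh
# Hypothetical helper: same wrapping as strip_html, but it prints the
# intermediate HTML instead of passing it to w3m.
wrap_html() {
    echo "<PRE>"
    sed -e 's/Content-Type: text\/html/<\/PRE>Content-Type: text\/plain/' \
        -e 's/<\/[Hh][Tt][Mm][Ll]>/<\/HTML><PRE>/'
    echo "</PRE>"
}

# Made-up sample body containing a single HTML part.
sample='Content-Type: text/html

<html><body><b>hello</b></body></html>
--boundary--'

# Capture and show the wrapped text that w3m would receive.
result=$(printf '%s\n' "$sample" | wrap_html)
printf '%s\n' "$result"
```

If w3m is installed, piping the same output into `w3m -dump -T text/html` should show the text that ends up in the filtered mail.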
 
-timo

-- 
gpg key fingerprint = 6832 C8EC D823 4059 0CD1  6FBF 9383 7DBD 109E 98DC
