
Re: procmail rule for html mail



On Sun, Mar 03, 2002 at 04:00:13AM -0600, Rob VanFleet wrote:
> It seems like this has come up before, but I couldn't turn anything up
> from searching.  Basically, I am looking for a procmail rule that will
> detect html mail, and pipe it to a script to strip the tags from it,
> preferably before the other rules are applied, so it still ends up in
> the proper mailbox, but that is not a necessity.
I use the following script in conjunction with a procmail rule.
It seems to work.

If anyone knows of a program that does the same job as the Debian mimedecode,
please drop me an email.

---<snip>---
#!/bin/sh
##
##                  strip_html 0.1.2
##
## takes an email message from stdin and strips all html off.
## the resulting message is printed to stdout.
##
## 2002 by Timo Benk <t_benk@users.sourceforge.net>
##
##

FILES="mktemp w3m mimedecode sed"

# Check that everything we need is in the path
for i in $FILES; do
    if ! command -v "$i" > /dev/null 2>&1; then
        echo "strip_html: sorry, can't find $i !" >&2
        exit 1
    fi
done

TMPFILE=$(mktemp /tmp/strip_html.XXXXXX) || exit 1
# Remove the temporary file on exit, even if a later step fails
trap 'rm -f "$TMPFILE"' EXIT
TOKEN="LISTING"

# Everything outside the html part is wrapped in <LISTING> tags,
# so that w3m prints headers and plain text verbatim
echo "<$TOKEN>" > "$TMPFILE"

# Decode the MIME parts from stdin, then close the listing before
# <!DOCTYPE ...> and <html> and reopen it after </html>
mimedecode |
        sed "s/Content-Type: text\/html/Content-Type: text\/plain/;       \
             s/<![Dd][Oo][Cc][Tt][Yy][Pp][Ee][^>]*>/<\/$TOKEN>&<$TOKEN>/; \
             s/<[Hh][Tt][Mm][Ll]>/[HTML stripped]<\/$TOKEN><BR>&/;        \
             s/<\/[Hh][Tt][Mm][Ll]>/&<$TOKEN>/"                           \
        >> "$TMPFILE"

echo "</$TOKEN>" >> "$TMPFILE"

# Render the wrapped message as plain text, 80 columns wide
w3m -F -cols 80 -dump -T text/html "$TMPFILE"
---<snap>---
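
To see what the sed stage does on its own, without mimedecode or w3m
installed, here is a small sketch that runs only the four substitutions
on a made-up message. The function name wrap_html and the sample message
are my own assumptions for illustration; they are not part of strip_html.

```shell
#!/bin/sh
# Sketch only: wrap_html is a hypothetical helper that reproduces the
# sed stage of strip_html so it can be tried in isolation.
TOKEN="LISTING"

wrap_html() {
    # Open the listing, rewrite the message from stdin, close the listing
    echo "<$TOKEN>"
    sed "s/Content-Type: text\/html/Content-Type: text\/plain/;       \
         s/<![Dd][Oo][Cc][Tt][Yy][Pp][Ee][^>]*>/<\/$TOKEN>&<$TOKEN>/; \
         s/<[Hh][Tt][Mm][Ll]>/[HTML stripped]<\/$TOKEN><BR>&/;        \
         s/<\/[Hh][Tt][Mm][Ll]>/&<$TOKEN>/"
    echo "</$TOKEN>"
}

# Made-up sample message, one argument per line
printf '%s\n' \
    'Subject: test' \
    'Content-Type: text/html' \
    '' \
    '<html>' \
    '<b>hello</b>' \
    '</html>' | wrap_html
```

The headers end up inside the listing block, the listing is closed right
before the opening html tag, and a new one is opened after the closing
tag, which is what lets w3m render only the html part as markup.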

---<snip>---
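##
## HTML
## strip all html off the email
##
:0 HB
* ^Content-Type: text/html
{
    # keep a copy of the original message, with a lockfile on the mailbox
    :0 c:
    unstripped.backup

    # filter header and body through strip_html and wait for its exit code
    :0 hbfW
    | /home/timo/bin/strip_html
}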
---<snap>---
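
If you want to check from the command line which messages the condition
above would catch, something like the following rough sketch may help.
It only approximates the recipe: procmail matches case-insensitively by
default and the HB flags search header and body, so a case-insensitive
grep over the whole message should be close. The name is_html_mail and
the sample message are assumptions of mine.

```shell
#!/bin/sh
# Sketch only: is_html_mail is a hypothetical helper that approximates
# the "* ^Content-Type: text/html" condition with the HB flags.
is_html_mail() {
    # Reads a message from stdin; exit status 0 means the pattern matched
    grep -qi '^Content-Type: text/html'
}

# Made-up sample message
if printf '%s\n' 'Subject: test' 'Content-Type: text/html' '' \
        '<html></html>' | is_html_mail; then
    echo "would be filtered"
else
    echo "would be left alone"
fi
```

For a saved message, the same check would be is_html_mail < somefile.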

-timo

--
gpg key fingerprint = 6832 C8EC D823 4059 0CD1  6FBF 9383 7DBD 109E 98DC

