
Re: Need a shell or perl script



On Sun, Feb 22, 2004 at 10:51:58PM +0530, Deboo wrote:
> Now, I could search for one or two email addresses whenever I need to,
> having kept all these different mailboxes in one directory. Never having
> made an addressbook, is what caused this problem. I would like to sort,
> search and make a list of the email addresses. I know quite many of these
> addresses have become invalid over the years and some friends' addresses
> are more than one or two, but still a list would be nice. I know it can be
> done with shell scripting and better with perl but I know neither.

If they are in a plain text format (mbox, maildir, mh), then you can
simply use grep. Basically you want to get the From line from each
email message, so you search for lines starting with "From: ". You tell
grep to search for "^From: ", where '^' represents the beginning of a
line. Then you tell the shell to put the results in a file by using '>'
to redirect the output. This will clear the file if it already exists;
if you want to add to the end of an existing file instead, use ">>".
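
As a small illustration of the difference between '>' and '>>', here is
a sketch that runs in a scratch directory. The file names box1 and box2
and their contents are made up for the example; they stand in for real
mailboxes.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Two tiny sample "mailboxes" (hypothetical contents).
printf 'From: alice@example.com\nSubject: hi\n' > box1
printf 'From: bob@example.com\nSubject: hello\n' > box2

grep "^From: " box1 > list-of-emails    # '>' creates or overwrites the file
grep "^From: " box2 >> list-of-emails   # '>>' appends to the end of it

cat list-of-emails
```

Had the second command used '>' as well, list-of-emails would hold only
the line from box2.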

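When grep is given more than one file it prefixes each match with the
file name ("INBOX:From: ..."). The -h option (available in GNU grep)
turns that off, so every line of the output starts with "From: ".

mbox:
grep -h "^From: " INBOX debian-user some-other-mbox > list-of-emails

maildir:
grep -h "^From: " INBOX/cur/* debian-user/cur/* some-other-maildir/cur/* > list-of-emails

mh:
grep -h "^From: " INBOX/* debian-user/* some-other-mh/* > list-of-emails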

Once you have all the email addresses in a file you want to:

o  Remove the "From: " from the beginning of each line.

o  Remove duplicate email addresses.

o  Sort the list.

To remove the "From: " you can use sed (a simple filter/editor).
You simply tell sed to replace "From: " with nothing whenever it
occurs at the beginning of a line. This can be done with the s
(substitution) command:

sed "s/^From: //" list-of-emails > list-of-emails-without-from
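
To see what the '^' anchor does here, you can feed sed a couple of
sample lines (both made up for this example). Only the line that starts
with "From: " is changed; a "From: " later in a line is left alone.

```shell
# Run the substitution on two sample lines and keep the result.
result=$(printf '%s\n' 'From: alice@example.com' 'Re: From: the start' \
    | sed "s/^From: //")

# First line loses its "From: " prefix, second line is unchanged.
printf '%s\n' "$result"
```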

Now you can sort the emails using sort:
sort list-of-emails-without-from > list-of-emails-sorted

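Now you can get rid of duplicates using uniq (it only removes duplicate
lines that are next to each other, which is why the list is sorted first):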
uniq list-of-emails-sorted > list-of-emails-sorted-unique
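
If you don't need the intermediate files, the steps can probably be
chained with pipes. The following is only a sketch: box1 and box2 are
made-up sample mailboxes, and -h assumes a grep that supports it, such
as GNU grep.

```shell
# Work in a scratch directory with two tiny sample mailboxes.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf 'From: bob@example.com\n\nbody text\nFrom: alice@example.com\n' > box1
printf 'From: alice@example.com\n' > box2

# grep -h : take the From lines, without file name prefixes
# sed     : strip the leading "From: "
# sort    : bring identical lines together
# uniq    : drop the repeated lines
grep -h "^From: " box1 box2 \
    | sed "s/^From: //" \
    | sort \
    | uniq > list-of-emails-sorted-unique

cat list-of-emails-sorted-unique
```

Writing "sort -u" in place of "sort | uniq" should give the same list in
one step.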

And now you should be on your way to getting what you want.

Bijan
-- 
Bijan Soleymani <bijan@psq.com>
http://www.crasseux.com
