Re: Ideas to obtain text file list of emails in an IMAP folder?
Ron Leach wrote:
> On 03/06/2016 17:31, Dan Purgert wrote:
>> Ron Leach wrote:
>>> Have any debian-user readers ever tried to create a list of all the
>>> email messages stored in an IMAP folder?
>
>> Do the same here (Dovecot + Postfix), not 100% certain if this'll match
>> your setup, but it should be pretty close.
>>
>> [snip]
>>
>> #!/bin/bash
>>
>> [snip most of the script body]
>>    done )&  spinner $!
>>
>> #############################END#####################
>>
>
> Well.  That worked very well.  Clever, too; I was thinking I'd need to 
> use the imap protocol to extract the headers from Dovecot, in the same 
> way that an email client does.
Nah, IMAP is just a protocol for accessing the messages (similar to how
"HTTP" is the protocol for viewing "documents" over the internet).
Since you have direct access to the box, we can skip all that mess. 
> I've never dabbled in scripts; I'm going to use this to learn about 
> how they work.  
Really, the "scriptiest" parts are just the `if' conditions.  The actual
work is handled by grep.  TBH, you could've done this in a one-liner on
the commandline.
 - cd (maildir) ; for f in *; do grep -E 'stuff' "$f"; done >> output_file.txt
Alternately, if you have 'mail' installed:
 - cd (maildir) ; [...] ; done | mail -s"Email Report" you@host
Which would've sent it to you as an email.
> I followed most of it but at the moment I don't 
> understand how the script either
> (a) finds each message (we're using Maildir), nor
When you give it the path to the mail, the script changes directory to
where the mail is saved on the server.  Then the for loop uses a glob
(`*') to enumerate all the files in that directory, assigning each
filename in turn to the variable 'f'.
For each filename, the script prints the name of the file being checked,
then calls 'grep', whose purpose is to print out the lines matching the
expression you provide it.  In this case, the expression
'^Date:|^To:|^From:|^Subject:' means "Match any lines that begin with
'Date:' OR 'To:' OR [...], and print them to stdout[1]".
The caret character -- ^ -- is a special character that means
"the beginning of a line"; meaning a line starting "Date:2016[...]"
would match (and be printed by grep), but one containing "on Date:2016"
would not, because that line does not begin with "Date:".
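A quick sketch of the anchoring, if you want to try it yourself.  The
two sample lines are made up for illustration, not taken from any real
mailbox:

```shell
# Two made-up sample lines: one begins with "Date:", the other merely contains it.
sample='Date: Fri, 03 Jun 2016 17:31:00 +0000
sent on Date: 2016-06-03'

# Anchored with ^: only the line that starts with "Date:" matches.
anchored=$(printf '%s\n' "$sample" | grep -E '^Date:')

# Unanchored: both lines match, because both contain "Date:" somewhere.
unanchored=$(printf '%s\n' "$sample" | grep -E 'Date:')

printf 'anchored:\n%s\n\nunanchored:\n%s\n' "$anchored" "$unanchored"
```

Dropping the caret is what lets the second line through.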
[1] "stdout" is 'standard output' -- typically, the screen.  There's a
portion of the script where I redirect stdout to a file if you've
supplied one; otherwise it prints to the screen (and the spinner would
probably make a mess of things, unless you piped the output to something
else, or did the redirect outside the script).  In addition to "stdout",
you also have:
 - 'stderr' ('standard error'), which also defaults to printing to your
   screen.  It's what I used to get the spinner to work (probably not
   the best option per se, but I just threw this together).
 
 - 'stdin' ('standard input'), i.e. your keyboard. Can be redirected as
   well, though that syntax is "command *<* input", rather than using
   "command *>* output" for redirection to a file, or "cmd | cmd [...]"
   for piping one command's output to the input of the next command.
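Here is a small sketch of the three streams and the redirection syntax
in one place.  The function name 'report' and the temporary file names
are made up for the demonstration; none of this comes from the script
itself:

```shell
# A hypothetical function that writes one line to each output stream.
report() {
    echo "this goes to stdout"
    echo "this goes to stderr" >&2
}

tmp=$(mktemp -d)

# '>' redirects stdout to a file; '2>' does the same for stderr.
report > "$tmp/out.txt" 2> "$tmp/err.txt"

# '<' feeds a file to a command's stdin.
count=$(wc -l < "$tmp/out.txt")

# '|' connects one command's stdout to the next command's stdin.
# stderr is discarded here so only the stdout line travels down the pipe.
upper=$(report 2>/dev/null | tr '[:lower:]' '[:upper:]')

echo "lines in out.txt: $count"
echo "piped: $upper"
```

Because stdout and stderr are separate streams, sending one to a file
leaves the other on the screen, which is why a spinner written to
stderr stays out of a redirected report.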
> (b) how it prints the 5 lines in each report stanza while only using 
> one 'printf' statement.
It's how grep works - it prints all lines that match what you're looking
for.
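To see that for yourself, here is a minimal sketch of the loop run
against a throwaway directory.  The message contents and the file name
'msg1' are invented, and I'm assuming one printf for the filename
followed by one grep, which may not be exactly how your copy of the
script is laid out:

```shell
# Build a throwaway directory holding one made-up message.
maildir=$(mktemp -d)
cat > "$maildir/msg1" <<'EOF'
From: alice@example.com
To: bob@example.com
Subject: Test message
Date: Fri, 03 Jun 2016 17:31:00 +0000

Body text mentioning the Subject: header, which is not at line start.
EOF

report=$(
    cd "$maildir" || exit 1
    for f in *; do
        # A single printf prints a single line: the filename.
        printf '%s\n' "$f"
        # grep prints one line for every header line that matches.
        grep -E '^Date:|^To:|^From:|^Subject:' "$f"
    done
)
printf '%s\n' "$report"
```

That gives five lines for the one message: the filename from printf,
plus the four header lines from grep.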
> But that's what I'm going to find out - tomorrow, now.
>
> Anyway, it's doing exactly what I need.  I commented out the 
> inter-message line of '====', because it annoyed the downstream 
> spreadsheet that I was using to analyse the output.
Figured you were mainly after the list, hence the visual breaks :).
>
> Dan, I'm much obliged,
Glad it helped.  Feel free to hack away at it and edit it as you need.
If you use Usenet, there are some extremely knowledgeable guys who tend
to hang out in alt.os.linux[.ubuntu] (along with us up-and-coming types).
-- 
Registered Linux user #585947
Github: https://github.com/dpurgert