
Bug#754745: [tracker.debian.org] New daemon to handle mails through a maildir



Package: tracker.debian.org
Severity: wishlist

Hi,

Please find attached the 3 commit patches proposed to solve the
following Trello task:
-----
The current design involves forking a new process
(./manage.py tracker_dispatch) for each incoming email. This is
problematic on multiple levels:

 - we ran out of memory on the test machine when 200 mails were delivered
   in the same second... (and exim can't be configured to rate-limit
   this)
 - mails can get lost/bounced back if the process fails for some reason

So we want to change this so that mails are delivered to a local
Maildir and we have a daemon watching this directory (possibly with
inotify so that we have no delay) and processing mails with a
configurable number of workers.
-----
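
To illustrate the idea (only a minimal sketch, not the code from the
attached patches): mails sit in a Maildir, a bounded pool processes them,
and a mail is removed only once its handler succeeded. The names
`drain_maildir` and `handler` are hypothetical, a thread pool stands in for
the worker processes to keep the example self-contained, and it does a
single pass over the Maildir instead of watching it with inotify.

```python
import mailbox
from multiprocessing.pool import ThreadPool


def drain_maildir(path, handler, max_workers=4):
    """Run ``handler`` on every queued mail, ``max_workers`` at a time.

    ``handler`` is a hypothetical callable taking the raw message bytes.
    A mail is deleted only after its handler returned without raising, so
    a failing handler leaves the mail queued for a later retry.
    Returns the keys of the mails that were processed and deleted.
    """
    box = mailbox.Maildir(path, factory=None, create=True)
    done = []
    with ThreadPool(processes=max_workers) as pool:
        # The Maildir object is only touched from this thread; the
        # workers just receive the bytes of one message each.
        pending = [(key, pool.apply_async(handler, (box.get_bytes(key),)))
                   for key in box.keys()]
        for key, result in pending:
            try:
                result.get()  # re-raises whatever the handler raised
            except Exception:
                continue  # keep the mail in the queue
            box.discard(key)
            done.append(key)
    return done
```

Since the queue lives on disk, a burst of 200 mails only costs 200 small
files; the memory used for processing is bounded by `max_workers`.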

These commits can also be found or tested on branch "maildir_daemon"
of the git.domainepublic.net/distro-tracker repo.

Should you have any remarks, please tell me.

Best,
Joseph
From eef1c8832ef9978bafc6d0de266ba65f61405283 Mon Sep 17 00:00:00 2001
From: Joseph Herlant <herlantj@gmail.com>
Date: Mon, 14 Jul 2014 01:10:58 +0200
Subject: [PATCH 1/3] Class that provides base functions for using a Maildir

This class provides the functions needed to replace the tracker_control,
tracker_receive_news and tracker_dispatch piping system with a unified system
that uses a Maildir. This is more generic, and the Maildir can be used as a
processing queue to avoid the out-of-memory issues seen in tests with the
piping mechanism.

Some new settings have been added for this purpose:
 - DISTRO_TRACKER_MAILDIR_PATH, the path of the Maildir where mails are
   received.
 - DISTRO_TRACKER_NEWSMAIL_SUFFIXES, the list of suffixes that the local part
   of a news address may end with.
---
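Note on the routing: the snippet below is a simplified, standalone sketch of
how a mail is matched to one of the three processing functions; it is not the
code of the patch. `recipients_of` and `route` are hypothetical names, the
`.example` addresses are made up, and `email.utils.getaddresses` is used here
in place of splitting the header on commas.

```python
from email import message_from_string
from email.utils import getaddresses

ENVELOPE_HEADERS = ('Envelope-to', 'X-Envelope-To', 'X-Original-To')
FALLBACK_HEADERS = ('To', 'Cc', 'Bcc')


def recipients_of(message):
    """Addresses from the envelope headers, else from To/Cc/Bcc."""
    for group in (ENVELOPE_HEADERS, FALLBACK_HEADERS):
        # Message.get_all() matches header names case-insensitively.
        values = [v for name in group for v in message.get_all(name, [])]
        addresses = [addr for _, addr in getaddresses(values) if addr]
        if addresses:
            return addresses
    return []


def route(recipient, control_email, news_suffixes=('_news',)):
    """Name the processing function a recipient address maps to."""
    recipient = recipient.lower()
    if recipient == control_email.lower():
        return 'control'
    local_part = recipient.rsplit('@', 1)[0]
    if any(local_part.endswith(s.lower()) for s in news_suffixes):
        return 'news'
    return 'dispatch'


msg = message_from_string(
    'X-Original-To: Dummy-Package_News@tracker.example\n\nbody\n')
print([(r, route(r, 'control@tracker.example')) for r in recipients_of(msg)])
```

Lowercasing the recipient before the comparisons is what makes both the
control address and the news suffix match regardless of case.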
 data/.gitignore                                    |   1 +
 distro_tracker/mail/maildir_manager.py             | 205 ++++++++++
 distro_tracker/mail/tests/tests_maildir_manager.py | 451 +++++++++++++++++++++
 distro_tracker/project/settings/defaults.py        |   9 +
 4 files changed, 666 insertions(+)
 create mode 100644 distro_tracker/mail/maildir_manager.py
 create mode 100644 distro_tracker/mail/tests/tests_maildir_manager.py

diff --git a/data/.gitignore b/data/.gitignore
index 1cc33c7..36e933c 100644
--- a/data/.gitignore
+++ b/data/.gitignore
@@ -1 +1,2 @@
 distro-tracker.sqlite
+Maildir
diff --git a/distro_tracker/mail/maildir_manager.py b/distro_tracker/mail/maildir_manager.py
new file mode 100644
index 0000000..a3fec28
--- /dev/null
+++ b/distro_tracker/mail/maildir_manager.py
@@ -0,0 +1,205 @@
+# Copyright 2014 The Distro Tracker Developers
+# See the COPYRIGHT file at the top-level directory of this distribution and
+# at http://deb.li/DTAuthors
+#
+# This file is part of Distro Tracker. It is subject to the license terms
+# in the LICENSE file found in the top-level directory of this
+# distribution and at http://deb.li/DTLicense. No part of Distro Tracker,
+# including this file, may be copied, modified, propagated, or distributed
+# except according to the terms contained in the LICENSE file.
+"""
+Implements the retrieval of mails from the maildir and hands them over to
+the matching mail processing function.
+"""
+from __future__ import unicode_literals
+from django.conf import settings
+
+from distro_tracker.mail.control import process as control_process
+from distro_tracker.mail.dispatch import process as dispatch_process
+from distro_tracker.mail.mail_news import process as news_process
+
+import logging
+from mailbox import Maildir
+from email.message import Message
+from email.utils import parseaddr
+import os
+from rfc822 import Message as rfc822_Message
+
+DISTRO_TRACKER_CONTROL_EMAIL = settings.DISTRO_TRACKER_CONTROL_EMAIL
+DISTRO_TRACKER_MAILDIR_PATH = settings.DISTRO_TRACKER_MAILDIR_PATH
+DISTRO_TRACKER_NEWSMAIL_SUFFIXES = settings.DISTRO_TRACKER_NEWSMAIL_SUFFIXES
+logger = logging.getLogger(__name__)
+
+class MaildirManager(object):
+
+    def __init__(self):
+        """
+        Setting the maildir object first.
+        """
+        self.maildir = Maildir(DISTRO_TRACKER_MAILDIR_PATH, factory=None)
+        self.reset_message()
+        self.msg_filename = None
+
+    def get_emails_from_header(self, header):
+        """
+        This helper returns the list of recipients from a given header.
+        The header matching will be case insensitive.
+        """
+        recipients = []
+        # We lower the keys of the headers to ensure the case insensitivity
+        lower_headers = {key.lower():key for key in self.message.keys()}
+        if header.lower() in lower_headers.keys():
+            # Here we get the real header label back
+            header_label = lower_headers[header.lower()]
+            recipients = [parseaddr(item)[1]
+                    for item in self.message[header_label].split(',')]
+        return recipients
+
+    def get_recipients(self):
+        """
+        This method gets the recipients of the current message. It first looks
+        at the 'Envelope-to', 'X-Envelope-To' and 'X-Original-To' headers (in
+        this order).
+        If any of those are defined, it returns their addresses as one list.
+        If none of those are defined, it returns the list of mail addresses
+        from the To, Cc and Bcc headers.
+        """
+        recipients = self.get_emails_from_header('Envelope-to')
+        recipients.extend(self.get_emails_from_header('X-Envelope-To'))
+        recipients.extend(self.get_emails_from_header('X-Original-To'))
+
+        # Getting 'To', 'Cc', and 'Bcc' only if the previous fields are empty
+        if not recipients:
+            recipients = self.get_emails_from_header('To')
+            recipients.extend(self.get_emails_from_header('Cc'))
+            recipients.extend(self.get_emails_from_header('Bcc'))
+
+        return recipients
+
+    def is_news_recipient(self, recipient):
+        """
+        This method tests if a recipient has the suffix of a news email
+        address using DISTRO_TRACKER_NEWSMAIL_SUFFIXES.
+        Returns True if so, False if not.
+        """
+        local_part = recipient.rsplit('@', 1)[0]
+        for suffix in DISTRO_TRACKER_NEWSMAIL_SUFFIXES:
+            if local_part.endswith(suffix.lower()):
+                return True
+        return False
+
+    def get_mail_key_from_filename(self, filename, folder='new'):
+        """
+        Returns the mail key matching the given file name, or None.
+        """
+        mail_key = None
+        self.maildir._refresh()
+        for k in self.maildir._toc:
+            if self.maildir._toc[k] == "{0}/{1}".format(folder, filename):
+                mail_key = k
+                break
+        return mail_key
+
+
+    def retrieve_mail_content(self, mail_file_name=None, mail_key=None):
+        """
+        Returns the message corresponding to the given id from the maildir
+        object or None if no message exists with this id.
+        
+        It ensures that if a message is retrieved, it will be as an
+        ``email.message.Message`` class instance.
+
+        Returns `None` if the file name is not found.
+        """
+        # First we need to look up the mail key if it was not provided
+        if mail_key is None and mail_file_name is not None:
+            mail_key = self.get_mail_key_from_filename(mail_file_name)
+        
+        self.message = self.maildir.get(mail_key, default=None)
+
+        if self.message is None:
+            logger.debug("Unable to find mail file {0}".format(mail_file_name))
+            self.message = None
+        else:
+            self.msg_filename = mail_file_name
+
+    def reset_message(self):
+        """
+        Resets the self.message object.
+        """
+        self.message = Message()
+        self.msg_filename = None
+
+    def delete_mail(self, filename=None, mail_key=None):
+        """
+        Deletes a mail, if it exists, from its filename, which is the message
+        id generated by the make_msgid method.
+        It also resets the current self.message if its id matches.
+        If no message id is provided, the current self.message's id is taken.
+
+        This method supports `self.message` being either an
+        `email.message.Message` or an `rfc822.Message` class instance.
+        """
+        if mail_key is None:
+            if isinstance(self.message, rfc822_Message):
+                self.msg_filename = os.path.basename(self.message.fp._file.name)
+            else:
+                if self.message.get_filename() is not None:
+                    self.msg_filename = self.message.get_filename()
+
+            # If no parameter is set, trying to set it.
+            if filename is None:
+                filename = self.msg_filename
+
+            # We need to look up the mail key from the file name
+            mail_key = self.get_mail_key_from_filename(filename)
+
+        if mail_key is None:
+            logger.debug("Unable to find mail file {0} for deletion".format(filename))
+        else:
+            self.maildir.discard(mail_key)
+
+        # If the filename was not given or is the one of the current
+        # self.message, reset the self.message
+        if self.msg_filename is not None and self.msg_filename == filename:
+            self.reset_message()
+
+    def process_mail_error(self, exception, mail_file_name=None, mail_key=None):
+        logger.error(
+                "Exception occurred while trying to process {0} ({1}): {2}".format(
+                    mail_file_name, mail_key, exception)
+                )
+
+    def process_mail(self, mail_file_name=None, mail_key=None):
+        """
+        First gets the mail in the Maildir from the given filename,
+        then sends the mail to the corresponding process method.
+        """
+        if mail_key is None:
+            self.retrieve_mail_content(mail_file_name=mail_file_name)
+        else:
+            self.retrieve_mail_content(mail_key=mail_key)
+        if self.message is None:
+            return
+
+        recipients = self.get_recipients()
+        flat_mail = self.message.__str__()
+        try:
+            for recipient in recipients:
+                recipient = recipient.lower()
+                if recipient == DISTRO_TRACKER_CONTROL_EMAIL.lower():
+                    # Processes the given mail through the control process
+                    control_process(bytes(flat_mail))
+                elif self.is_news_recipient(recipient):
+                    # Processes the given mail through the mail_news process
+                    news_process(bytes(flat_mail))
+                else:
+                    # Processes the given mail through the dispatch process
+                    dispatch_process(bytes(flat_mail), recipient)
+            # Deleting message when processed correctly
+            self.delete_mail(filename=mail_file_name, mail_key=mail_key)
+        except Exception as ex:
+            # Whenever an exception occurs in the mail processing, log it and
+            # leave the mail in the maildir (it is not deleted).
+            self.process_mail_error(ex, mail_file_name, mail_key)
+
diff --git a/distro_tracker/mail/tests/tests_maildir_manager.py b/distro_tracker/mail/tests/tests_maildir_manager.py
new file mode 100644
index 0000000..e3bddad
--- /dev/null
+++ b/distro_tracker/mail/tests/tests_maildir_manager.py
@@ -0,0 +1,451 @@
+# -*- coding: utf-8 -*-
+
+# Copyright 2014 The Distro Tracker Developers
+# See the COPYRIGHT file at the top-level directory of this distribution and
+# at http://deb.li/DTAuthors
+#
+# This file is part of Distro Tracker. It is subject to the license terms
+# in the LICENSE file found in the top-level directory of this
+# distribution and at http://deb.li/DTLicense. No part of Distro Tracker,
+# including this file, may be copied, modified, propagated, or distributed
+# except according to the terms contained in the LICENSE file.
+"""
+Tests for :mod:`distro_tracker.mail.maildir_manager`.
+"""
+
+from __future__ import unicode_literals
+from django.conf import settings
+from django.core import mail
+from django.test import TestCase
+from django.utils import timezone
+from django.utils.six.moves import mock
+
+from distro_tracker.mail.maildir_manager import MaildirManager
+
+from distro_tracker.core.models import SourcePackageName
+from distro_tracker.core.models import SourcePackage
+from distro_tracker.core.models import News
+from distro_tracker.core.models import Subscription
+from distro_tracker.core.utils import verp
+from distro_tracker.mail.models import UserEmailBounceStats
+
+import mailbox
+from email.message import Message
+from email.utils import make_msgid
+
+DISTRO_TRACKER_CONTROL_EMAIL = settings.DISTRO_TRACKER_CONTROL_EMAIL
+DISTRO_TRACKER_CONTACT_EMAIL = settings.DISTRO_TRACKER_CONTACT_EMAIL
+DISTRO_TRACKER_FQDN = settings.DISTRO_TRACKER_FQDN
+DISTRO_TRACKER_MAILDIR_PATH = settings.DISTRO_TRACKER_MAILDIR_PATH
+
+class MaildirManagerTest(TestCase):
+
+    def setUp(self):
+        """
+        In the setup we set some default values.
+        """
+        self.maildir = mailbox.Maildir(DISTRO_TRACKER_MAILDIR_PATH)
+        self.original_mail_count = self.maildir.__len__()
+        # Initializing a new dummy package
+        self.package_name = SourcePackageName.objects.create(
+                name='dummy-package')
+        self.package = SourcePackage.objects.create(
+                source_package_name=self.package_name,
+                version='1.0.0')
+        # Setting message default header
+        self.message = Message()
+        self.set_default_headers()
+        # Initializing an instance of the MaildirManager
+        self.manager = MaildirManager()
+        # This array stores the id of the messages created during the tests
+        self.generated_mail_ids = []
+
+    def tearDown(self):
+        """
+        Discarding the messages created during the tests.
+        """
+        for msgid in self.generated_mail_ids:
+            self.manager.delete_mail(mail_key=msgid)
+
+    def set_default_headers(self):
+        """
+        Helper method which adds the default headers for each test message.
+        """
+        self.set_header('From', 'John Doe <john.doe@unknown.com>')
+        self.set_header('To',
+            '{package}@{distro_tracker_fqdn}'.format(
+                package=self.package_name,
+                distro_tracker_fqdn=DISTRO_TRACKER_FQDN
+                )
+            )
+        self.set_header('Subject', 'Commands')
+        self.set_header('Message-ID', make_msgid())
+
+    def set_header(self, header_name, header_value):
+        """
+        Sets a header of the test message to the given value.
+        If the header previously existed in the message, it is overwritten.
+
+        :param header_name: The name of the header to be set
+        :param header_value: The new value of the header to be set.
+        """
+        if header_name in self.message:
+            del self.message[header_name]
+        self.message.add_header(header_name, header_value)
+
+    def add_email_to_maildir(self, body, headers={}, encoding='ASCII'):
+        """
+        This helper adds the given mail message to the configured maildir
+        without using SMTP or sending any real mail.
+        """
+        self.message.multipart = False
+        for header_name in headers.keys():
+            self.set_header(header_name, headers[header_name])
+        self.message.set_payload(body)
+        self.message.set_charset(encoding)
+        msgid = self.maildir.add(self.message)
+        self.generated_mail_ids.append(msgid)
+        return msgid
+
+    def assert_header_equal(self, header_name, header_value,
+                            response_number=-1):
+        """
+        Helper method which asserts that a particular response's
+        header value is equal to an expected value.
+
+        :param header_name: The name of the header to be tested
+        :param header_value: The expected value of the header
+        :param response_number: The index number of the response message.
+            Standard Python indexing applies, which means that -1 means the
+            last sent message.
+        """
+        out_mail = mail.outbox[response_number].message()
+        self.assertEqual(out_mail[header_name], header_value)
+
+    def test_adding_email_to_maildir(self):
+        """
+        Testing the add_email_to_maildir helper method.
+        """
+        msgid = self.add_email_to_maildir(
+                body='We do not care about the body content',
+                headers={
+                    'Subject':'Mail from test_adding_email_to_maildir method',
+                    },
+                )
+        final_mail_count = self.maildir.__len__()
+
+        self.assertEqual(final_mail_count, self.original_mail_count + 1)
+
+    def test_delete_mail_noreset_message(self):
+        """
+        Testing to delete a mail without setting the manager message object.
+        Just passing the message id to the method.
+        """
+        msgid = self.add_email_to_maildir('Some mail content')
+        final_mail_count = self.maildir.__len__()
+        self.assertEqual(final_mail_count, self.original_mail_count + 1)
+
+        self.manager.delete_mail(mail_key=msgid)
+        final_mail_count = self.maildir.__len__()
+        self.assertEqual(final_mail_count, self.original_mail_count)
+
+    def test_delete_mail_reset_message(self):
+        """
+        Testing to delete a mail after setting the message object.
+        """
+        msgid = self.add_email_to_maildir('Some mail content')
+        self.manager.message = self.maildir.get(msgid)
+        intermediate_mail_count = self.maildir.__len__()
+        self.assertEqual(intermediate_mail_count, self.original_mail_count + 1)
+
+        self.manager.delete_mail()
+        # Checking the message has been discarded
+        final_mail_count = self.maildir.__len__()
+        self.assertEqual(final_mail_count, self.original_mail_count)
+        # Checking that self.message has been reset.
+        self.assertIsInstance(self.manager.message, Message)
+        self.assertIsNone(self.manager.message.get_filename())
+
+    def test_reset_message(self):
+        """
+        Tests that `reset_message()` resets the message object to a blank
+        new one.
+        """
+        self.manager.message.add_header('Subject', 'Commands')
+        self.assertEqual(self.manager.message.get('Subject'), 'Commands')
+
+        self.manager.reset_message()
+        self.assertIsInstance(self.manager.message, Message)
+        self.assertIsNone(self.manager.message.get('Subject'))
+
+    def test_retrieve_mail_content(self):
+        """
+        Tests that the retrieve_mail_content method sets correctly the
+        message object if a proper file is given.
+        """
+        msgid = self.add_email_to_maildir('Some mail content')
+        self.manager.retrieve_mail_content(mail_key=msgid)
+
+        self.assertIsInstance(self.manager.message, Message)
+        self.assertEqual(self.manager.message.get_payload(), 'Some mail content')
+
+    def test_retrieve_mail_content_not_exists(self):
+        """
+        Tests that the retrieve_mail_content method sets the message object
+        to `None` if an incorrect file name is given.
+        """
+        msgid = make_msgid()
+        self.manager.retrieve_mail_content(mail_key=msgid)
+
+        self.assertIsNone(self.manager.message)
+
+    @mock.patch('distro_tracker.mail.maildir_manager.logger.error')
+    def test_process_mail_error(self, mocked_method):
+        """
+        Checks that the `process_mail_error` method calls the logger.error.
+        """
+        self.manager.process_mail_error(Exception('My exception'), 'my_file')
+        self.assertTrue(mocked_method.called)
+
+    def test_get_emails_from_header(self):
+        """
+        This method tests that the get_emails_from_header method returns
+        the correct array.
+        """
+        recipients = ['Pkg 1 <package-1@domain.com>', 'package-2@domain.com']
+        self.set_header('DummyHEADER', ', '.join(recipients))
+        self.manager.message = self.message
+        ret = self.manager.get_emails_from_header('dummyHeAder')
+        self.assertEqual(ret, ['package-1@domain.com', 'package-2@domain.com'])
+
+    def _test_get_recipients_generic(self, header):
+        """
+        This method will test that adding a given header will return the
+        correct list of recipients using the get_recipients method.
+        """
+        test_mail = 'test_{0}@unknown.com'.format(header)
+        self.set_header(header, test_mail)
+        self.manager.message = self.message
+        returned_recipients = self.manager.get_recipients()
+        self.assertEqual(returned_recipients, [test_mail])
+
+    def test_get_recipients_envelope_to(self):
+        """
+        This method tests that the get_recipients method returns the
+        content of the Envelope-to field if present.
+        """
+        self._test_get_recipients_generic('Envelope-to')
+
+    def test_get_recipients_x_envelope_to(self):
+        """
+        This method tests that the get_recipients method returns the
+        content of the X-Envelope-To field if present.
+        """
+        self._test_get_recipients_generic('X-Envelope-to')
+
+    def test_get_recipients_x_original_to(self):
+        """
+        This method tests that the get_recipients method returns the
+        content of the X-Original-To field if present.
+        """
+        self._test_get_recipients_generic('X-Original-To')
+
+    def test_get_recipients_to_cc_bcc(self):
+        """
+        This method tests that the get_recipients method returns the
+        correct array if the 'Envelope-to', 'X-Envelope-To' and 'X-Original-To'
+        headers are not defined.
+        """
+        to_header = 'to@unknown.com'
+        cc_header = 'cc@unknown.com'
+        bcc_header = 'bcc@unknown.com'
+        self.set_header('To', to_header)
+        self.set_header('Cc', cc_header)
+        self.set_header('Bcc', bcc_header)
+        self.manager.message = self.message
+        returned_recipients = self.manager.get_recipients()
+
+        self.assertEqual(returned_recipients,
+                [to_header, cc_header, bcc_header])
+
+    def test_is_news_recipient(self):
+        is_true = self.manager.is_news_recipient('package_news@unknown.com')
+        is_false = self.manager.is_news_recipient('package_oldies@unknown.com')
+
+        self.assertTrue(is_true)
+        self.assertFalse(is_false)
+
+    @mock.patch('distro_tracker.mail.maildir_manager.MaildirManager.process_mail_error')
+    def test_process_mail_simple_control_command(self, mocked_method):
+        """
+        This method tests that a simple mail coming in the control maildir
+        is processed through the control engine.
+        It checks that the mail has not been processed through the
+        process_mail_error method and has been deleted after process.
+        """
+        msg_id = self.add_email_to_maildir(
+                body='#command\n thanks',
+                headers={'To':DISTRO_TRACKER_CONTROL_EMAIL,},
+                )
+
+        self.assertEqual(self.maildir.__len__(), self.original_mail_count + 1)
+        self.manager.process_mail(mail_key=msg_id)
+
+        self.assertFalse(mocked_method.called)
+        self.assertEqual(self.maildir.__len__(), self.original_mail_count)
+        self.assertEqual(len(mail.outbox), 1)
+        self.assert_header_equal('Subject', 'Re: Commands')
+        self.assert_header_equal('X-Loop', DISTRO_TRACKER_CONTROL_EMAIL)
+        self.assert_header_equal('To', 'John Doe <john.doe@unknown.com>')
+        self.assert_header_equal('From', DISTRO_TRACKER_CONTACT_EMAIL)
+
+
+    def test_process_mail_simple_news_command(self):
+        """
+        This method tests that a simple mail coming in the contact maildir
+        is processed through the news mail engine.
+        """
+        subject = 'Some message'
+        content = 'Some message content'
+        msg_id = self.add_email_to_maildir(
+            body=content,
+            headers={
+                'Subject':subject,
+                'To':'dummy-package_news@' + DISTRO_TRACKER_FQDN,
+                'X-Distro-Tracker-Package':self.package.name,
+                }
+            )
+
+
+        self.manager.process_mail(mail_key=msg_id)
+
+        # A news item is created
+        self.assertEqual(1, News.objects.count())
+        news = News.objects.all()[0]
+        # The title of the news is set correctly.
+        self.assertEqual(subject, news.title)
+        self.assertIn(content, news.content)
+        # The content type is set to render email messages
+        self.assertEqual(news.content_type, 'message/rfc822')
+
+    def test_process_mail_simple_package_command(self):
+        """
+        This method tests that a simple mail coming in the package maildir
+        is processed through the dispatch engine.
+        It also checks that processing utf-8 content is supported.
+        """
+        user_email = 'John Doe <john.doe@unknown.com>'
+        # Subscribing user to package
+        Subscription.objects.create_for(
+            package_name=self.package.name,
+            email=user_email,
+            active=True)
+        # Sending package mail
+        msg_id = self.add_email_to_maildir(
+            body='üößšđžčć한글',
+            headers={
+                'Subject':'Some subject',
+                'From':user_email,
+                'X-Distro-Tracker-Approved':'1',
+                },
+            encoding='utf-8')
+
+        # Processing mail
+        self.manager.process_mail(mail_key=msg_id)
+
+        msg = mail.outbox[0]
+        # No exception thrown trying to get the entire message's content as bytes
+        content = msg.message().as_string()
+        # The content is actually bytes
+        self.assertTrue(isinstance(content, bytes))
+        # Checks that the message is correctly forwarded to subscriber
+        self.assertIn(user_email, (message.to[0] for message in mail.outbox))
+
+    def test_process_mail_bounce_recorded(self):
+        """
+        Tests that a received bounce is recorded.
+        """
+        bounce_address = 'bounces+{date}@{distro_tracker_fqdn}'.format(
+                date=timezone.now().date().strftime('%Y%m%d'),
+                distro_tracker_fqdn=DISTRO_TRACKER_FQDN)
+        dest='user@domain.com'
+
+        Subscription.objects.create_for(
+                package_name='dummy-package',
+                email=dest,
+                active=True)
+        # self.user = EmailUserBounceStats.objects.get(user_email__email=dest)
+        self.user = UserEmailBounceStats.objects.get(email=dest)
+        msg_id = self.add_email_to_maildir(
+                body="Don't care",
+                headers={
+                    'Subject':'bounce',
+                    'Envelope-to':verp.encode(bounce_address, self.user.email)
+                    },
+                )
+
+        # Make sure the user has no prior bounce stats
+        self.assertEqual(self.user.bouncestats_set.count(), 0)
+        self.manager.process_mail(mail_key=msg_id)
+
+        bounce_stats = self.user.bouncestats_set.all()
+        self.assertEqual(bounce_stats.count(), 1)
+        self.assertEqual(bounce_stats[0].date, timezone.now().date())
+        self.assertEqual(bounce_stats[0].mails_bounced, 1)
+        self.assertEqual(self.user.emailsettings.subscription_set.count(), 1)
+
+
+    @mock.patch('distro_tracker.mail.maildir_manager.control_process')
+    def _test_process_mail_control_for_case(self, test_email, mocked_method):
+        """
+        Tests that the process_mail method calls the control process whatever
+        the case of the recipient address is.
+        """
+        msg_id = self.add_email_to_maildir(
+                body='#command\n thanks',
+                headers={'To':test_email,},
+                )
+
+        self.manager.process_mail(mail_key=msg_id)
+
+        self.assertTrue(mocked_method.called)
+
+    def test_process_control_mail_lowercase(self):
+        """
+        Tests that the process_mail method calls the control process when
+        control email is given in lower case.
+        """
+        self._test_process_mail_control_for_case(
+                DISTRO_TRACKER_CONTROL_EMAIL.lower())
+
+    def test_process_control_mail_uppercase(self):
+        """
+        Tests that the process_mail method calls the control process when
+        control email is given in upper case.
+        """
+        self._test_process_mail_control_for_case(
+                DISTRO_TRACKER_CONTROL_EMAIL.upper())
+
+    @mock.patch('distro_tracker.mail.maildir_manager.news_process')
+    def test_case_for_process_mail_news(self, mocked_method):
+        """
+        Tests that a mail from a non standard case for news is still processed
+        through the news mail engine.
+        """
+        subject = 'Some message'
+        content = 'Some message content'
+        suffix = settings.DISTRO_TRACKER_NEWSMAIL_SUFFIXES[0]
+        msg_id = self.add_email_to_maildir(
+            body=content,
+            headers={
+                'Subject':subject,
+                'To':'dummy-package{suffix}@{fqdn}'.format(suffix=suffix.title(),
+                    fqdn=DISTRO_TRACKER_FQDN),
+                'X-Distro-Tracker-Package':self.package.name,
+                }
+            )
+
+
+        self.manager.process_mail(mail_key=msg_id)
+
+        self.assertTrue(mocked_method.called)
diff --git a/distro_tracker/project/settings/defaults.py b/distro_tracker/project/settings/defaults.py
index 1136414..d056055 100644
--- a/distro_tracker/project/settings/defaults.py
+++ b/distro_tracker/project/settings/defaults.py
@@ -366,6 +366,15 @@ DISTRO_TRACKER_MAX_ALLOWED_ERRORS_CONTROL_COMMANDS = 5
 #: The number of days a command confirmation key should be valid.
 DISTRO_TRACKER_CONFIRMATION_EXPIRATION_DAYS = 3
 
+#: The maildir where all the mails are received (control server mails,
+#: package news mails, bounce mails, and other package-related mails)
+DISTRO_TRACKER_MAILDIR_PATH = os.path.join(DISTRO_TRACKER_BASE_PATH, 'data', 'Maildir')
+#: Possible email address suffixes that cause a mail to be processed as a
+#: news mail when using the tracker_maildir_watcher management command
+DISTRO_TRACKER_NEWSMAIL_SUFFIXES = ('_news',)
+#: The maximum number of processes to spawn for fetching mails from the maildir
+DISTRO_TRACKER_MAILDIR_MAX_WORKERS = 10
+
 #: The maximum number of news to include in the news panel of a package page
 DISTRO_TRACKER_NEWS_PANEL_LIMIT = 30
 
-- 
2.0.0

From 77765dea0fe08521c4073254493c5b431fbb7cf4 Mon Sep 17 00:00:00 2001
From: Joseph Herlant <herlantj@gmail.com>
Date: Mon, 14 Jul 2014 01:12:35 +0200
Subject: [PATCH 2/3] Adding a daemon as management command to watch a Maildir

This new management command watches the given Maildir for new mails and
asynchronously processes each mail through a multiprocessing pool.

The size of this pool is capped by a setting to avoid the potential
out-of-memory issues seen when receiving a lot of mails with the
tracker_dispatch, tracker_control and tracker_receive_news management commands.
---
 .../management/commands/tracker_maildir_watcher.py | 110 +++++++++++++++++++++
 .../mail/tests/tests_maildir_management_command.py |  66 +++++++++++++
 2 files changed, 176 insertions(+)
 create mode 100644 distro_tracker/mail/management/commands/tracker_maildir_watcher.py
 create mode 100644 distro_tracker/mail/tests/tests_maildir_management_command.py

diff --git a/distro_tracker/mail/management/commands/tracker_maildir_watcher.py b/distro_tracker/mail/management/commands/tracker_maildir_watcher.py
new file mode 100644
index 0000000..723d39c
--- /dev/null
+++ b/distro_tracker/mail/management/commands/tracker_maildir_watcher.py
@@ -0,0 +1,110 @@
+# Copyright 2014 The Distro Tracker Developers
+# See the COPYRIGHT file at the top-level directory of this distribution and
+# at http://deb.li/DTAuthors
+#
+# This file is part of Distro Tracker. It is subject to the license terms
+# in the LICENSE file found in the top-level directory of this
+# distribution and at http://deb.li/DTLicense. No part of Distro Tracker,
+# including this file, may be copied, modified, propagated, or distributed
+# except according to the terms contained in the LICENSE file.
+"""
+Implements the management command which will watch the given maildir for
+incoming mails and spawn processes that will handle the new mails.
+This command will be used as a daemon which will manage the mails received in
+the DISTRO_TRACKER_MAILDIR_PATH destination.
+
+This process ensures that the number of workers handling the maildir never
+exceeds the configured DISTRO_TRACKER_MAILDIR_MAX_WORKERS value.
+"""
+from django.conf import settings
+from django.core.management.base import BaseCommand
+
+from distro_tracker.mail.maildir_manager import MaildirManager
+
+import logging
+from mailbox import Maildir
+from multiprocessing import Pool
+import os
+import pyinotify
+import signal
+import sys
+
+logger = logging.getLogger(__name__)
+
+DISTRO_TRACKER_BASE_PATH = settings.DISTRO_TRACKER_BASE_PATH
+DISTRO_TRACKER_MAILDIR_PATH = settings.DISTRO_TRACKER_MAILDIR_PATH
+MAILDIR_MAX_WORKERS = settings.DISTRO_TRACKER_MAILDIR_MAX_WORKERS
+
+class Command(BaseCommand):
+    """
+    A Django management command used to invoke the maildir manager daemon.
+    """
+
+    def handle(self, *args, **kwargs):
+        logger.info('Starting maildir watcher daemon')
+        # Initialize an instance of the class that will handle the
+        # management of the workers
+        handler = WorkersManager()
+        for sig in [signal.SIGTERM, signal.SIGINT, signal.SIGQUIT]:
+            signal.signal(sig, handler.handle_sigterm)
+ 
+        logger.info('Processing existing mails')
+        mdir_path_new = os.path.join(DISTRO_TRACKER_MAILDIR_PATH, 'new')
+        mdir = Maildir(DISTRO_TRACKER_MAILDIR_PATH)
+        for f in os.listdir(mdir_path_new):
+            # Don't manage subdirectories
+            if os.path.isfile(os.path.join(mdir_path_new, f)):
+                handler.feed_worker(f)
+
+        logger.info('Launching Maildir watcher')
+        wm = pyinotify.WatchManager()
+        notifier = pyinotify.Notifier(wm, default_proc_fun=handler)
+        mask = pyinotify.IN_MOVED_TO | pyinotify.IN_CREATE
+        wm.add_watch(mdir_path_new, mask)
+        notifier.loop()
+
+
+class WorkersManager(pyinotify.ProcessEvent):
+    """
+    This class manages the behavior of the daemon when a new mail arrives in
+    the watched Maildir.
+    """
+
+    def my_init(self):
+        """
+        Standard constructor addon for pyinotify ProcessEvent class.
+        """
+        # Initializing a multiprocessing Pool of workers
+        self.workers = Pool(processes=MAILDIR_MAX_WORKERS)
+
+    def handle_sigterm(self, signum=None, frame=None):
+        """
+        Method that manages the closing of the queues while catching SIG*.
+        """
+        logger.info("Closing process pool due to {0} signal.".format(signum))
+        self.workers.close()
+        self.workers.join()
+        sys.exit(0)
+
+    def process_default(self, event):
+        """
+        Trigger function for the default events in pyinotify. Used here to
+        handle both IN_MOVED_TO and IN_CREATE events.
+        """
+        logger.debug("Notification for {0} for {1}".format(
+            event.maskname, event.name))
+        self.feed_worker(event.pathname)
+
+    def feed_worker(self, file_name):
+        """
+        Asks the process pool to process the given file asynchronously.
+        """
+        self.workers.apply_async(worker_main, [file_name])
+
+def worker_main(file_name):
+    """
+    Function used by the processes of the pool to launch the processing of mails.
+    """
+    mgr = MaildirManager()
+    mgr.process_mail(mail_file_name=file_name)
+
diff --git a/distro_tracker/mail/tests/tests_maildir_management_command.py b/distro_tracker/mail/tests/tests_maildir_management_command.py
new file mode 100644
index 0000000..b32f887
--- /dev/null
+++ b/distro_tracker/mail/tests/tests_maildir_management_command.py
@@ -0,0 +1,66 @@
+# Copyright 2014 The Distro Tracker Developers
+# See the COPYRIGHT file at the top-level directory of this distribution and
+# at http://deb.li/DTAuthors
+#
+# This file is part of Distro Tracker. It is subject to the license terms
+# in the LICENSE file found in the top-level directory of this
+# distribution and at http://deb.li/DTLicense. No part of Distro Tracker,
+# including this file, may be copied, modified, propagated, or distributed
+# except according to the terms contained in the LICENSE file.
+"""
+Tests for :mod:`distro_tracker.mail.management.commands.tracker_maildir_watcher`
+"""
+
+from django.conf import settings
+from django.utils.six.moves import mock
+from django.test import SimpleTestCase
+
+from distro_tracker.mail.management.commands.tracker_maildir_watcher import (
+                Command as MaildirWatcherCommand)
+from distro_tracker.mail.management.commands.tracker_maildir_watcher import WorkersManager
+
+from mailbox import Maildir
+from email.message import Message
+from email.utils import make_msgid
+
+class MaildirWatcherCommandTest(SimpleTestCase):
+    """
+    Tests for the tracker_maildir_watcher management command:
+    :mod:`distro_tracker.mail.management.commands.tracker_maildir_watcher`
+    """
+    def test_handle_sets_notifier(self):
+        """
+        This function tests that the handle function sets a pyinotify trigger
+        on call and that the notifier will launch the watch process.
+        """
+        with mock.patch('pyinotify.WatchManager.add_watch') as mock_watcher:
+            with mock.patch('pyinotify.Notifier.loop') as mock_notifier:
+                cmd = MaildirWatcherCommand()
+                cmd.handle()
+
+                self.assertTrue(mock_watcher.called)
+                self.assertTrue(mock_notifier.called)
+
+    def test_handle_calls_feed_notifier_for_new_mails(self):
+        """
+        This function tests that when the daemon is started with new mails in
+        the maildir, the feed_worker method is called.
+        """
+        mdir = Maildir(settings.DISTRO_TRACKER_MAILDIR_PATH)
+        msg = Message()
+        msg.add_header('From', 'John Doe <john.doe@unknown.com>')
+        msg.add_header('To', 'dont.care@unknown.com')
+        msgid = make_msgid()
+        msg.add_header('Message-ID', msgid)
+        key = mdir.add(msg)
+
+        with mock.patch('distro_tracker.mail.management.commands.tracker_maildir_watcher.WorkersManager.feed_worker') as mock_worker:
+            # This function needs to be mocked to avoid starting the daemon
+            with mock.patch('pyinotify.Notifier.loop') as mock_notifier:
+                cmd = MaildirWatcherCommand()
+                cmd.handle()
+
+                self.assertTrue(mock_worker.called)
+
+        mdir.discard(key)  # discard() expects the key returned by add()
+
-- 
2.0.0

From 0ee30f0246f0f4988209a4b85b91b954091f6ff7 Mon Sep 17 00:00:00 2001
From: Joseph Herlant <herlantj@gmail.com>
Date: Mon, 14 Jul 2014 01:13:27 +0200
Subject: [PATCH 3/3] Documentation about the new Maildir feature of the
 mailbot

The exim4 configuration has been done and tested, but the postfix configuration
still needs to be tested and completed.
---
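
Reviewer note (text placed after the "---" marker is not part of the commit):
the documentation below states that the daemon checks the 'Envelope-to',
'X-Envelope-To' and 'X-Original-To' headers first when looking for the
recipient, and the settings patch introduces DISTRO_TRACKER_NEWSMAIL_SUFFIXES
for recognising news addresses. The following is only a rough sketch of how
such a lookup could work, not the MaildirManager implementation; the names
RECIPIENT_HEADERS, find_recipient and is_news_address are hypothetical, and
the example addresses are made up.

```python
from email import message_from_string
from email.utils import parseaddr

# Header order taken from the documentation; the constant name is hypothetical.
RECIPIENT_HEADERS = ('Envelope-to', 'X-Envelope-To', 'X-Original-To')
# Mirrors the default value of DISTRO_TRACKER_NEWSMAIL_SUFFIXES.
NEWSMAIL_SUFFIXES = ('_news',)


def find_recipient(msg):
    """Return the first address found in the envelope headers, or None."""
    for header in RECIPIENT_HEADERS:
        value = msg.get(header)  # header lookup is case-insensitive
        if value:
            address = parseaddr(value)[1]
            if address:
                return address
    return None


def is_news_address(address, suffixes=NEWSMAIL_SUFFIXES):
    """Tell whether the local part of the address ends with a news suffix."""
    local_part = address.split('@', 1)[0]
    return local_part.endswith(tuple(suffixes))


sample = message_from_string(
    "Envelope-to: dpkg_news@tracker.example.org\n"
    "To: Some List <list@example.org>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
recipient = find_recipient(sample)
```

Here the 'To' header is deliberately ignored, which is why envelope_to_add is
useful for mails where the tracker address only appears in cc or bcc.
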
 docs/setup/mailbot.rst | 122 +++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 119 insertions(+), 3 deletions(-)

diff --git a/docs/setup/mailbot.rst b/docs/setup/mailbot.rst
index 44bcbc6..b1c0e45 100644
--- a/docs/setup/mailbot.rst
+++ b/docs/setup/mailbot.rst
@@ -27,9 +27,125 @@ choosing. You should modify the following values in
 .. note::
 
    These emails are allowed to be on different domains.
+
+The next step, if you are using the mail management via generic Maildir, is
+to modify the following values in ``distro_tracker/project/settings/local.py``:
+
+* DISTRO_TRACKER_MAILDIR_PATH
+
+  The maildir where the control, package news and other package-related mails
+  are received.
+
+
+Mail management via a Maildir (recommended)
+-------------------------------------------
+
+In order to have the mails from the Maildir properly processed, the
+``tracker_maildir_watcher`` daemon should be started using the following
+management command:
+:mod:`tracker_maildir_watcher <distro_tracker.mail.management.commands.tracker_maildir_watcher>`
+
+This command handles the mails arriving in the `new` subdirectory of the
+Maildir directory previously defined. If this directory does not exist, the
+Maildir will not be created and the daemon will not work properly.
+
+.. note::
+
+  If you go ahead with this mail management method, you should ensure that
+  the daemon has rights to read and write on the configured Maildir.
+
+Prerequisites
+^^^^^^^^^^^^^
+
+This module requires the ``pyinotify`` and ``multiprocessing`` modules to run.
+
+Exim4 configuration
+^^^^^^^^^^^^^^^^^^^
+
+Mails received by the ``DISTRO_TRACKER_CONTROL_EMAIL``, the bounce, news and
+other package-related mails should be redirected to the
+``DISTRO_TRACKER_MAILDIR_PATH``. To configure this, you can use this router and
+transport as a simple example::
+
+  distro_tracker_router:
+    debug_print = "R: Distro Tracker catchall router for $local_part@$domain"
+    driver = accept
+    transport = distro_tracker_transport
+
+  distro_tracker_transport:
+    debug_print = "T: Distro Tracker transport for the catchall Maildir"
+    driver = appendfile
+    directory = /home/dtracker/distro-tracker/data/Maildir
+    user = dtracker
+    group = mail
+    create_directory
+    envelope_to_add
+    maildir_format
+    directory_mode = 0700
+    mode_fail_narrower = false
+
+.. note::
+
+   The router should be placed last in the routers section of exim
+   configuration file.
+
+   It is advisable to use the envelope_to_add option to ensure that the real
+   recipient of the mail (even if it's cc or bcc) is correctly identified
+   by the daemon. The fields 'Envelope-to', 'X-Envelope-To' and 'X-Original-To'
+   will be the first to be checked by the daemon when looking for the
+   recipient's email.
+
+
+Postfix configuration
+^^^^^^^^^^^^^^^^^^^^^
+
+.. note::
+
+  This configuration is still to be defined. It would be advisable to have the
+  'X-Original-To' in the final headers list (should be added automatically by
+  postfix, but it's still to be verified). The following configuration is an
+  untested draft that needs to be completed to include the redirection of the
+  catch-all to a maildir.
+
+Assuming the following configuration::
+
+   DISTRO_TRACKER_CONTACT_EMAIL = owner@distro_tracker.debian.net
+   DISTRO_TRACKER_CONTROL_EMAIL = control@distro_tracker.debian.net
+   DISTRO_TRACKER_FQDN = distro_tracker.debian.net
+
+The file ``/etc/postfix/virtual`` would be::
+
+  distro_tracker.debian.net not-important-ignored
+  postmaster@distro_tracker.debian.net postmaster@localhost
+  owner@distro_tracker.debian.net dtracker-owner@localhost
+  # Catchall for package emails
+  @distro_tracker.debian.net dtracker-dispatch@localhost
+
+The ``/etc/aliases`` file should then include the following lines::
   
+  dtracker-owner: some-admin-user
+
+Then, the ``main.cf`` file should be edited to include::
+
+  virtual_alias_maps = hash:/etc/postfix/virtual
+
+.. note::
+
+   Be sure to run ``newaliases`` and ``postmap`` after editing ``/etc/aliases``
+   and ``/etc/postfix/virtual``.
+
+This way, all messages sent to the owner are delivered to the local user
+``some-admin-user``. Messages sent to the control address, messages which
+should be turned into news items and messages sent to any other address on the
+given domain should be redirected to a single maildir that would be handled by
+the daemon.
+
+
+Mail management through pipes (deprecated)
+------------------------------------------
+
 Management commands
--------------------
+^^^^^^^^^^^^^^^^^^^
 
 In order to have the received email messages properly processed they need to
 be passed to the following management commands:
@@ -44,7 +160,7 @@ means that the system's MTA needs to be setup to forward appropriate mails to
 the appropriate command.
 
 Exim4
------
+^^^^^
 
 Mails received to ``DISTRO_TRACKER_CONTROL_EMAIL`` address should be piped to the
 ``control_process`` command. A way to set this up in Exim would be to create a
@@ -96,7 +212,7 @@ are not recognized. Such router and transport could be::
    This router should be placed last in the exim configuration file.
 
 Postfix
--------
+^^^^^^^
 
 To configure Postfix to forward email messages to appropriate commands you need
 to first create a file with virtual aliases for the relevant email addresses.
-- 
2.0.0
