Re: Parallelizing fetchmail

To: Daniele Cortesi <dan@linux.it>
Cc: debian-user@lists.debian.org
Subject: Re: Parallelizing fetchmail
From: Rich Johnson <rjohnson@dogstar-interactive.com>
Date: Sun, 21 May 2006 14:22:29 -0400
Message-id: <B505090D-9EC1-49E2-8181-5EA05EE17172@dogstar-interactive.com>
In-reply-to: <20060521165244.GD9156@smtp.tiscali.it>
References: <20060521165244.GD9156@smtp.tiscali.it>


On May 21, 2006, at 12:52 PM, Daniele Cortesi wrote:

Hello *,
 I recently uninstalled exim on my home pc, replacing it with esmtp
for outbound mail and fetchmail->procmail for inbound traffic.

Procmail checks every message for spam and viruses, introducing some
seconds of latency, mainly because of DNSRBL checks of spamc.

The disadvantage of this is that fetchmail launches only one procmail
for each message and waits for it to terminate. This leads to a very long
delay when downloading many messages.

I can check more than one message in parallel with spamd (making it create
more children), but I cannot find a configuration that gets faster with
more spamd children. The bottleneck is always fetchmail, which processes
every message one by one.

Have you got any ideas about how to insert a queue in the chain?

I can replace procmail with maildrop or similar if necessary. Please
avoid solutions like "re-install exim" or "install <insert your
favourite mta here>".


How much of a delay are you experiencing?
Are these messages all coming through one popbox?

Others may have better info, but I don't think you can run fetchmail in parallel--at least not more than one process per user.

From "man fetchmail"

    Only one daemon process is permitted per user; in daemon mode, fetchmail makes a per-user lockfile to guarantee this.


I do have two ideas though (N.B. you may substitute IMAP for POP):

A: Set up multiple virtual users "fetchm_1, ..., fetchm_n" all fetching from the same popbox and run a daemon for each of them. I'd be careful though--having multiple processes writing to the same mbox files is probably asking for trouble.

B: Use intermediate popboxes as queues--essentially establishing a multi-stage dataflow:
   1. fetchmail/procmail to distribute incoming messages to multiple local popboxes (mailq_1, ..., mailq_n)
   2. n fetchmail/spamc daemons, one per popbox, to filter spam (mailq_i -> mailq_nospam_i). These will run in parallel.
   3. 1 fetchmail/procmail daemon to collect and redistribute the messages.
   This approach will require a pop server on your local machine as well as virtual users for each of the mailq daemons.
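
To make the dataflow in (B) concrete, here is a rough in-process sketch in Python. It is only an analogy: the popboxes become queue.Queue objects, the fetchmail/spamc daemons become threads, and check_spam() is a made-up stand-in for the slow spamc call (it is not the real spamc interface). The names run_pipeline and n_workers are likewise just for illustration.

```python
import queue
import threading
import time


def check_spam(message):
    """Hypothetical stand-in for a slow spam check (assumption, not spamc)."""
    time.sleep(0.05)  # pretend we are waiting on DNSRBL lookups
    return "viagra" in message.lower()


def run_pipeline(messages, n_workers=4):
    """Deal messages to n_workers queues, filter in parallel, then collect."""
    stage_queues = [queue.Queue() for _ in range(n_workers)]  # mailq_1..mailq_n
    results = queue.Queue()                                   # filtered output

    def worker(q):
        # Stage 2: one filter per queue; these run in parallel.
        while True:
            item = q.get()
            if item is None:  # sentinel: no more mail for this queue
                break
            index, message = item
            results.put((index, message, check_spam(message)))

    threads = [threading.Thread(target=worker, args=(q,)) for q in stage_queues]
    for t in threads:
        t.start()

    # Stage 1: a dumb dealer hands messages out round-robin.
    for index, message in enumerate(messages):
        stage_queues[index % n_workers].put((index, message))
    for q in stage_queues:
        q.put(None)

    for t in threads:
        t.join()

    # Stage 3: collect everything and restore the original order.
    collected = []
    while not results.empty():
        collected.append(results.get())
    collected.sort()
    return [(message, is_spam) for _, message, is_spam in collected]


if __name__ == "__main__":
    for message, is_spam in run_pipeline(["hello", "Buy VIAGRA now", "lunch?"]):
        print("SPAM" if is_spam else "ok  ", message)
```

With n_workers filters the total wait should drop to roughly 1/n of the serial time, as long as the delay really is latency (DNS lookups) rather than CPU.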


I kinda like (B) because:
 - The queues are explicit.
 - Distribution can be configured as either a dumb dealer or a subject/priority sorter.
 - The spam filtering can be scaled, or off-loaded to another machine.
 - Distribution and collection processes are disjoint. They _could_ be performed by a single fetchmail daemon.
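
For what I mean by "dumb dealer" versus "subject/priority sorter", here is a small Python sketch. Both functions take a raw message and return the number of the queue it should go to. The function names, the urgent_words list, and the convention that queue 0 is the priority queue are all assumptions of mine for illustration; in practice this logic would live in procmail recipes.

```python
import email
from itertools import count


def make_dealer(n_queues):
    """Dumb dealer: ignore the message and hand out queues round-robin."""
    counter = count()

    def deal(raw_message):
        return next(counter) % n_queues

    return deal


def make_subject_sorter(n_queues, urgent_words=("urgent", "important")):
    """Sorter: urgent subjects go to queue 0, the rest are dealt to 1..n-1.

    Assumes n_queues >= 2 so that there is at least one non-priority queue.
    """
    fallback = make_dealer(n_queues - 1)

    def sort(raw_message):
        subject = email.message_from_string(raw_message).get("Subject", "")
        if any(word in subject.lower() for word in urgent_words):
            return 0
        return 1 + fallback(raw_message)

    return sort


if __name__ == "__main__":
    sorter = make_subject_sorter(3)
    print(sorter("Subject: URGENT: server down\n\nhelp"))
    print(sorter("Subject: lunch?\n\nnoon?"))
```

Either one could be swapped into stage 1 without touching the filtering or collection stages, which is the point of keeping the queues explicit.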


Of course these are just theoretical ruminations....do you feel lucky?

--rich
