
Re: IMAP server to fit this bill?



On Fri, Mar 19, 2004 at 11:22:47AM -0800, Steve Lamb wrote:
> Dave Carrigan wrote:
> >As for putting extra headers into a message, I'm not sure why you think
> >this is a problem. That's what headers are for -- to convey
> >meta-information about a message.
> 
>     Because forwarded messages are not the same as the original message.
> If the person forwards it as a MIME attachment, for example, the
> (re)learned message contains a completely different set of headers as well
> as a slew of unrelated MIME encapsulation data.  If it is bounced properly
> the bounce headers are learned as either ham or spam.  This extraneous
> information can lead to false positives or negatives.

Yes, but my understanding of DSPAM is that it doesn't retrain on the
forwarded message. It saves the list of tokens that the original message
generated, linked to a unique key. Then when someone forwards a message
to be retrained, it extracts the key from the forwarded message, and
changes the weights of the tokens associated with that key. This means
you can bounce, forward as mime, forward inline, or just send the key
alone and DSPAM will do the right thing.
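
A minimal sketch of the key-based retraining idea described above, under
stated assumptions. It is not DSPAM's actual code or storage format: the
class name, the X-Example-Signature header, and the simple per-token
counters are all hypothetical, chosen only to show why the wrapper around
a forwarded message never reaches the classifier.

```python
import re
import uuid

# Hypothetical header name; the real filter's key format may differ.
KEY_RE = re.compile(r"X-Example-Signature:\s*([0-9a-f]{32})")


class SignatureStore:
    """Toy token store that retrains by key, not by re-reading the mail."""

    def __init__(self):
        self.tokens_by_key = {}  # key -> (tokens, currently_counted_as_spam)
        self.spam_counts = {}
        self.ham_counts = {}

    def classify_and_record(self, body, is_spam):
        """Tokenize the ORIGINAL message once and remember its tokens."""
        tokens = sorted(set(re.findall(r"[a-z0-9']+", body.lower())))
        key = uuid.uuid4().hex
        self.tokens_by_key[key] = (tokens, is_spam)
        counts = self.spam_counts if is_spam else self.ham_counts
        for tok in tokens:
            counts[tok] = counts.get(tok, 0) + 1
        return key

    def retrain(self, forwarded_text):
        """Find the key anywhere in the forwarded text and flip the verdict.

        Only the tokens saved under the key are touched; the headers and
        MIME wrapper of the forwarded copy are never tokenized.
        """
        match = KEY_RE.search(forwarded_text)
        if match is None or match.group(1) not in self.tokens_by_key:
            return False
        key = match.group(1)
        tokens, was_spam = self.tokens_by_key[key]
        old = self.spam_counts if was_spam else self.ham_counts
        new = self.ham_counts if was_spam else self.spam_counts
        for tok in tokens:
            old[tok] -= 1
            if old[tok] == 0:
                del old[tok]
            new[tok] = new.get(tok, 0) + 1
        self.tokens_by_key[key] = (tokens, not was_spam)
        return True
```

Because retrain() only searches for the key, it behaves the same whether
the message was bounced, forwarded inline, or forwarded as an attachment.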

> >>It introduces statistics which are meaningless in the final analysis.
> 
> >Not sure what this means.
> 
>     What this means is that even if the ham and spam corpus got the same
> number of meaningless statistics to render forwarded/bounced message
> headers/data as "undefined" and therefore not used it is still data that is
> being taken up in the classifier's DB.

Which is 100% true based on your initial assumptions. However, if your
initial assumptions are false, then this is also false. Your initial
assumption is false.

> Granted they can filter on their end but the whole point is that they don't 
> download it.  Filtering comes after downloading.  

This is another mistaken assumption of yours. Filtering does not require
downloading. The whole reason I use server-side spam testing and IMAP is
because I telecommute and do not want to download 100 messages when 99
of them are going to be spam. DSPAM adds a special header to the message
that identifies it as spam, and Cyrus delivers all messages with that
header into a different mailbox than my INBOX. I examine that mailbox
periodically, and after a cursory scan for false positives, I delete
everything else, without downloading anything other than a few message
headers.
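
The delivery rule described above can be sketched roughly as follows. On
Cyrus this kind of rule would normally be written as a Sieve script; the
Python here only illustrates the logic. The header name X-DSPAM-Result,
the value "Spam", and the folder name INBOX.spam are assumptions, not
details taken from this message.

```python
from email import message_from_string


def choose_mailbox(raw_message,
                   spam_header="X-DSPAM-Result",  # assumed header name
                   spam_folder="INBOX.spam"):     # assumed folder name
    """Pick a delivery folder from the filter's verdict header.

    Only the headers are inspected, which is why the decision can be
    made on the server before the client downloads anything.
    """
    msg = message_from_string(raw_message)
    verdict = (msg.get(spam_header) or "").strip().lower()
    return spam_folder if verdict == "spam" else "INBOX"
```

For example, a message carrying "X-DSPAM-Result: Spam" would go to
INBOX.spam, and a message without that header would go to INBOX.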

>     I did mention elmo as well.  I am not familiar with mutt's IMAP 
> implementation and I'd be willing to wager that it isn't up to par given 
> the preponderance of things mutt does wrong as well as how often most 
> clients get IMAP wrong.

Given the number of mistaken assumptions you've made so far, I'll wager
that this one is mistaken as well.

-- 
Dave Carrigan
Seattle, WA, USA
dave@rudedog.org | http://www.rudedog.org/ | ICQ:161669680
UNIX-Apache-Perl-Linux-Firewalls-LDAP-C-C++-DNS-PalmOS-PostgreSQL-MySQL

Dave is currently listening to Mary-Chapin Carpenter - Come On Come On (Come On Come On)
