
Re: Spam in bugs database - automatic removal?



On Thu, Feb 13, 2003 at 11:50:19AM -0500, H. S. Teoh wrote:
> On Thu, Feb 13, 2003 at 02:30:18PM +0000, Colin Watson wrote:
> > reduced. For example, we recently dropped the SA threshold to 4 in
> > response to a combination of Santiago's bug report and an onslaught of
> > spam last night which scored between 4 and 5.
> 
> It might make sense to use an adaptive spam filter (like bogofilter) in
> combination with SA.

Unfortunately there's no easy mechanism to get such adaptive feedback
back to bugs.debian.org, and it's very likely that a large percentage of
people receiving bug mail wouldn't send feedback even if there were such
a mechanism (after all, many of them are inactive). From what I can
tell, bogofilter-like systems are dependent on good feedback that covers
at least a representative cross-section of the mail that passes through
them, and I don't think we can reasonably achieve that.

> However, I found that score 4 also introduces too many false
> positives, so I spent a long time tweaking scores and creating custom
> rules.

bugs.debian.org has certain advantages here in that valid mail tends to
follow fixed formats. We've scored down /^Package:/ and X-Debbugs-Cc:,
and Bug# is already scored down by SpamAssassin.
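
To illustrate the idea of scoring down mail that follows the expected
formats, here is a rough Python sketch. It is not the actual
bugs.debian.org configuration: the adjustment values, the function names,
and the choice to look for Package: at the start of a body line are all
assumptions made for the example. Only the threshold of 4 and the two
patterns come from the discussion above.

```python
import re
from email import message_from_string

# Threshold mentioned in the thread.
THRESHOLD = 4.0

# Hypothetical adjustments; the real values are not given in the thread.
PACKAGE_ADJUST = -2.0
DEBBUGS_CC_ADJUST = -1.0

# Assumption: "Package:" is matched at the start of any line in the body.
PACKAGE_LINE = re.compile(r"^Package:", re.MULTILINE)


def adjusted_score(raw_message: str, base_score: float) -> float:
    """Lower base_score for mail that looks like a well-formed bug report."""
    msg = message_from_string(raw_message)
    score = base_score

    body = msg.get_payload()
    # get_payload() returns a list for multipart mail; this sketch only
    # inspects simple single-part messages.
    if isinstance(body, str) and PACKAGE_LINE.search(body):
        score += PACKAGE_ADJUST

    # Header lookup is case-insensitive and returns None when absent.
    if msg["X-Debbugs-Cc"] is not None:
        score += DEBBUGS_CC_ADJUST

    return score


def is_spam(raw_message: str, base_score: float) -> bool:
    # Assumption: mail at or above the threshold is treated as spam.
    return adjusted_score(raw_message, base_score) >= THRESHOLD
```

With these made-up numbers, a report that would otherwise score 4.5 but
carries both a Package: line and an X-Debbugs-Cc: header drops to 1.5 and
passes, while a message with neither stays at 4.5 and is caught.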

-- 
Colin Watson                                  [cjwatson@flatline.org.uk]


