Legitimate Emails Being Dropped by Spamassassin in RHEL5

Over the past few months, an increasing number of customers have complained that their otherwise OK spam filters have started dropping an inordinate number of legitimate emails. The first reaction is of course to raise the score required for a message to be filtered, but that just lets more spam through. I looked in the quarantine on one of these servers and ran a few of the legitimate emails through spamassassin in debug mode. One particular rule fired on the vast majority of them. Here’s an example:

...
[2162] dbg: learn: initializing learner
[2162] dbg: check: is spam? score=4.004 required=6
[2162] dbg: check: tests=FH_DATE_PAST_20XX,HTML_MESSAGE,SPF_HELO_PASS
...

A score of 4 is obviously quite high for an email whose only flaw is being in HTML. But FH_DATE_PAST_20XX caught my eye in all of the outputs. So, on to the rule files:

$ grep FH_DATE_PAST_20XX /usr/share/spamassassin/72_active.cf
##{ FH_DATE_PAST_20XX
header   FH_DATE_PAST_20XX      Date =~ /20[1-9][0-9]/ [if-unset: 2006]
describe FH_DATE_PAST_20XX      The date is grossly in the future.
##} FH_DATE_PAST_20XX
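
The pattern is easy to try outside SpamAssassin. This is a minimal sketch, assuming grep -E treats this simple character-class pattern the same way as the Perl regex in the rule. The rule_matches helper and the two sample Date values are made up for illustration:

```shell
# Pattern copied from the FH_DATE_PAST_20XX rule.
pattern='20[1-9][0-9]'

# Hypothetical helper: succeeds if the given Date value matches the pattern.
rule_matches() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

# Made-up sample dates: the last day of 2009 and the day of this post.
for d in 'Thu, 31 Dec 2009 23:59:59 +0100' 'Wed, 26 May 2010 10:00:00 +0200'; do
    if rule_matches "$d"; then
        echo "match:    $d"
    else
        echo "no match: $d"
    fi
done
```

The 2009 date does not match, while the perfectly ordinary 2010 date does.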

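Aha, this is the problem: the pattern matches any year from 2010 to 2099, so every correctly dated email sent this year trips a rule meant to catch dates far in the future. With 50_scores.cf containing this: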

$ grep FH_DATE_PAST /usr/share/spamassassin/50_scores.cf
score FH_DATE_PAST_20XX 2.075 3.384 3.554 3.188 # n=2

it’s no wonder emails are getting dropped! I guess this is the kind of problem one can expect when running a distribution with packages 6 years old and neglecting to update the rules frequently (or at least every once in a while)!
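
Where the rules cannot be updated right away, one possible stopgap is to override the score locally, since SpamAssassin does not run a rule whose score is 0. This is only a sketch: the disable_rule helper is hypothetical, and /etc/mail/spamassassin/local.cf is assumed to be the site-wide config file, which may differ on your install.

```shell
# Hypothetical helper: append a zero score for a rule to a config file.
# A score of 0 tells SpamAssassin not to run the rule at all.
disable_rule() {
    # $1: rule name, $2: config file to append to
    printf 'score %s 0\n' "$1" >> "$2"
}

# Example call (commented out so nothing is changed by accident).
# The path is an assumption; adjust it to your setup.
# disable_rule FH_DATE_PAST_20XX /etc/mail/spamassassin/local.cf
# spamassassin --lint   # check that the configuration still parses
```

Whatever daemon is doing the filtering will probably need a restart or reload before it picks up the change.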

Luckily, this rule is gone altogether from RHEL6’s version of spamassassin.

May 26th, 2010