
What you need in an email spam filter

I now have management responsibility for five different products with email filtering capability. For my primary network, I use three incoming spam filters, configured in sequence. UTM is last because it has the weakest capabilities. I did not buy UTM for its email filtering, and you probably did not either. Email filtering was probably not even high on the evaluation criteria when Sophos bought Astaro.

Sophos has, by my count, seven primary email filtering products: two embedded (PureMessage for Unix and for Exchange), three appliances (UTM, XG, Sophos Email Appliance), and two cloud products (Reflexion and Sophos Email Security). Sophos Email Security is currently their flagship product, and probably the best. I have also tried to evaluate a bunch of its competitors. I am pretty unhappy with the state of this industry.

First, the sales cycle tends to make you believe that if you buy the right product, you can stop looking at your mail stream, but this is wrong. Vendors can potentially identify malware, but they cannot identify your unwanted mail stream. For example, we do not want to allow recruiters to use our mail system to lure away our key employees. We also want to block the avalanche of irrelevant advertising, even if it is technically harmless. Our list of unwanted senders is different from yours, so this is not a job that can be delegated to your vendor's army of experts.

To evaluate your mail stream, you need at least this much identity information about each message: Source IP, Reverse DNS, Helo/Ehlo host name, Envelope-From address, Message-From address, Envelope-To, possibly Message-To, disposition, and reason for disposition. Here are some real-world examples, sanitized:
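
To make that list concrete, here is a minimal sketch of one log row as a Python record. The field names, the helper functions, and the sample values are my own illustration (all addresses use reserved example ranges and domains); they are not taken from any particular product's log format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MessageIdentity:
    """One log row: the identity facts needed to judge a message."""
    source_ip: str
    reverse_dns: Optional[str]   # None when the IP has no PTR record
    helo_name: str               # HELO/EHLO host name
    envelope_from: str           # SMTP MAIL FROM
    message_from: str            # From: header
    envelope_to: str             # SMTP RCPT TO
    message_to: Optional[str]    # To: header, may be missing
    disposition: str             # e.g. "delivered", "quarantined", "blocked"
    reason: str                  # which rule or test decided the disposition


def domain_of(address: str) -> str:
    """Return the lower-cased domain part of an email address."""
    return address.rsplit("@", 1)[-1].strip().lower()


def from_domains_differ(msg: MessageIdentity) -> bool:
    """True when the envelope sender and header sender use different domains."""
    return domain_of(msg.envelope_from) != domain_of(msg.message_from)


# Hypothetical sample row, sanitized.
sample = MessageIdentity(
    source_ip="203.0.113.10",
    reverse_dns="mail.bulkmailer.example",
    helo_name="mail.bulkmailer.example",
    envelope_from="bounce-123@bulkmailer.example",
    message_from="offers@Shop.Example",
    envelope_to="alice@ourcompany.example",
    message_to="alice@ourcompany.example",
    disposition="quarantined",
    reason="mass mailer, unknown client",
)
```

Keeping both sender addresses in the same row is what makes it possible to spot messages where the Envelope-From and Message-From domains disagree.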

Case #1

Example.com has moved their mail to Outlook.com, but has not yet updated their SPF record. I want to whitelist messages where (a) the Envelope-From is @example.com, (b) the Message-From is @example.com, (c) the Reverse DNS name ends with ".outlook.com", and (d) the Reverse DNS name can be forward-confirmed to the Source IP. The forward-confirm test is needed to protect against the possibility of a fraudulent Reverse DNS entry (rare but possible). Of course, I can only know that outlook.com servers will always forward-confirm because this test is performed and logged on every message, for both the Reverse DNS name and the Helo name.
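
The four conditions above can be sketched as follows. This is only an illustration under stated assumptions: the function names are mine, the forward lookup uses Python's standard `socket.getaddrinfo`, and the host name and IP address in the usage lines are made up. A real filter would take the Reverse DNS name from its own PTR lookup rather than from a parameter.

```python
import ipaddress
import socket
from typing import Callable, List


def resolve_addresses(hostname: str) -> List[str]:
    """Forward lookup: every address the host name resolves to (empty on failure)."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except (socket.gaierror, UnicodeError):
        return []
    return sorted({info[4][0] for info in infos})


def forward_confirmed(source_ip: str, reverse_name: str,
                      resolver: Callable[[str], List[str]] = resolve_addresses) -> bool:
    """True when the reverse DNS name resolves back to the connecting IP."""
    try:
        wanted = ipaddress.ip_address(source_ip)
    except ValueError:
        return False
    for candidate in resolver(reverse_name):
        try:
            if ipaddress.ip_address(candidate) == wanted:
                return True
        except ValueError:
            continue  # ignore anything that is not a plain IP address
    return False


def domain_of(address: str) -> str:
    return address.rsplit("@", 1)[-1].strip().lower()


def case1_allow(source_ip: str, reverse_name: str, envelope_from: str,
                message_from: str,
                resolver: Callable[[str], List[str]] = resolve_addresses) -> bool:
    """All four factors must hold before filtering is relaxed."""
    name = reverse_name.rstrip(".").lower()
    return (domain_of(envelope_from) == "example.com"        # (a)
            and domain_of(message_from) == "example.com"     # (b)
            and name.endswith(".outlook.com")                # (c)
            and forward_confirmed(source_ip, name, resolver))  # (d)


# Usage with a stand-in resolver so the example needs no network access.
# The host name and address below are hypothetical.
def fake_resolver(hostname: str) -> List[str]:
    table = {"mail-a.outbound.protection.outlook.com": ["203.0.113.10"]}
    return table.get(hostname, [])


allowed = case1_allow("203.0.113.10", "mail-a.outbound.protection.outlook.com",
                      "bob@example.com", "Bob@Example.com", fake_resolver)
spoofed = case1_allow("198.51.100.99", "mail-a.outbound.protection.outlook.com",
                      "bob@example.com", "Bob@Example.com", fake_resolver)
```

The second call fails factor (d): the claimed host name does not resolve back to the connecting address, which is exactly the fraudulent Reverse DNS case.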

Case #2

Mass mailers (mostly but not exclusively advertising messages) will generally use their own domain for the Envelope-From, so the message passes SPF, but use the client domain in the Message-From address. Some of them have pretty good client control, but some do not. In my most loathed example, one mass mailer sends essential messages from a few legitimate companies, a lot of nuisance advertising, and some criminally fraudulent messages. We whitelist the messages from their known-good clients, block messages from their known-bad clients, and place everything else into a system-level quarantine, where it will be discarded after a time delay. The delay allows us to check for any desired senders that need to be added to the known-good list.
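
The three-way decision for a mass mailer can be sketched like this. The domain lists and the disposition strings are hypothetical placeholders, not settings from any real product.

```python
# Hypothetical lists; a real deployment would load these from its own config.
MASS_MAILER_DOMAINS = {"bulkmailer.example"}
KNOWN_GOOD_CLIENTS = {"bank.example", "airline.example"}
KNOWN_BAD_CLIENTS = {"miracle-pills.example"}


def domain_of(address: str) -> str:
    return address.rsplit("@", 1)[-1].strip().lower()


def in_domain(domain: str, parent: str) -> bool:
    """True for the parent domain itself or any of its subdomains."""
    return domain == parent or domain.endswith("." + parent)


def mass_mailer_disposition(envelope_from: str, message_from: str) -> str:
    """Decide by client (Message-From) once the Envelope-From identifies the mailer."""
    envelope_domain = domain_of(envelope_from)
    if not any(in_domain(envelope_domain, m) for m in MASS_MAILER_DOMAINS):
        return "not-applicable"   # rule does not apply; other filters decide
    client = domain_of(message_from)
    if client in KNOWN_BAD_CLIENTS:
        return "block"
    if client in KNOWN_GOOD_CLIENTS:
        return "allow"
    return "quarantine"           # held for review, discarded after a delay


decision = mass_mailer_disposition("bounce@mail.bulkmailer.example",
                                   "news@unknown-shop.example")
```

Checking the known-bad list first is a design choice in this sketch: if a client ever ends up on both lists by mistake, the message is blocked rather than allowed.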

Case #3

One of my spam filters collects one row of data about every message. The row includes a lot of very useful fields, more than most other products provide, but some messages are still ambiguous. To decide whether such a message is wanted or unwanted, I need to look at the message headers and the message body. A message log that has no visibility into the entire message is woefully inadequate for allow/block decisions.

Case #4

I am hammered by unwanted advertising for some innocuous product, such as toothpicks, which is getting through the spam filters (all three of them!). On review, I see that the messages have 20 different subject variants, over 100 different domain names, and over 100 different source IPs. When I start drilling back, I see that this same group of senders was responsible for two other similar campaigns. I use my message log to build a list of all of the Source IPs, all of the domain names, and all of the host names. I add a subject content filter for some of the terms, but first I need to configure exclude rules for the senders that should not be swept up by this rule. Then I want to put all of the unwanted IP addresses, host domain names, and sender domains into my block lists. Since I have stumbled on a botnet, maybe I should move all of the involved message files into a special folder so that I can give them to law enforcement.
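
Building those lists from a message log can be sketched with an in-memory SQLite table. The table name, column names, and sample rows are assumptions made for this illustration; a real product's log schema will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message_log ("
    "source_ip TEXT, helo_name TEXT, sender_domain TEXT, subject TEXT)"
)
conn.executemany(
    "INSERT INTO message_log VALUES (?, ?, ?, ?)",
    [
        ("198.51.100.7", "mx1.picks.example", "picks.example", "Best TOOTHPICKS ever"),
        ("198.51.100.8", "mx2.woodsticks.example", "woodsticks.example", "Toothpick sale today"),
        ("192.0.2.44", "mail.dentist.example", "dentist.example", "Your toothpick order receipt"),
        ("203.0.113.5", "mail.partner.example", "partner.example", "Quarterly report"),
    ],
)


def campaign_blocklists(conn, subject_terms, excluded_domains=()):
    """Collect distinct IPs, HELO names, and sender domains for a campaign.

    subject_terms are matched as case-insensitive substrings (LIKE wildcards
    in the terms are not escaped). excluded_domains are wanted senders that
    must not be swept up by the rule.
    """
    result = {"source_ips": set(), "helo_names": set(), "sender_domains": set()}
    if not subject_terms:
        return result
    term_clause = " OR ".join("lower(subject) LIKE ?" for _ in subject_terms)
    params = [f"%{term.lower()}%" for term in subject_terms]
    sql = ("SELECT source_ip, helo_name, sender_domain FROM message_log "
           f"WHERE ({term_clause})")
    if excluded_domains:
        placeholders = ", ".join("?" for _ in excluded_domains)
        sql += f" AND lower(sender_domain) NOT IN ({placeholders})"
        params += [d.lower() for d in excluded_domains]
    for ip, helo, domain in conn.execute(sql, params):
        result["source_ips"].add(ip)
        result["helo_names"].add(helo)
        result["sender_domains"].add(domain)
    return result


lists = campaign_blocklists(conn, ["toothpick"], excluded_domains=["dentist.example"])
```

The exclusion is applied in the same query that finds the campaign, so the wanted sender never reaches the block lists in the first place.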

Partial list of feature tests for a good spam filter:

Captures all of the key message identity information (including Message-From and HELO) into a log, so that message history can be queried using SQL or SQL-equivalent capabilities. Also allows filter rules based on all of the identity attributes.

Allows entire messages to be logged somewhere for drill-down during message log review.

Allows multi-factor filter rules, especially for any rule that relaxes filtering.

Granular actions - I should be able to bypass the "toothpick" rule without bypassing any other test.

SPF / DKIM / DMARC should primarily be used to allow trusted senders, based on PASS, rather than assuming that FAIL indicates an untrusted sender. A lot of SPF / DKIM / DMARC "failures" are false positives, and for most sites the false positives will be overwhelming. Too many products do not offer multi-factor exceptions and do not expose all of the SPF status codes, so you cannot construct a rule based on sender domain plus SPF PASS plus source host.

I spent six months looking at many commercial products for features like these, and did not find them. The ones that might have been close were outrageously expensive, so they did not get a complete review. Eventually, I stumbled on an old and not widely used product which is currently available for free. That product is now my first-line spam filter, another commercial product is second, and UTM is third in my inbound filtering chain. The first product is best at source filtering because it has a great rules engine. The other two products are better at finding unwanted content based on proprietary filter rules. The combination is better than any of them individually.



This thread was automatically locked due to age.
  • I have no knowledge of that product, so I may benefit from studying its documentation.

    If you use it, I would be pleased to have a description of your implementation strategy. How do you use the tools that it provides to allow the good stuff, block the bad stuff, and determine the disposition of uncertain stuff?

    To the extent that it does not meet all of your needs, what else would you want in an ideal product?

    Documenting what users need will help Sophos build better products and help community members shop for products.
