A Tour through the SPAM Fighting State-of-the-Art

I’ve been a postmaster of a self-hosted email service for more than twenty years now. During that time, I’ve had three priorities towards my users:

  1. Making sure that the emails my users send out get delivered.
  2. Employing a minimalist, yet effective strategy for fighting incoming SPAM.
  3. Protecting my users’ privacy.

Delivering on these priorities has become exponentially harder over the years. In this post, I document the toolchain I ended up using to meet them as a service provider.

This write-up is intentionally opinionated and mildly rebellious, given my aversion to the futile exercise of patching "security" onto irreparably broken protocols like SMTP. If you’re prone to getting your feathers ruffled, please, by all means, set this write-up aside and engage in something more soothing, like knitting or polishing your collection of porcelain mice.

Good Old PTR Record

If we think about incoming email messages from peer MTAs (Mail Transfer Agents), one effective measure in cutting off SPAM has been the requirement for the sending server to have a valid reverse domain name, i.e., a DNS PTR record.

Rejecting email from servers which do not have a PTR record or whose PTR record clearly has nothing to do with the email service they purport to represent has actually been an effective frontline defense in fighting SPAM.

To this end OpenBSD’s smtpd offers two nifty utilities for rejecting spammers outright, which are the following:

  1. filter-rdns, which performs a reverse lookup on the sending MTA’s IP and if the IP does not have a PTR record you can make a policy decision on whether to accept such SMTP connections.
  2. filter-fcrdns, which also uses a PTR lookup on the peer MTA’s IP address, whereafter it verifies that the reverse DNS name resolves back to the original IP address.

I use both methods to reject connections and chain them together, which cuts off a lot of the incoming SPAM outright, especially from compromised home user DSL lines.
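The logic behind the two checks can be sketched in a few lines of Python. This is only an illustration of the idea, not how filter-rdns or filter-fcrdns are implemented: the function names are my own, the resolvers are passed in as parameters so the logic can be tried without network access, and the default resolvers go through the system resolver (which may consult /etc/hosts as well as DNS).

```python
import ipaddress
import socket
from typing import Callable, Iterable, Optional


def system_reverse_lookup(ip: str) -> Optional[str]:
    """Return the reverse name for ip, or None when nothing is found."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None


def system_forward_lookup(name: str) -> Iterable[str]:
    """Return every address (IPv4 and IPv6) that name resolves to."""
    try:
        return {info[4][0] for info in socket.getaddrinfo(name, None)}
    except OSError:
        return set()


def check_peer(
    ip: str,
    reverse_lookup: Callable[[str], Optional[str]] = system_reverse_lookup,
    forward_lookup: Callable[[str], Iterable[str]] = system_forward_lookup,
) -> str:
    """Classify a peer as 'no-ptr', 'fcrdns-mismatch' or 'ok'."""
    name = reverse_lookup(ip)
    if not name:
        return "no-ptr"  # the rDNS check fails: no PTR record at all
    peer = ipaddress.ip_address(ip)
    for candidate in forward_lookup(name):
        try:
            if ipaddress.ip_address(candidate) == peer:
                return "ok"  # the PTR name resolves back to the peer
        except ValueError:
            continue  # skip anything that is not a plain IP address
    return "fcrdns-mismatch"  # PTR exists but does not confirm the IP
```

A peer classified as no-ptr or fcrdns-mismatch is a candidate for rejection; what to do with each outcome remains a policy decision.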

Senderscore for the Naughty or Nice

My third line of defense against incoming SPAM is a filter utility that builds upon IP reputation lists and is called senderscore. It is a reputation service, which assigns a score from 0 to 100 for any given sender IP. Given the score, you can then make policy decisions to reject emails from hosts with a low score such as 10 or below, or flag them as SPAM through the X-Spam header if their score is less than 70 for example.
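As a rough sketch, the policy decision described above boils down to two thresholds. The function name and return values are made up for illustration, and the thresholds are simply the example figures from the paragraph above, not recommended settings.

```python
def classify_by_score(score: int, reject_at: int = 10, flag_below: int = 70) -> str:
    """Map a 0-100 reputation score to 'reject', 'flag' or 'accept'."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= reject_at:
        return "reject"  # very poor reputation: refuse the message
    if score < flag_below:
        return "flag"    # mediocre reputation: accept, but add an X-Spam header
    return "accept"
```

For example, classify_by_score(5) returns "reject", classify_by_score(42) returns "flag" and classify_by_score(95) returns "accept".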

Greylisting to Waste Spammers’ Time

My fourth line of defense against incoming SPAM is a method called greylisting, which in practice means that you maintain an allowlist of IPs which are allowed to communicate with you. If an SMTP connection is originating from an unknown IP address, the source will get a temporary 4xx error code on their SMTP connection.

Legitimate email servers will accept this failure and wait for a number of minutes before retrying the connection. In practice, this cuts off a lot of SPAM, since especially drive-by spam bots do not bother retrying the connection. I’ve been using this method as a primary anti-spam solution for more than 15 years now.
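A minimal in-memory sketch of the mechanism might look like the following. The class name, the five-minute delay and the reply strings are assumptions made for illustration; a real deployment would persist its state and expire old entries.

```python
import time
from typing import Callable, Dict, Set


class Greylist:
    """Temporarily refuse unknown IPs; allow them once they retry after a delay."""

    def __init__(self, min_delay: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.min_delay = min_delay               # seconds a new sender has to wait
        self.clock = clock                       # injectable, so it can be tested
        self.first_seen: Dict[str, float] = {}   # IP -> time of first attempt
        self.allowed: Set[str] = set()           # IPs that have proven they retry

    def check(self, ip: str) -> str:
        if ip in self.allowed:
            return "250 OK"
        now = self.clock()
        first = self.first_seen.setdefault(ip, now)
        if now - first >= self.min_delay:
            # The sender came back after the delay: treat it as a real MTA.
            self.allowed.add(ip)
            del self.first_seen[ip]
            return "250 OK"
        return "451 Greylisted, please try again later"
```

A drive-by spam bot that never retries only ever sees the 451 reply, while a well-behaved MTA gets through on a later attempt and is allowed immediately from then on.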

The only change I’ve had to implement over the years is moving greylisting further down the processing pipeline. This adjustment was necessary because, nowadays, many legitimate email services do not recognize the concept of a 4xx error code.

I am looking at you Sendgrid and your spammy brethren.

“Email Security”

Above, I’ve detailed some of the techniques you must employ if you offer an email service and want to prevent your users from becoming inundated with SPAM. Now we turn our eyes towards "email security" solutions, whose aim is mostly to prevent email spoofing.

Almost all of them are based on using funky DNS TXT records sprinkled with PKI fairy dust.

For years, I refused to use all but one of the techniques (SPF), since in practice they do not solve the email spoofing problem and often make email much harder to deliver from A to B. In practice spammers use the same techniques to make sure that their SPAM is delivered to your users, so the only actual reason for using them is to make sure that the legitimate email from your users gets delivered to wherever it is bound.

I am looking at you Microsoft.

An SPF Crash Course

SPF stands for Sender Policy Framework and is defined in RFC 7208. What the standard means in practice is that upon receiving an SMTP connection from a given domain the receiving MTA checks for a specific DNS TXT record for that domain.

For example the SPF record for inform.social is as follows:

$ dig inform.social TXT | fgrep spf
inform.social.          280     IN      TXT     "v=spf1 mx -all"

The resource record tells us which MTAs can actually send out email for this domain. In this case, v=spf1 tells us that the record adheres to SPF version 1. The mx mechanism informs us that the hosts listed in the domain’s MX records can send out email for this domain. Finally, -all tells us that no other host is allowed to send out email pretending to be a mail exchanger for inform.social.

This should mean that an MTA receiving a connection from an unauthorized sender should reject the email outright. In essence, the idea is to prevent email spoofing, but in practice it contributes quite little to that end. Very few email service providers actually reject email based on whether the SPF record checks out or not. Instead, it is used as one element in the chain to deduce the reputability of the sender.

The various mechanisms to denote permitted senders are the following: ip4, ip6, a, mx, ptr, exists and include. Most of these are probably self-explanatory, but, for example, the a mechanism surprisingly also covers AAAA records. The exists mechanism allows for complex expressions, where the existence of a given record is checked. Finally, the include mechanism evaluates another domain’s SPF record as part of validating the given domain’s SPF.
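To make the structure of such a record concrete, here is a small Python sketch that splits an SPF record into its qualifier, mechanism and argument. It only parses the text. It does not perform any DNS lookups or evaluate the policy, it skips modifiers such as redirect=, and the function name is my own.

```python
import re
from typing import List, Tuple

# The four qualifiers of RFC 7208; a mechanism without one defaults to "+".
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

MECHANISM = re.compile(r"^([+\-~?]?)([a-z0-9]+)([:/].*)?$", re.IGNORECASE)
MODIFIER = re.compile(r"^[a-z][a-z0-9_.-]*=", re.IGNORECASE)


def parse_spf(record: str) -> List[Tuple[str, str, str]]:
    """Return (result, mechanism, argument) tuples in record order."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    parsed = []
    for term in terms[1:]:
        if MODIFIER.match(term):
            continue  # modifiers (redirect=, exp=) are ignored in this sketch
        match = MECHANISM.match(term)
        if match is None:
            raise ValueError(f"cannot parse term: {term!r}")
        qualifier, name, argument = match.groups()
        argument = argument or ""
        if argument.startswith(":"):
            argument = argument[1:]  # drop the separator, keep a CIDR "/len"
        parsed.append((QUALIFIERS[qualifier or "+"], name.lower(), argument))
    return parsed


print(parse_spf("v=spf1 mx -all"))
# [('pass', 'mx', ''), ('fail', 'all', '')]
```

A receiver walks the mechanisms from left to right and the first one that matches the connecting IP decides the result, which is why the catch-all -all comes last.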

DKIM for the Win, right?

DomainKeys Identified Mail (DKIM) Signatures is another standardization attempt to prevent email spoofing, and it is defined in RFC 6376. How it works is that the sending MTA adds a cryptographic signature in an email header called DKIM-Signature. The idea is that the receiving MTA verifies this signature against a public key published in the signing domain’s DNS and can reject the email if the check fails.

  • What does a domainkey TXT record look like in practice?
$ dig default._domainkey.inform.social TXT +short
"v=DKIM1; k=rsa; "
"p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAxmv9pBrtzGjtMBG5LTfeHwfcZOn/56QIe/WW0Yp0kgkdjqHQu5QSJUANEly9wNs2YeckSPxeFRfPl7Pb6JlC+t88dCV/RDiqg9kSmv6YSOtVcD9BZlYYUwbMqlKT4M2pkhack4E8oZ89jP01WnYE0nT8S6oFpPpzWoGYnUaAj2ekeLp2X2s2H43G/W3+kiONtvRhz/xcuICgr9vBU"
"PPBHnPkq0Lly8HN5fg+Egkr34YDw93gr9sgSTEZ4jCgvCrH39BPfyy8PWMLAF/aNwx/BTL/ew2NRVQVu9glJYaQ2U2wirSW82Rly/WQhML/uXr0CXhZgexaJT/e90rkT/cBlQIDAQAB"

Here we have some simple tags that help us interpret the result. v=DKIM1; means that we’re dealing with DKIM version 1 and k=rsa means that the key type is RSA. The p tag carries the Base64-encoded public key itself. The tricky bit in the whole setup is knowing the selector to query for, which is detailed only in the email header. In this case the selector is called default, hence to validate a signature we must query the funky looking domain name default._domainkey.inform.social.
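Note that dig returns the record as several quoted strings, because a single TXT string is limited to 255 bytes. The strings have to be joined back together before the tags can be read. A small Python sketch of that step follows; the function names are hypothetical and the key material is a short placeholder, not a real key.

```python
from typing import Dict, Iterable


def dkim_query_name(selector: str, domain: str) -> str:
    """Build the DNS name under which a DKIM public key is published."""
    return f"{selector}._domainkey.{domain}"


def parse_key_record(chunks: Iterable[str]) -> Dict[str, str]:
    """Join the TXT strings and split the result into tag=value pairs."""
    record = "".join(chunks)  # the strings are concatenated without separators
    tags = {}
    for part in record.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            # Whitespace inside a value is dropped, which is safe for v, k and p.
            tags[name.strip()] = "".join(value.split())
    return tags


# Placeholder chunks shaped like the dig output above (not a real key).
chunks = ["v=DKIM1; k=rsa; ", "p=AAAA", "BBBB"]
print(dkim_query_name("default", "inform.social"))
print(parse_key_record(chunks))
```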

  • How does the validation then work in practice?

Below is a screen capture of an email that I sent to myself to record the DKIM-Signature header.

An example DKIM signature.

A DKIM interpretation cheatsheet:

v: Version of DKIM.
a: Algorithm used for the digital signature (typically RSA-SHA256).
d: Domain that signed the email.
s: Selector specifying which public key to retrieve from DNS.
h: List of headers included in the hash.
bh: Base64 encoded hash of the body.
b: Actual signature data.

The v tag stands for version and the a tag references the signature algorithm. The d tag will inform you of the domain to query for and the s tag names the selector to prepend before the ._domainkey part. The h tag then informs you which header fields were covered by the signature, and bh is the hash of the message body, computed before the email was signed. The actual signature is then denoted via the b tag.
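To illustrate two of these tags, the Python sketch below parses a DKIM-Signature header value into its tags and recomputes a bh value for a message body. It assumes the "simple" body canonicalization and SHA-256 from RFC 6376, and it does not verify the RSA signature in b, which would need a cryptography library. The function names and the sample header are made up for illustration.

```python
import base64
import hashlib
from typing import Dict


def parse_dkim_signature(value: str) -> Dict[str, str]:
    """Split a DKIM-Signature header value into its tag=value pairs."""
    tags = {}
    for part in value.split(";"):
        name, sep, val = part.partition("=")
        if sep:
            # Folding whitespace inside values (e.g. in b and bh) is removed.
            tags[name.strip()] = "".join(val.split())
    return tags


def body_hash_simple(body: bytes) -> str:
    """Compute bh for 'simple' body canonicalization with SHA-256."""
    # Trailing empty lines are ignored and the body ends with exactly one CRLF;
    # an empty body is therefore hashed as a single CRLF.
    while body.endswith(b"\r\n"):
        body = body[:-2]
    canonical = body + b"\r\n"
    return base64.b64encode(hashlib.sha256(canonical).digest()).decode("ascii")


# Hypothetical header value, shaped like a real one but with dummy data.
header = ("v=1; a=rsa-sha256; c=simple/simple; d=example.org; s=default; "
          "h=from:to:subject:date; bh=PLACEHOLDER; b=PLACEHOLDER")
tags = parse_dkim_signature(header)
print(f"{tags['s']}._domainkey.{tags['d']}")  # the DNS name to query for the key
print(body_hash_simple(b"Hello world\r\n"))   # compare this against the bh tag
```

If the recomputed body hash differs from the bh tag, the body was changed after signing and there is no point in checking the signature in b at all.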

  • Simple, right?

As with all the PKI fairy dust that has been sprinkled on top of outdated protocols such as SMTP, this fix does not really solve email spoofing either. The best DKIM can do for you as a postmaster is to make the deliverability of your emails towards third parties somewhat better.

  • It doesn’t really give you much in terms of preventing third parties spoofing email in your name, as is demonstrated by DKIM replay attacks for example.

DMARC, the Ultimate Anti-Spoofing Solution?

Domain-based Message Authentication, Reporting, and Conformance, DMARC for short, is defined in RFC 7489. It picks up where SPF and DKIM left off and lets you put a stop to email spoofing once and for all.

  • Unpopular opinion: no it does not.

This anti-spoofing initiative also relies on oddly formatted TXT records, such as this one:

$ dig _dmarc.inform.social TXT +short
"v=DMARC1; p=none; rua=mailto:[email protected] adkim=r; aspf=r"

Yet again, the v tag denotes the version, which for this protocol is also version 1. The p tag, on the other hand, informs you how far along the domain owner is in the DMARC journey: none means at the beginning, quarantine means that messages failing the DMARC check should be treated as suspicious, and reject means that a failed check results in a rejected email. The additional adkim and aspf tags set how strictly the domains authenticated by DKIM and SPF, respectively, must align with the domain in the From header: r means relaxed (sharing the same organizational domain is enough) and s means strict (the domains must match exactly).

  • The rua tag, on the other hand, is somewhat useful. It allows you to start receiving XML-formatted reports from other email service providers. These reports show which servers have sent email on your behalf.
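How a receiver could turn such a record into a decision can be sketched in Python as follows. I read adkim and aspf here as the alignment modes that RFC 7489 defines. The sketch carries two loud assumptions: the organizational domain is approximated by the last two labels of a name (a real implementation consults the Public Suffix List), and a missing p tag is treated as none. The function names and the example record are hypothetical.

```python
from typing import Dict, Optional


def parse_tags(record: str) -> Dict[str, str]:
    """Split a DMARC record into tag=value pairs."""
    tags = {}
    for part in record.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip().lower()] = value.strip()
    return tags


def org_domain(domain: str) -> str:
    # ASSUMPTION: last two labels only; wrong for suffixes such as co.uk.
    return ".".join(domain.lower().rstrip(".").split(".")[-2:])


def aligned(from_domain: str, auth_domain: str, mode: str) -> bool:
    """Strict ('s') needs an exact match, relaxed ('r') the same org domain."""
    a = from_domain.lower().rstrip(".")
    b = auth_domain.lower().rstrip(".")
    if mode == "s":
        return a == b
    return org_domain(a) == org_domain(b)


def dmarc_disposition(record: str, from_domain: str,
                      spf_domain: Optional[str] = None,
                      dkim_domain: Optional[str] = None) -> str:
    """Return 'pass', or the policy ('none', 'quarantine', 'reject') to apply.

    spf_domain and dkim_domain are the domains that passed SPF and DKIM,
    or None when the respective check failed.
    """
    tags = parse_tags(record)
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC1 record")
    spf_ok = spf_domain is not None and aligned(
        from_domain, spf_domain, tags.get("aspf", "r"))
    dkim_ok = dkim_domain is not None and aligned(
        from_domain, dkim_domain, tags.get("adkim", "r"))
    if spf_ok or dkim_ok:
        return "pass"  # one aligned, passing mechanism is enough
    return tags.get("p", "none")


record = "v=DMARC1; p=quarantine; adkim=r; aspf=s"  # hypothetical record
print(dmarc_disposition(record, "example.org", dkim_domain="mail.example.org"))
print(dmarc_disposition(record, "example.org", spf_domain="mail.example.org"))
```

The first call passes because relaxed DKIM alignment accepts a subdomain, while the second falls through to the quarantine policy because strict SPF alignment demands an exact match.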

The idea is that once you are quite certain that email delivery for your domain is DMARC-enabled, you should toggle the p switch first to quarantine and then to reject. This approach might work in an ideal world, where everybody’s eating the same mushrooms, but in the real world the best a postmaster can hope for is to get their users’ email delivered. Adding another layer of rejection will just make your users miserable.

I am still looking at you Microsoft.

One of the main difficulties with DMARC is that it is very hard to get right, and adoption still isn’t high enough for it to be effective. If you don’t believe me, read this write-up on the Bisight blog for example.

I quote:

A key issue with DMARC concerns the underlying assumption that a legitimate email will not change during transit.

And before you wonder, no, implementing ARC is not the ultimate solution either.

Rspamd, a Nice Solution to Deal with All this Madness

Recently, I ended up moving my self-hosted email service from one provider to another. The main reason was that the IP reputation of my former cloud provider caused a lot of delivery problems for my users, because Microsoft has become very aggressive in blocking connections from whole swaths of the Internet, whenever a single IP on a netblock is used for spamming.

With the move, I deployed all the DNS/PKI fairy dust and now my users can happily get their email delivered to Microsoft users and I can concentrate on more meaningful things with my limited time budget.

Upon researching the state-of-the-art for solutions that work with opensmtpd, I stumbled upon this great write-up from Joel Carnat. It introduced me to a powerful anti-spam solution called rspamd, which integrates nicely with opensmtpd and dovecot.

I implemented most of the stuff Joel detailed in his post in a similar fashion. One of the great things about his solution is that I was able to get better control over which connections get greylisted, which used to cause me and my users a lot of headaches when signing up for new online services, for example.

Rspamd filtering statistics.

Nowadays, roughly 18% of all the incoming connections that pass the DNS checks get greylisted, which is a great improvement compared to the situation when greylisting was the first line of defense. In addition, a lot of the malware and phishing SPAM gets flagged as spam or rejected outright, which makes life easier for everybody. 76% of ham seems to be quite a stable figure, and thanks to the spiffy rspamd web UI, I have easy access to situational awareness and problem resolution for my email service.

