The war on spam

I spent a chunk of the weekend working on improving caliban.org’s anti-spam measures. I’d been running with the same configuration for quite a while: a basic Postfix configuration on the front-end and SpamAssassin on the back-end.

This worked well, but I decided it was time to consider using some real-time blackhole lists (or RBLs, as they’re called). Since MAPS turned into a subscription-only service, I haven’t been using blackhole lists for the outright rejection of mail, as I didn’t think any of them were as trustworthy as MAPS.

I haven’t had to do much professional SMTP server administration since 2000, so it had been quite a while since my last assessment of other RBLs. In light of this, I decided to have a look at how professionally the currently active RBLs are administered.

I was pleased by what I found. A few of the lists appear to be very professionally administered, so I decided to use them for front-end mail rejection as opposed to just back-end spam tagging. Previously, I was using several blacklists in combination with SpamAssassin to tag messages as possible spam, but the messages were then subject to many other tests to determine whether they exhibited other spam-like characteristics.

Using up-front rejection, however, the mere presence of the sending host on one of these lists results in an immediate rejection of the message being offered. There’s no second chance to inspect the actual header and body of the e-mail, so one needs a very high degree of confidence in the quality of the blackhole lists being used as the basis for this decision.

SpamAssassin’s use of blacklists is a little different, anyway, as it will check the host in the chronologically first Received line to see whether the originating host is on a blacklist. Envelope-time rejection, however, as performed by an MTA, checks only the IP address of the connecting host and possibly the domains that it claims in the HELO and MAIL FROM lines.

I upgraded to Postfix 2.1 a few weeks ago. After this weekend’s fine-tuning, the relevant part of main.cf now looks like this:

strict_rfc821_envelopes = yes
strict_mime_encoding_domain = yes

# Use this when reject_unknown_hostname is true
unknown_hostname_reject_code = 550
# Use this when reject_unknown_sender_domain is true
unknown_address_reject_code = 550
# Use this when reject_unverified_sender is true
unverified_sender_reject_code = 550

smtpd_client_restrictions = permit_mynetworks
# Lots of ‘good’ sites have broken reverse DNS
#    reject_unknown_client

smtpd_helo_required = yes
smtpd_helo_restrictions = permit_mynetworks
    reject_unauth_pipelining
    reject_invalid_hostname
    reject_non_fqdn_hostname
# This traps too many poorly configured good guys
#    reject_unknown_hostname

smtpd_sender_restrictions = permit_mynetworks

smtpd_recipient_restrictions = permit_mynetworks
    reject_unauth_destination
    check_recipient_maps
    reject_multi_recipient_bounce
    reject_non_fqdn_sender
    reject_non_fqdn_recipient
    reject_unknown_sender_domain
    check_client_access hash:/etc/postfix/client_access
    reject_rbl_client bl.spamcop.net
    reject_rbl_client dnsbl.sorbs.net
    reject_rbl_client rhsbl.sorbs.net
    reject_rbl_client sbl-xbl.spamhaus.org
    reject_unverified_sender

address_verify_map = btree:/etc/postfix/verify

In the above, I have optimised the order of the controls for network traffic and, by extension, response time. For example, there’s no point running most of the tests if we already know that the intended recipient does not even exist. For this reason, most tests are deferred until the RCPT TO comes in.

First of all, we require an SMTP HELO and demand that the other side strictly comply with RFC 821 in the addresses it gives in the envelope. For good measure, we also disallow unauthorised pipelining of SMTP commands. This, alone, will catch some very poorly written spamware in the act.

Next, when the HELO arrives, we check for a validly formed, fully-qualified hostname. Again, a few clueless spammers may be caught out here, but there are no significant gains. Ideally, we’d also reject anyone who passes us a hostname with no matching A or MX record in DNS, but plenty of good sites don’t have their act together here and it quickly became apparent that I was going to reject a lot of good mail with this in place.
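To make the two HELO checks a little more concrete, here is a minimal Python sketch of what "validly formed" and "fully qualified" might mean. It is only an approximation written for illustration, not Postfix's actual logic: the function names are hypothetical, and address literals such as [192.0.2.1] are deliberately not handled.

```python
import re

# One DNS label: letters, digits and hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def helo_looks_valid(helo: str) -> bool:
    """Rough stand-in for reject_invalid_hostname: every label is well formed."""
    name = helo.rstrip(".")
    if not name or len(name) > 253:
        return False
    return all(_LABEL.match(label) for label in name.split("."))


def helo_is_fqdn(helo: str) -> bool:
    """Rough stand-in for reject_non_fqdn_hostname: at least two labels."""
    return helo_looks_valid(helo) and "." in helo.rstrip(".")
```

Neither function touches the network, which is why checks of this kind are cheap enough to run as soon as the HELO arrives.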

Next, we let the MAIL FROM stage pass without action, as we are waiting for the RCPT TO before performing our main suite of tests. Once we get the data from the RCPT TO, we’ll perform our MAIL FROM checks only if we decide we still need to.

At the RCPT TO stage, we order the tests for minimum network load and processing. We check that the remote side is not trying to relay and then that the intended recipient does, in fact, exist. Next, we check for a fully qualified sender domain in the MAIL FROM as well as a fully qualified recipient in the RCPT TO. We also refuse any e-mail that is a multi-recipient bounce, which is another technique used by spammers for squeezing their messages into your server on a technicality. None of these checks requires any additional network traffic.
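The local-only checks can be sketched as an ordered list in which the first failure wins. The Python below is a simplified illustration under stated assumptions: the Envelope class, the function names and the rejection strings are all hypothetical, and the relay and recipient checks are reduced to simple set lookups rather than anything Postfix really does.

```python
from dataclasses import dataclass, field


@dataclass
class Envelope:
    sender: str                                      # MAIL FROM address, "" for a bounce
    recipients: list = field(default_factory=list)   # RCPT TO addresses accepted so far


def is_fqdn_address(address: str) -> bool:
    """True if the address has a local part and a domain containing a dot."""
    local, sep, domain = address.rpartition("@")
    return bool(local and sep and "." in domain.strip("."))


def local_rcpt_checks(env, rcpt, local_domains, local_users):
    """Return a rejection reason, or None if every local check passes."""
    domain = rcpt.rpartition("@")[2].lower()
    if domain not in local_domains:
        return "relay access denied"
    if rcpt.lower() not in local_users:
        return "unknown recipient"
    # A bounce (null sender) should only ever be addressed to one recipient.
    if env.sender == "" and len(env.recipients) >= 1:
        return "multi-recipient bounce"
    if env.sender and not is_fqdn_address(env.sender):
        return "sender address is not fully qualified"
    if not is_fqdn_address(rcpt):
        return "recipient address is not fully qualified"
    return None
```

The order mirrors the restriction list: the checks most likely to end the conversation come first, and nothing here needs a DNS query or an outbound connection.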

Next, we check DNS for an A or MX record for the sender domain. If we find one, we then check a whitelist of systems that we always want to accept mail from. We can’t guarantee that they will pass the upcoming tests, so we spare them from those tests altogether.
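As a rough picture of how a whitelist lookup of this kind can work, the Python sketch below tries the client's hostname, then its parent domains, then its IP address and progressively shorter network prefixes. This is a simplified assumption about access-table lookups, not a reproduction of Postfix's behaviour: the function name is hypothetical, only IPv4 is handled, and parent domains are assumed to be written in the ".domain.tld" form.

```python
def client_access_lookup(table, hostname, ip):
    """Return the table's action for a client, or None if nothing matches."""
    keys = []
    if hostname:
        labels = hostname.lower().split(".")
        # Full hostname first, then parent domains written as ".domain.tld".
        keys.append(".".join(labels))
        keys.extend("." + ".".join(labels[i:]) for i in range(1, len(labels)))
    octets = ip.split(".")
    # Full IPv4 address, then progressively shorter network prefixes.
    keys.extend(".".join(octets[:n]) for n in range(4, 0, -1))
    for key in keys:
        if key in table:
            return table[key]
    return None
```

With a table such as {".example.com": "OK"}, a client named mail.example.com would be let through before any of the blacklist lookups are attempted.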

Now, the tests get a little heavier on the network. In turn, we check SpamCop, SORBS and Spamhaus for the IP of the remote host. If found, we reject the message on the spot. From my research, these lists appear to be fairly administered, with only a small chance of false positives. That wasn’t always the case with SpamCop, but they seem to have improved a lot in recent times.
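For readers who haven't seen how a DNS blacklist is queried, the usual convention is to reverse the octets of the client's IPv4 address, append the list's zone and look the result up as an ordinary hostname. The Python sketch below shows that convention; the function names are hypothetical, and treating every lookup failure as "not listed" is a simplifying assumption.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets plus the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """True if the list returns an A record for the client (needs network access)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        # No such name (or any other lookup failure) is treated as "not listed".
        return False
```

For example, checking 192.0.2.99 against bl.spamcop.net means looking up 99.2.0.192.bl.spamcop.net. Each list costs one DNS query per new client, which is why these tests come after the purely local ones.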

Any mail that has made it this far is doing pretty well, but there’s one final test we subject it to at this point. We take the address in the MAIL FROM and use it to make an SMTP connection back to the sender, going as far as issuing a RCPT TO with that address, but not following up with the usual DATA section.

The purpose of this exercise is to ascertain whether the ostensible sender has a real account that can be delivered back to. A sender who cannot receive replies is considered a spammer and we reject the original incoming message. A cache of legitimate sender addresses is built up in order to minimise the amount of work that is needed for the verification process.
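The call-back described above can be sketched with Python's smtplib. This is an illustration under several assumptions, not a description of Postfix's internals: the function name is hypothetical, the caller is expected to have already found the sender domain's mail server (mx_host), the probe uses the null sender, and only successful results are cached.

```python
import smtplib


def sender_is_deliverable(address, mx_host, verified, connect=smtplib.SMTP):
    """Probe the sender's mail server with RCPT TO, without ever sending DATA."""
    if address in verified:
        return True                     # already known to be good, skip the probe
    smtp = connect(mx_host, 25, timeout=30)
    try:
        smtp.helo()
        smtp.mail("")                   # null sender, as used for bounces (assumption)
        code, _reply = smtp.rcpt(address)
    finally:
        smtp.quit()                     # hang up before the DATA stage
    if 200 <= code < 300:
        verified.add(address)           # remember legitimate senders
        return True
    return False
```

The connect parameter exists only so that the function can be exercised without a live mail server. A real implementation would also want to distinguish temporary failures from permanent ones instead of treating every non-2xx reply as a rejection.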

The directives that set reply codes to 550 are there to ensure that all rejections are permanent. Postfix defaults to a 450 response on these types of rejections, which signifies a transient error. That means the remote host is basically told to correct the problem and try again. Of course, that will never happen, so we cold-heartedly reject the message as soon as we find a problem with it. Otherwise, the remote side will periodically try to resend it and we will keep on needing to check it and reject it.
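The difference between the two codes comes down to the first digit of the SMTP reply. As a small illustration (the function name and return strings are hypothetical), a sending server's decision could be sketched like this:

```python
def reply_class(code: int) -> str:
    """Classify an SMTP reply code by its first digit."""
    if 200 <= code < 400:
        return "positive"      # command accepted, or more input expected
    if 400 <= code < 500:
        return "transient"     # sender should keep the message queued and retry
    if 500 <= code < 600:
        return "permanent"     # sender should give up and bounce the message
    raise ValueError(f"not an SMTP reply code: {code}")
```

A 450 falls into the "transient" class and invites a retry, whereas a 550 is "permanent" and ends the matter.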

Since instituting these changes, I’ve been rejecting an awful lot of e-mail. Some of it is e-mail that SpamAssassin would previously have trapped anyway, but now I’m preventing it from ever entering the system. It’s better to reject spam at SMTP-time, rather than accepting it into the system, as the latter course of action is rightfully interpreted by spammers as successful delivery. They don’t care that the mail was later filtered and not read; they care only that it was accepted by the receiving server.

Postfix really is a superb piece of software and I highly recommend it. Few MTAs offer so much fine-tuning in the fight against spam, whilst still maintaining legible configuration files.


1 Response to The war on spam

  1. Geoff says:

    I got spam with this URL in it. Thanks to your advice, I hopefully won’t again!

    Cheers!
