Resposio eMail Server Upgrade Information

The Greylisting Method High Level Overview

Greylisting got its name because it is kind of a cross between black- and white-listing, with mostly automatic maintenance. A key element of the Greylisting method is this automatic maintenance.

The Greylisting method is very simple. It only looks at three pieces of information (which we will refer to as a "triplet" from now on) about any particular mail delivery attempt:

The IP address of the host attempting the delivery
The envelope sender address
The envelope recipient address

From this, we now have a unique triplet for identifying a mail "relationship". With this data, we simply follow a basic rule, which is:

If we have never seen this triplet before, then refuse this delivery and any others that may come within a certain period of time with a temporary failure.
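As a minimal sketch of the idea, the triplet can be modelled as a small immutable value that is usable as a lookup key. The names Triplet, seen and is_new are illustrative assumptions, not part of any particular implementation, and the in-memory set stands in for a real datastore.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    """Identifies one mail "relationship" (field names are assumptions)."""
    relay_ip: str    # IP address of the host attempting the delivery
    sender: str      # envelope sender address
    recipient: str   # envelope recipient address


# Hypothetical in-memory stand-in for the triplet datastore.
seen = set()


def is_new(triplet):
    """Return True the first time a triplet is seen, False afterwards."""
    if triplet in seen:
        return False
    seen.add(triplet)
    return True
```

A delivery attempt for which is_new returns True is the one that would be refused with a temporary failure.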

Since SMTP is considered an unreliable transport, the possibility of temporary failures is built into the core spec (see RFC 821).

As such, any well-behaved message transfer agent (MTA) should attempt retries if given an appropriate temporary failure code for a delivery attempt (see below for discussion of issues concerning non-conforming MTAs).

During the initial testing of Greylisting in mid-2003, it was observed that the vast majority of spam appears to be sent from applications designed specifically for spamming. These applications appear to adopt the "fire-and-forget" methodology. That is, they attempt to send the spam to one or several MX hosts for a domain, but then never attempt a true retry as a real MTA would. In our test environment, even under a fairly conservative interpretation of the testing data, this yielded an effectiveness of over 95%, with no legitimate mail ever being permanently blocked.

In addition, with the recent rampant proliferation of email-based viruses, Greylisting has been shown to be extremely effective in blocking these viruses, as they also do not tend to retry deliveries. And since these viruses are fairly large, bandwidth and processing savings are significant versus the standard method of accepting delivery and local virus scanning.

This blocking comes at a minimal price in terms of local resources. Assuming the use of a local datastore for the triplet and other metadata, there is no required network traffic caused by Greylisting other than that associated with the connection itself. Since we are not checking the contents of the message at all, there is very little processing overhead, unlike many other spam blocking methods.

There is one effect that could be seen as either a positive or negative. Since the Greylisting method delays acceptance of unknown mail, that will generate a little more work for the sending MTA of legitimate mail. The flip side is that it generates a lot more work and smarts for the spammer's systems, hopefully enough to make the costs of spamming higher, possibly even to the point of making spamming unprofitable for some of them.

The best part is that since we never permanently fail a message delivery, as long as the delivering MTAs are well-behaved, we should never cause a legitimate mail to bounce. There should never be a false positive!

Implementation Specification

In order to implement the Greylisting method, we will use some form of database to hold a few pieces of information about a specific mail relationship that is keyed off of the triplet described above:

The time that the triplet was first seen (record create time)
The time that the blocking of this triplet will expire
The time that the record itself will expire (for aging old records)
The number of delivery attempts that have been blocked
The number of emails we have successfully passed

(Note: There are some additional pieces of information that are stored and used in the example implementation, and they will be discussed later, but for now we will disregard them. Also, the number of email attempts blocked and passed is not strictly necessary, but will be shown to be useful in making the process work better.) With this data, we have everything necessary for a fully functional Greylisting implementation.
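The record described above could be sketched as follows. The class and field names are assumptions made for illustration, and the default durations are the suggested values given later in this document (1 hour of initial blocking, 4 hours of initial record lifetime).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class GreylistRecord:
    """Data kept per triplet (field names are illustrative assumptions)."""
    created: datetime          # time the triplet was first seen
    block_expires: datetime    # time the blocking of this triplet ends
    record_expires: datetime   # time the record itself is aged out
    blocked_count: int = 0     # delivery attempts that have been blocked
    passed_count: int = 0      # emails successfully passed


def new_record(now, delay=timedelta(hours=1), lifetime=timedelta(hours=4)):
    """Build the record for a triplet seen for the first time at `now`."""
    return GreylistRecord(
        created=now,
        block_expires=now + delay,
        record_expires=now + lifetime,
    )
```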

The proper place in the SMTP session to perform our checks is as early as possible in the mail session, once all of the needed information is available. To remind those who are not familiar with the low-level details of an SMTP session, a normal command sequence runs roughly as follows: HELO (or EHLO), MAIL FROM, RCPT TO, DATA, the message itself terminated by a line containing a single ".", and finally QUIT.

This means that, in order to minimize the network traffic required when a mail delivery may be rejected, we should perform our checks as soon as the sending MTA has given us all the required information, which is to say, immediately after the RCPT command is received. In the case where we would temporarily fail a particular delivery attempt, the mail transaction would proceed normally through the MAIL FROM command, but the RCPT command would be answered with a temporary failure (4xx) reply code, leaving the sending MTA to queue the mail and retry later.

One additional feature which has not yet been mentioned is the provision for some method to allow manual whitelisting of relays, recipients, and possibly even senders.

This manual whitelisting capability is not strictly necessary, but for several reasons, a minimum implementation effectively requires at least manual whitelisting based on IP address for things like localhost, or primary/backup MX hosts for the domains being handled. Since those relays are presumably smart enough to retry, and should never be blocked anyway, there is little point in delaying mail delivery attempts from them.

Likewise, whitelisting recipients (or recipient domains) may be useful in an ISP or similar setting, where particular customers wish to exempt their domains from the possible mail delivery delays that Greylisting may cause.

Whitelisting based on sender address (or sender domain), while easily implemented, is discouraged. In most cases, whitelisting the IP addresses of the mail hosts that send for a particular domain is a much better solution, because it is much more difficult to forge an IP address than a sending email address. Also, the domains or addresses most likely to be whitelisted would be very easily guessed or discovered, and spammers could take advantage of that to bypass the Greylisting blocks.

Whether these manual whitelisting entries are stored in the database, or are hardcoded into the application does not matter from the standpoint of Greylisting. But of course, an implementation that allows them to be easily updated is preferable.
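A minimal sketch of the two recommended whitelist checks, using Python's standard ipaddress module. The list contents are placeholder assumptions (loopback plus documentation-range addresses and domains), and the hardcoded tables stand in for whatever easily updated store an implementation prefers.

```python
import ipaddress

# Hypothetical whitelist entries; a real deployment would load its own.
WHITELISTED_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/8"),   # localhost
    ipaddress.ip_network("192.0.2.0/24"),  # e.g. our backup MX hosts
]
WHITELISTED_RECIPIENT_DOMAINS = {"example.org"}


def relay_is_whitelisted(relay_ip):
    """True if the sending relay falls inside any whitelisted network."""
    addr = ipaddress.ip_address(relay_ip)
    return any(addr in net for net in WHITELISTED_NETWORKS)


def recipient_is_whitelisted(recipient):
    """True if the recipient's domain has opted out of Greylisting."""
    domain = recipient.rpartition("@")[2].lower()
    return domain in WHITELISTED_RECIPIENT_DOMAINS
```

Sender-based whitelisting is deliberately left out of the sketch, in line with the recommendation above.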

The specific methodology for a fairly basic Greylisting implementation is as follows:

Check if the sending relay (or network) is whitelisted, and if so, pass the mail.
Check if the envelope recipient (or domain) is whitelisted, and if so, pass the mail.
Check if we have seen this email triplet before.

If we have not seen it, create a record describing it and return a tempfail to the sending MTA.
If we have seen it, and the block is not expired, return a tempfail to the sending MTA.
If we have seen it, and the block has expired, then pass the email.

If the delivery attempt should be passed and the delivery is successful:

Increment the passed count on the matching row.
Reset the expiration time of the record to be the standard lifetime past the current time.

If the delivery attempt has been temporarily failed:

Increment the failed count on the matching row.
If the sender is the special case of the null sender, do not return a failure after RCPT, instead wait until after the DATA phase.

(Note: For all checks, we ignore records whose lifetime has expired)
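The steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the example implementation: a plain dict stands in for the database, the whitelist decision is passed in as a flag, the function and field names are invented for the sketch, and a passed attempt is assumed to be delivered successfully. The durations are the default values suggested later in this document.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

DELAY = timedelta(hours=1)             # initial block of an unknown triplet
PENDING_LIFETIME = timedelta(hours=4)  # lifetime before any mail has passed
PASSED_LIFETIME = timedelta(days=36)   # lifetime once mail has passed


@dataclass
class Record:
    block_expires: datetime
    record_expires: datetime
    blocked: int = 0
    passed: int = 0


def check(db, triplet, now, whitelisted=False):
    """Return "pass" or "tempfail" for one delivery attempt."""
    if whitelisted:
        return "pass"
    rec = db.get(triplet)
    if rec is not None and rec.record_expires <= now:
        rec = None  # records whose lifetime has expired are ignored
    if rec is None:
        # Never seen before: create the record and refuse temporarily.
        db[triplet] = Record(now + DELAY, now + PENDING_LIFETIME, blocked=1)
        return "tempfail"
    if now < rec.block_expires:
        # Seen before, but the block has not yet expired.
        rec.blocked += 1
        return "tempfail"
    # Block has expired: pass the mail and extend the record's lifetime.
    rec.passed += 1
    rec.record_expires = now + PASSED_LIFETIME
    return "pass"
```

For example, a first attempt at 12:00 and a retry at 12:30 are both refused, while a retry at 13:30 is accepted and renews the record for another 36 days.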

Issues Affecting The Proposed Implementation

There are a few issues that were found to be prevalent enough "in the wild" to make it necessary to slightly modify methods in the basic approach.

One issue is that some MTA software (Exim, for example) tries to limit the problem of forged sender addresses by verifying, via an SMTP callback, that the claimed sender of an email is a valid address before accepting mail. Since it is desired to minimize the traffic when a mail may be rejected temporarily, the best course of action would be to issue a tempfail after the RCPT command. However, in the case of an SMTP callback, doing so at that point may cause our outgoing mail to be delayed unnecessarily.

Luckily, most mailers that do this use a sender address of the null sender "<>" to perform this check. This makes it fairly simple to work around, since we can modify the handling process so that in the special case of the null sender, we delay returning a temporary failure until after the DATA phase of a mail transaction. Since SMTP callbacks abort their test delivery attempt before getting to the DATA phase, the SMTP callback will succeed, and the outgoing mail should be accepted with no delay.
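The null sender special case amounts to choosing the point in the session at which a greylisted attempt is refused. A minimal sketch, with a hypothetical function name and the assumption that the null sender arrives either as "<>" or as an empty string once the angle brackets have been stripped:

```python
def tempfail_stage(envelope_sender):
    """Return the SMTP stage at which a greylisted attempt is refused.

    Normal senders are refused right after RCPT. The null sender is
    refused only after DATA, so that SMTP callbacks, which stop before
    DATA, still succeed.
    """
    sender = envelope_sender.strip()
    if sender in ("", "<>"):
        return "DATA"
    return "RCPT"
```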

One mailer that seems to have a related problem is Postfix. Postfix breaks from the normal procedure of using the null sender for its callbacks, and instead uses a configurable sender address in the callback. I tried to get an explanation as to why Postfix didn't use the null sender like other mailers, and was informed that it was because some broken mailers don't accept the null sender even though it is required in the SMTP RFCs.

Unfortunately, this causes a problem when trying to work around the weird behavior of Postfix. Luckily, the default setting for this address seems to be "postmaster", which leads to an acceptable workaround.

Another issue occurs when a large organization uses a pool of outbound mail servers for sending email to a system using Greylisting. If the pool is configured so that the same mailserver (with the same IP) will always retry deliveries for a particular mail, there is no issue.

But if that pool of mail servers happens to be configured in such a way that subsequent delivery attempts for a particular mail may be made from any one of several sending MTAs, then legitimate mail deliveries may take significantly longer than expected. The possible maximum delay is dependent on the number of MTAs in the sending pool, and on whether the distribution of the retry attempts is random or deterministic. In a worst-case scenario, it is even possible that mail may be delayed long enough to cause it to bounce.

Other than adding a manual entry for networks of this type, one proposed method of dealing with this issue is to perform the IP address checks of the sending relay based on the subnet they are at rather than the specific IP. Since most of the sites that do this have most or all of their email servers on the same /24 subnet, this method works well in avoiding this issue without requiring manual intervention, at the expense of making it a little easier for spammers to circumvent the system.
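The subnet-based check can be sketched by reducing the relay address to its network address before it is used in the triplet. The function name is hypothetical, and the sketch assumes IPv4 relays with the /24 prefix mentioned above.

```python
import ipaddress


def relay_key(relay_ip, prefix_len=24):
    """Map a relay IP to the network address of its enclosing subnet."""
    # strict=False lets us pass a host address rather than a network address.
    net = ipaddress.ip_network(f"{relay_ip}/{prefix_len}", strict=False)
    return str(net.network_address)
```

With this, retries from 192.0.2.17 and 192.0.2.200 both map to 192.0.2.0 and therefore match the same triplet, while a relay at 192.0.3.1 does not.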

One other potential issue is with mailing lists that use unique envelope sender addresses for mail sent to an end user, which is useful in order to better track bounces, since the formatting of bounces is not codified, and it is fairly common for mailers to return bounces that are formatted in such a way that it is very difficult, or even impossible, to programmatically discover which address caused the bounce.

This method of handling bounces is called VERP for Variable Envelope Return Paths, and one method of doing this is detailed here. Luckily, most mailing lists do this in a way similar to that described in that document, which is to use the same unique envelope sender for every mail sent to a particular recipient.

However, some mailing lists (such as Ezmlm) also try to track bounces to individual mails, rather than just individual recipients, which creates a variation on the VERP method where each email has its own unique envelope sender. Since the automatic whitelisting that is built into Greylisting depends on the envelope addresses for subsequent emails being the same, this will cause each email sent to be delayed, rather than just the first email.

While tracking individual bounces may sound like a good idea, in today's internet age when we are trying to authenticate the senders of email, it's probably a bad idea. Hopefully, the Ezmlm maintainers will correct this issue.

There is a simple workaround, which is to manually whitelist any hosts that deliver this sort of traffic. But luckily, even without manual whitelist entries, the impact is not that significant since mailing lists are usually not that timely in their delivery anyway, and the delay will generally not be very significant for most users.

Basic Configuration Parameters

In the spirit of giving the mail system administrators who choose to implement Greylisting as much choice as possible, there are several options which should be easily modified in order to tune the behavior of the Greylisting method on a per-case basis. Below, we detail these options, and some details to keep in mind if it is deemed necessary to change them from the default suggested values. As a matter of fact, it may be desirable to vary these settings from installation to installation, since it will help keep the spammers guessing.

Initial delay of a previously unknown triplet: 1 Hour
Lifetime of triplets that have not yet allowed a mail to pass: 4 Hours
Lifetime of auto-whitelisted triplets that have allowed mail to pass: 36 Days

The initial delay of 1 hour was picked for several reasons:

An hour is short enough that in most cases, users will not notice the delay.
It is long enough to give time for administrators on a possibly compromised or abused mail server to discover the problem and hopefully correct it, before any of the offending email is able to be delivered.
It is long enough to provide a good chance that if the sending host is in fact a spammer, they will be listed in other IP-based blacklists that may be used in conjunction with Greylisting, so that even if a spamming relay later attempts a redelivery that would no longer be delayed by Greylisting, it may still be blocked by other methods.
It is also long enough that other types of traffic analysis could be designed and implemented such that spamming IPs could be easily identified and blocked by other methods, in such a way that even the first recipients (before a spamming pattern starts to emerge) would still not be bothered by the spam email.

The data collected during testing showed that more than 99% of the mail that was blocked with the tested setting of 1 hour would still have been blocked with a delay setting of only 1 minute. However, it is expected that as spammers become aware of this blocking method, they will change their software to retry failed deliveries. At that point, having a larger initial delay will definitely help, as it gives time for other blocking methods to act. For this reason, it is suggested that at least a one hour delay value be kept as a default, since spammers will start adapting as soon as this method becomes known and starts being used.

It is important to keep this delay smaller than the point at which a significant number of MTAs will give up and bounce the message. Luckily, most MTAs have failure timeouts of several days. However, there are some special cases, such as certain financial institutions, that want to know within a fairly short period of time that a message was not delivered. Even in these special cases, the timeouts should be at least a few hours.

It is likely that some form of traffic analysis will be developed using the data from a Greylisting database in order to automatically identify the IP addresses of hosts that are attempting to deliver spam.

While this sort of functionality is not currently included in the example implementation, I would be very interested in seeing this come about, since spammer patterns were usually very identifiable after a few minutes, mainly due to many nearly simultaneous delivery attempts to a large number of different recipients from the same IP address or group of IP addresses, from which no (or very little) previous traffic had ever been observed. (If the organizers or maintainers of any of the DNS blacklists are interested in creating an automatic way of using this data to help update their lists, please contact Resposio.)

Unfortunately, pattern analysis requires a fairly high level of traffic to be useful and accurate, so smaller systems will probably not help much unless the pattern analysis is distributed, which is difficult when you can't necessarily trust other potential collaborators.
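Purely as an illustration of the kind of pattern described above, and not something taken from the example implementation, the following sketch flags relays that attempt delivery to many different recipients within a short window. The function name, the input format, and both thresholds are assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def suspicious_relays(attempts, window=timedelta(minutes=5), min_recipients=50):
    """Flag relay IPs hitting many distinct recipients in a short window.

    `attempts` is an iterable of (timestamp, relay_ip, recipient) tuples
    for delivery attempts from previously unknown triplets.
    """
    by_ip = defaultdict(list)
    for ts, ip, rcpt in attempts:
        by_ip[ip].append((ts, rcpt))

    flagged = set()
    for ip, events in by_ip.items():
        events.sort()
        start = 0
        for end in range(len(events)):
            # Shrink the window until it spans at most `window` of time.
            while events[end][0] - events[start][0] > window:
                start += 1
            recipients = {r for _, r in events[start:end + 1]}
            if len(recipients) >= min_recipients:
                flagged.add(ip)
                break
    return flagged
```

As noted above, thresholds like these are only meaningful on systems with enough traffic to make the pattern stand out.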

The 4 hour initial life of records was picked because:

Almost all legitimate mail servers have a retry time that is less than this.
Having a small lifetime helps limit the number of relevant records that may have to be considered and maintained for very busy sites that may have enormous amounts of mail traffic and hundreds or thousands of queries a second. Small values for this are increasingly important as the spam problem grows, since each unique spam triplet will generate a record.
It was desired to keep the time window fairly small, to limit the period during which spam might get through if a spammer resends the message to their entire delivery list.

(Note that in the example implementation, this 4 hour limit includes the initial 1 hour delay, which means the effective window when an email will be accepted is 3 hours.)
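The three settings and the resulting acceptance window can be written down as a small configuration object. The class and attribute names are assumptions for illustration. The defaults are the suggested values above, and the window is simply the record lifetime minus the initial delay.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class GreylistSettings:
    initial_delay: timedelta = timedelta(hours=1)
    pending_lifetime: timedelta = timedelta(hours=4)
    passed_lifetime: timedelta = timedelta(days=36)

    def __post_init__(self):
        # The lifetime includes the delay, so it must be the larger value.
        if self.initial_delay >= self.pending_lifetime:
            raise ValueError("initial delay must be shorter than the pending lifetime")

    @property
    def acceptance_window(self):
        """Time during which a retry of a new triplet will be accepted."""
        return self.pending_lifetime - self.initial_delay


settings = GreylistSettings()
print(settings.acceptance_window)  # 3:00:00
```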

There is another reason why this delay should be as small as possible. If a spammer discovers and uses a poorly maintained relay host, hopefully it will bog that relay down enough so that it gets very slow. That increases the possibility that the relay will be slowed down enough that it won't be able to process the queue fast enough for the spam to get through within this time window.

The lifetime limit of whitelisted records is updated every time an email is successfully passed, and was chosen to be 36 days to:

Help keep the database a manageable size by allowing entries for obsolete senders, recipients or relays to be aged off gracefully.
Make sure that records live long enough to avoid delaying subsequent mailings that may only come once a month (e.g. monthly mailing list notifications). Also, to live long enough for monthly mailings that may be sent only on a particular day of the week (for example, the first Mondays of June and July 2003 are 35 days apart).
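The 35-day figure in the last point can be checked with a few lines of date arithmetic. The helper name first_monday is just an illustrative choice.

```python
from datetime import date, timedelta


def first_monday(year, month):
    """Return the date of the first Monday of the given month."""
    first = date(year, month, 1)
    # date.weekday() numbers Monday as 0, so this is the offset to Monday.
    return first + timedelta(days=(7 - first.weekday()) % 7)


gap = first_monday(2003, 7) - first_monday(2003, 6)
print(first_monday(2003, 6), first_monday(2003, 7), gap.days)
# 2003-06-02 2003-07-07 35
```

A 36 day lifetime therefore leaves one day of margin over the longest gap between two such monthly mailings.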