How to use SPF to reject spam

Gentlemen,

I apologize in advance for the long post. I wanted to keep the
context in one place, as opposed to spreading my position over a
month-long thread like we did with the DNS loading thread.

It's been said before: SPF is not a spam-fighting mechanism. It is a
forgery-fighting mechanism.

But let's see how it should be used to aid the spam fighting
mechanisms.

1. Set up a statistics database recording how much spam with PASS you
get from each domain. A simple table that contains {domainname,
ham_count, spam_count} is probably enough, but you can get fancier if
you like.

  Based on the stats in the table, create an SA score function:

  if (P>N)
     Pass_SA_Score = ((P-N)/(100-N))*(+5)
  else
     Pass_SA_Score = ((N-P)/N)*(-5)

  Where P = statistical percentage of spam for domain X
          := 100*spam_count/(spam_count+ham_count)

        N = Arbitrary limit (in percent) of spam you'd allow from
            legit users of one domain. (eg. ebay sends a lot of good
            stuff and some bad stuff/promotional).
          := I would think 10% is ok. Set your own limit.

        X = domain name in question (MAIL-FROM)

        5 = max range of SpamAssassin score. Other software may use
            percentages, or a wider range. Use that maximum here.

            Eg. My_Spam_Filter_Score = ((P-N)/(100-N))*(+100)
                if it uses +100% to -100% range.
            If the positive and negative ranges are different,
            just replace the -5 and +5 with whatever is appropriate.

        E = margin of error. This should be 0.1%, or lower.
            It will be used to judge how close to the maximum
            positive score you need to get to be sure it's spam.

        SF = probability that a SOFTFAIL result means SPAM.
             eg. for hotmail, this might be 100%
                 for clueless-friend.com, it might be 10%.

  Note that the function above yields a positive score when domain X
  has a history of spam, and negative if they have a (relatively) clean
  history.
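
  For those who prefer code to formulas, here is a minimal Python
  sketch of the function above. The function and parameter names are
  mine, and returning 0 for a domain with no history is an assumption
  (it matches the "start with a 0 rating" idea used further down). It
  expects 0 < N < 100.

```python
def pass_sa_score(spam_count: int, ham_count: int,
                  n: float = 10.0, max_score: float = 5.0) -> float:
    """Map a domain's PASS history onto [-max_score, +max_score].

    n is the tolerated spam percentage (N above); max_score is the top
    of the filter's range (5 for a SpamAssassin-style scale).
    """
    total = spam_count + ham_count
    if total == 0:
        return 0.0  # assumption: no history yet means a neutral score
    p = 100.0 * spam_count / total  # P, as a percentage
    if p > n:
        # Worse than the tolerated limit: positive (spammy) score.
        return (p - n) / (100.0 - n) * max_score
    # At or below the limit: negative (favourable) score.
    return (n - p) / n * -max_score


if __name__ == "__main__":
    print(pass_sa_score(spam_count=55, ham_count=45))   # 2.5
    print(pass_sa_score(spam_count=0, ham_count=100))   # -5.0
```

  If the positive and negative ranges differ, the single max_score
  argument would have to be split in two.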


2. When you get a message:

   Evaluate the Pass_SA_Score.

   If it is within E of the SA max spam score, do not check SPF;
   just reject. You don't expect anything good out of this
   domain. Reset the statistic later if you ever change your
   mind.

   If it is not within E (ie, we're not absolutely sure it is
   spam), check its SPF record. Then, based on the result, take
   one of the following actions:

   switch (check_host(X)) [
      when "FAIL" : Reject.

      when "NXDOMAIN" : Reject.

      when "PermError" : Reject.

      when "TempError" : Reject temporarily (4xx message)

      when "PASS" :
                Using the Pass_SA_score calculated above:
                Evaluate(SpamAssassin, Pass_SA_score, MESSAGE)
                Enter the SA result in the stat table (update):
                  {domain, spam++, ham} or {domain, spam, ham++}
                Reject or deliver as per SA result.

      when "NEUTRAL" :
                SA_score = Neutral_SA_score_function;
                Evaluate(SpamAssassin, SA_score, MESSAGE)
                Enter the SA result in the stat table:
                  {"neutral", spam++, ham} or {"neutral", spam, ham++}
                Reject or deliver as per SA result.

      when "NONE" :
                SA_score = None_SA_score_function;
                Evaluate(SpamAssassin, SA_score, MESSAGE)
                Enter the SA result in the stat table:
                  {"none", spam++, ham} or {"none", spam, ham++}
                Reject or deliver as per SA result.

      when "SOFTFAIL" :
                SA_score = SoftFail_SA_score_function;
                Evaluate(SpamAssassin, SA_score, MESSAGE)
                Enter the SA result in the stat table:
                  {"softfail", spam++, ham} or {"softfail", spam, ham++}
                Reject or deliver as per SA result.
   ]
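
The control flow of step 2 could look roughly like this in Python. It
is only a sketch: decide() and its arguments are names I made up,
spf_lookup stands in for whatever check_host() implementation you use,
and treating E as a percentage of the maximum score is my reading of
the definition above.

```python
from typing import Callable, Optional, Tuple

MAX_SCORE = 5.0   # top of the filter's range
E_PERCENT = 0.1   # margin of error E, read as a percentage of MAX_SCORE


def decide(domain: str, pass_score: float,
           spf_lookup: Callable[[str], str]) -> Tuple[str, Optional[str]]:
    """Return (action, stats_row).

    action is "reject", "tempfail" or "filter"; stats_row names the row
    of the stat table to score against and update when action is "filter".
    """
    # Within E of the maximum: reject without spending an SPF lookup.
    if pass_score >= MAX_SCORE * (1.0 - E_PERCENT / 100.0):
        return ("reject", None)

    result = spf_lookup(domain).upper()
    if result in ("FAIL", "NXDOMAIN", "PERMERROR"):
        return ("reject", None)
    if result == "TEMPERROR":
        return ("tempfail", None)       # answer with a 4xx
    if result == "PASS":
        return ("filter", domain)       # the domain's own reputation
    if result in ("NEUTRAL", "NONE", "SOFTFAIL"):
        return ("filter", result.lower())  # one of the global rows
    raise ValueError("unexpected SPF result: " + result)
```

The "filter" action means: hand the message and the score for that
row to the spam filter, then record its verdict in the same row.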


So in this way we have an individual spam score for each domain
that emails us, and 3 global spam scores: Neutral score, None
score and Softfail score.

The Neutral_SA_score is calculated using the "neutral" row in the
stats table. Similarly, None_SA_score and Softfail_SA_score will
be calculated from their respective rows.
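
A stat table along these lines can be kept in SQLite. The sketch below
uses Python's standard sqlite3 module; the table layout and the helper
names are assumptions of mine, not part of any existing tool. Per-domain
rows and the three global rows ("neutral", "none", "softfail") live in
the same table.

```python
import sqlite3
from typing import Optional


def open_stats(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the stat table: one row per domain or global key."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS stats ("
        " row_key TEXT PRIMARY KEY,"
        " spam_count INTEGER NOT NULL DEFAULT 0,"
        " ham_count INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def record(db: sqlite3.Connection, row_key: str, is_spam: bool) -> None:
    """Feed one filter verdict back into the table (spam++ or ham++)."""
    column = "spam_count" if is_spam else "ham_count"  # fixed names only
    db.execute("INSERT OR IGNORE INTO stats (row_key) VALUES (?)", (row_key,))
    db.execute(
        f"UPDATE stats SET {column} = {column} + 1 WHERE row_key = ?",
        (row_key,),
    )
    db.commit()


def spam_percentage(db: sqlite3.Connection, row_key: str) -> Optional[float]:
    """Return P for a row, or None if there is no history yet."""
    row = db.execute(
        "SELECT spam_count, ham_count FROM stats WHERE row_key = ?",
        (row_key,),
    ).fetchone()
    if row is None or row[0] + row[1] == 0:
        return None
    return 100.0 * row[0] / (row[0] + row[1])
```

For example, four spam verdicts and one ham verdict recorded under
"neutral" give a P of 80 for the global neutral row.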


Some summarizing conclusions can be drawn:

* This is an automatic reputation system in a closed feedback loop.
  Thus, it adjusts itself to the times.

* Each domain that claims responsibility will be treated
  individually, and will have an individual score. Each will have
  its own reputation to uphold or destroy.

* Those domains that do not claim responsibility in their SPF
  records (ie, neutral/none/softfails) will be treated as the
  rest of the world.

* There is no manual input. No human needs make any decisions
  (save for the N parameter above).

* The non-publishers have a global score. They're treated the same from
  an authentication p.o.v.

* The neutrals have a global score. They're treated the same from
  an authentication p.o.v.

* The softfails have a global score. They're treated the same from
  an authentication p.o.v.


Some typical sender scenarios
=============================

Let's look at a few pathological cases (hotmail.com, aol.com,
nameless_isp_that_favours_spammers.com, myvanity.com, cisco.com,
bigger-longer.com):

A. hotmail.com case - publishes with ~all

   This means that users that use their normal interface will get
   treated as per the "hotmail.com" entry in the stat table. Due
   to their anti-spam policy, they will likely have a good
   ham/spam ratio, so as long as the mail comes from their
   servers ("PASS" SPF result), SpamAssassin will probably
   deliver it, as the SA_score will be very low.

   Those that send through other channels and spoof the hotmail
   name will get treated as per the 'softfail' entry in the stat
   table. This email will be scrutinized heavily by SpamAssassin,
   as most softfailed mail out there is spam (ie, the
   ham/spam global ratio for the 'softfail' entry in the stats
   table is very low).

   Some forgers will send mail from elsewhere and pretend to be a
   hotmail user. They will be treated as per SOFTFAIL scores.
   This is fine, as hotmail makes no claim to protect its users
   from having their address forged (yet, anyway).

   So, if one wants to forge a hotmail account they will see the
   same resistance from SA as any other SOFTFAIL source. If the
   mail is not the typical spam, but a more targeted
   personalized attack (like Jim pretending to be Bob when
   writing to Alice), it will probably slip past SA, as the
   contents of the message look believable. Again, hotmail is
   not (yet) in the business of protecting individuals.

B. aol.com - publishes with ?all (so far)

   Users that send through the aol infrastructure will enjoy a
   SpamAssassin treatment commensurate with aol's policy
   against spam.

   If aol sends very little real spam through its own servers,
   then the stats will be favourable. The aol.com mail will get
   through as reliably as possible. The better job aol does of
   cracking down on spammers, the more favourable (lower) its
   SA_score will be.

   The ?all means that if you use some other servers, but spoof
   the aol.com domain, you'll enjoy the same treatment as the
   other "neutral" policies.

C. nameless_isp_that_favours_spammers.com - publishes with -all

   Well, since mostly spam will be coming through here, the
   domain will get a pretty bad reputation. The honest user will
   suffer as their mail is rejected due to their ISP's reputation
   (relatively low ham/spam ratio). So they will eventually
   abandon this ISP.

   As the users abandon the ISP, the majority of customers left
   will be the spammers. The ham/spam ratio will go down even
   faster, so it will be easier to recognize spam. This domain
   name becomes useless for email.

   Their score may even hit the rail (come within E of the max
   limit), in which case, we'll even stop checking their SPF
   records, and save that bandwidth.

D. myvanity.com - publishes with -all

   They will be treated as per their own merit, as they have a
   "myvanity.com" entry in our stat table. When they send ham,
   the ratio goes up; when they send spam, the ratio goes down. An
   honest domain will have no problem keeping a favourable SA
   score.

   Forged myvanity.com mail will not affect their rating, as that
   mail will not result in PASS. Only PASS-ed email will
   contribute to the myvanity.com entry in the stat table.

E. cisco.com - publishes with -all

   As a serious company, they do not want any forgeries with the
   cisco.com name.

   Also, they do not (directly) send advertising to anyone. The
   employees are under tight control and do not send spam out.

   So, the email stream that comes out of this place is very
   clean.

   Therefore, the ham/spam ratio is close to the roof. Email from
   here will have no problem getting through SpamAssassin
   reliably, aided by a very favourable score.

   Forged @cisco will be bounced.

F. bigger-longer.com - publishes with -all and all its email is a 'PASS'

   They start with a 0 rating, and pretty soon SpamAssassin
   figures out that I'm not really interested in enlargements, so
   the ham/spam ratio goes down really fast on this one.

   Pretty soon, if not immediately, they hit the E roof, and the
   mail will be blocked at SMTP, without even checking its SPF
   record any more.



About the global scores (neutral/none/softfail)
===============================================

Currently most domains resolve to none. Also, 80% of the email
with 'none' is spam. So the global 'none' ham/spam ratio will be
very low, and its score probably around +4 (about 80% of the range).
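
As a quick check of that figure, here is the arithmetic in Python,
assuming the score function from step 1 with N = 10 and a maximum
score of 5 (both values are the examples used earlier, not
measurements):

```python
N = 10.0          # tolerated spam percentage
MAX_SCORE = 5.0   # top of the SpamAssassin-style range
P = 80.0          # share of 'none' mail that is spam, in percent

# P > N, so the positive branch of the score function applies.
none_score = (P - N) / (100.0 - N) * MAX_SCORE
print(round(none_score, 2))  # 3.89
```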

This means that all viagra jokes will be filtered out if the
sender doesn't publish an SPF record. All other mail which is
carefully worded, might go through ok, but certainly not
reliably. This is no worse than we have today.

If you publish an SPF record, and you show consistent PASS and no
spam from that domain name, your rating automatically goes up,
and all your viagra jokes go through as if the Assassin were
sleeping. A welcome change.

Maybe the spammers figure out that publishing a neutral SPF
yields the best delivery results. So, automatically all mail
systems in the world will learn this, and neutral will no longer
be a good record type to have. (v=spf1 a:myvanity.com ?all) may
still be ok, as it means that when you send from myvanity.com
servers you enjoy the benefits of your own reputation. When you
send from elsewhere, well, it will be a shot in the dark. You
should really get your mail infrastructure figured out in that
case.

Similarly, the scores associated with neutral, none, and softfail
will automagically adjust world wide, based on what the spammers
do.

This means that the only reliable way to have your mail treated
consistently, is to make the effort to get PASS. In that case,
you'll be at the mercy of your own reputation.

You may have noticed that the statistics database acts as both a
whitelist and a blacklist.

The Future
==========
What I like best about this auto-system is that, since PASS is
really the only result that everyone will strive for, it will be
easy to separate the wheat from the chaff, based on the
reputation they earn for themselves.



Reputation-lookup services
==========================
Also, it may be expensive for everyone to maintain their own
database, so several services offering the scoring data may
spring up. Their policy on accepting feedback will dictate the
quality of their data. For instance, a service that accepts any
feedback without scrutiny will not be worth much, as it can be
easily poisoned. But a service that only allows feedback from
reputable and big companies will likely have high quality data,
as the likelihood of poisoning will be low.

In this case, the main differentiator of these services will be
who they use as rating agents. I'd be perfectly happy with a
service that uses cisco.com's feedback, but I don't care for one
where any Tom, Dick or Harry can have a say.

Even the N parameter is decided not by the reputation service,
but by the MTA operator. So even though two sites might use the
same service, they calculate the SA scores locally using their
own chosen N parameter and the statistics provided by the
service. The service does not even know what the individual users
will do; it only takes into account the reputation agents'
opinion.
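
To illustrate, here is a small Python sketch in which two operators
take the same counts from a (hypothetical) reputation service and apply
their own N. The counts and the function name are invented for the
example.

```python
def local_score(spam_count: int, ham_count: int,
                n: float, max_score: float = 5.0) -> float:
    """Score computed locally from service-provided counts and a local N."""
    p = 100.0 * spam_count / (spam_count + ham_count)
    if p > n:
        return (p - n) / (100.0 - n) * max_score
    return (n - p) / n * -max_score


# Counts as a reputation service might publish them (made-up numbers).
spam_count, ham_count = 15, 85

strict = local_score(spam_count, ham_count, n=5.0)    # tolerates 5% spam
lenient = local_score(spam_count, ham_count, n=20.0)  # tolerates 20% spam

print(round(strict, 2))   # 0.53  -> slightly penalized
print(round(lenient, 2))  # -1.25 -> slightly favoured
```

Same data, different verdicts: the service only supplies the
statistics, and each site keeps its own policy.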


Impartiality
============

I used SpamAssassin in my writing, but you may have noticed that
the method is generic. Just replace SA with {generic filter} and
the usefulness of the method is unchanged.

The filter could even be a content-based filter, a challenge-
response system, or anything else that can take as input a piece
of email and a score and calculate a deliver/reject verdict.

By MESSAGE, I do not mean message body, but as much information
as the filter requires. For instance, a C/R system that
operates only on the envelope, will be fed with the score and the
envelope contents, and it will be expected to decide before DATA.

A content-based filter will have to look at the data. But in the
case of a consistent spammer domain, the score may eventually
reach (MAX-E) such that further email will be blocked just based
on the envelope MAIL FROM.
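
The "generic filter" idea can be written down as an interface. What
follows is a hedged Python sketch: the Message container, the Filter
protocol and the two toy filters are all hypothetical names of mine,
and the keyword test is a stand-in for a real content filter.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Message:
    """As much of the mail as the filter needs (hypothetical container)."""
    mail_from: str
    rcpt_to: str
    data: Optional[bytes] = None  # None while we are still before DATA


class Filter(Protocol):
    def deliver(self, message: Message, score: float) -> bool:
        """Return True to deliver, False to reject."""
        ...


class EnvelopeFilter:
    """Toy pre-DATA filter: rejects only when the score is within E of max."""

    def __init__(self, max_score: float = 5.0, e_percent: float = 0.1) -> None:
        self.threshold = max_score * (1.0 - e_percent / 100.0)

    def deliver(self, message: Message, score: float) -> bool:
        return score < self.threshold


class KeywordFilter:
    """Toy content filter: needs the DATA and adds the score as a bias."""

    def __init__(self, limit: float = 5.0) -> None:
        self.limit = limit

    def deliver(self, message: Message, score: float) -> bool:
        if message.data is None:
            raise ValueError("content filter needs the message data")
        content_points = 4.0 if b"viagra" in message.data.lower() else 0.0
        return content_points + score < self.limit
```

With these toy numbers, a viagra joke from a domain scoring -4 is
delivered, while the same text from a domain scoring +3.9 is rejected.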

Back to the Future
==================

Similarly, there may be a time (in a couple of years) when even
the global 'neutral' is tainted enough that it earns a +4.9 score,
and in that case, it will be scrutinized very thoroughly.

Note that only "PASS" results may eventually lead to rejection
without an SPF check (read above, it *is* possible). All the
other results still require an SPF evaluation. Even if, say, the
global 'neutral' score gets to the +4.9999 mark (which is
less than E away from the maximum of +5), the mail will still go
through the spam filter.

An indirect result will be (in a few years): most email that does
not result in a PASS is likely spam. Some of the PASS email will
also be spam. So evaluating a 10-query SPF record
that results in neutral is just a waste of bandwidth. In that
case, as an SPF checker I might not even want to evaluate SPF
completely, unless it looks like it can result in PASS.

So, unless it might result in a pass, it'd be better to skip SPF
checks in that context. This is why it is important for the m=
mask to provide info during the first record which will make it
clear how likely it is for the SPF check to result in a PASS.

This way we've come full circle to the 1-query optimization. It
wasn't my intention, but it seems to me to be the obvious
direction we should be heading towards.

So I think that if we could set up the email world the way I've
painted it, with the appropriate adjustments, of course, then we
can solve the spam problem for good in only a few years.

Somewhere along the way, mail through forwarders will result in
softfail, and will be treated very unfavourably. This means that
forwarding technology will likely be abandoned if it does not
adapt. The SRS solution may be just what is needed.


Regards,
Radu.