Re: How fragile is SPF ?



----- Original Message ----- 
From: "Gordon Fecyk" <gordonf(_at_)pan-am(_dot_)ca>
To: <ietf-mxcomp(_at_)imc(_dot_)org>
Sent: Tuesday, June 29, 2004 11:03 PM
Subject: RE: How fragile is SPF ?

Good old fashioned testing.

Actually, by this time AOL, AltaVista.com etc must have some
hard data on its effects.  It's been at least four months.
Numbers, anyone?


We have 7-8 months of operational experience with stored logs, and 1000+
installations at this point.  It was a low-key release.  In the first few
months I was collecting field tester logs to compare results.  My own support
system stats can be seen at http://www.winserver.com/antispam.  I can zip up
logs for anyone who wishes to run their own simulator or reporter.  They are
all very verbose and detailed.

Here are my summary points on SPF and overhead issues.

MARID cannot be a completely open-ended lookup system, if only for one
reason: there is no guarantee of effective database availability.  This is
obviously the case during the early stages.  But another, more obvious
reason is that most spam lookups are going to return no result or NXDOMAIN.

I see four (4) possible future scenarios for spammers when MARID is finally
in place:

1) Spammer will comply - Good Spammer, CANSPAM compliant.
2) Spammer will ignore it -  Question of legal status unknown.
3) Spammer will SPOOF SPF -  Malicious illegal Entry - US ECPA violation
4) Spammers/hackers will overload it - obvious crime. Call the FBI!

Of course, the benefit I (and most people) would be looking for is:

5) Spammers are reduced. (Go out of business? Stop trying?)

So what am I seeing with my 7-8 months of accumulated data?

I need to write some code to get these specific totals, but it is pretty
obvious what the trend is.

- Hardly any will comply. A few added legit SPF records.
- By far, most will ignore it.
- Some are spoofing domains with relaxed SPF policies (see comment about
Softfail/Neutral)
- 4-6 days with non-stop connections, reaching 100K or more per day.

Finally, whenever the daily connections were on a down curve for a few
weeks, they would pick right back up.  By and large the totals were pretty
steady; in fact, there were weeks where the daily totals differed by less
than 1%.  What had been a steady daily total of nearly 2500 over the last
few months is now up to an average of 5500 for the month of June.

So does it reduce spammers?

Unfortunately I can't say that it does, at least not by what I see.  But I
believe most people were aware of this and realized that the goal of any
tool added to control spam was to reduce the junk collected to a bare
minimum or zilch.  We have about a 90-94% rejection rate, so by and large
the customer base is extremely happy.

My main design work focused on reducing the DNS lookups required, and on
adding a 2821 IP-related rejection concept that further reduced the need to
do a final CBV check to see if the return address is acceptable.

This might be out of scope, but if we are concerned about SPF abuse and/or
DNS overhead/attacks, etc., then it needs to be stated that SMTP servers
need to do more to reduce the MARID/DNS requirement in the first place.
It's that plain and simple, and I believe 100% that most implementers will
eventually see this once they get going with it, like I have.  Our SMTP
server was pretty much a standard system like all others.  It didn't have
all the stuff it has now until we started to do all this extra checking
with DNS.  That is why I said it can't be such an open-ended equation if we
want it to provide the results we are looking for.  Something has to give,
and I do believe that eventually others will see that (if they haven't
already).

One thing SMTP developers can do first is to enforce SMTP compliance.

One simple check is the HELO syntax.  Go figure, by doing a simple domain
literal check,  you can knock out 10-12% of the spammers!  No DNS lookup
required.
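
To make the HELO idea concrete, here is a minimal Python sketch.  It is not
our actual code, and the exact rule is an assumption on my part for
illustration: a bare IP address is rejected as invalid syntax, a bracketed
address literal must match the connecting IP, and anything else must at
least look like a domain name.  The function name helo_is_acceptable is
hypothetical.  Note that no DNS lookup is made anywhere.

```python
import ipaddress
import re

# One DNS label: letters, digits, inner hyphens (illustrative, not exhaustive).
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
_DOMAIN = re.compile(rf"^{_LABEL}(?:\.{_LABEL})*$")


def helo_is_acceptable(helo_arg: str, peer_ip: str) -> bool:
    """Syntax-only HELO check; hypothetical rule set, no DNS involved."""
    arg = helo_arg.strip()

    # Address literal such as [192.0.2.1]: assume it must match the peer.
    if arg.startswith("[") and arg.endswith("]"):
        literal = arg[1:-1]
        if literal.lower().startswith("ipv6:"):
            literal = literal[5:]
        try:
            return ipaddress.ip_address(literal) == ipaddress.ip_address(peer_ip)
        except ValueError:
            return False

    # A bare IP address without brackets is not a valid HELO argument.
    try:
        ipaddress.ip_address(arg)
        return False
    except ValueError:
        pass

    # Otherwise it must at least look like a domain name.
    return bool(_DOMAIN.match(arg))
```

A server would call this right after reading the HELO/EHLO line and answer
with a 5xx reply when it returns False.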

Another easy item addresses the BULK spammers which is where I get most of
my connections (and spam attacks).

BULK spammers need to optimize their throughput too.  So they use dumb
streaming SMTP clients, possibly Perl or PHP scripts, that don't support
multi-line responses.  MARID operators should probably be adding an
ECPA/CANSPAM (or local national law) compliant system policy statement to
the welcome greeting.  Believe it or not, this will eliminate at least 40%
of these dumb bulk spammers.  They can't handle the perfectly acceptable,
SMTP-compliant multi-line greeting.
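
For anyone who has not built one, a multi-line reply per RFC 2821 puts a
hyphen after the reply code on every line except the last, which uses a
space.  The Python sketch below only formats such a greeting; the function
name and the policy wording in the usage note are made up for illustration.

```python
from typing import List


def multiline_reply(code: int, lines: List[str]) -> bytes:
    """Format an SMTP reply: 'code-text' on all lines but the last,
    'code text' on the last, each terminated by CRLF."""
    if not lines:
        raise ValueError("a reply needs at least one line")
    out = []
    for i, text in enumerate(lines):
        sep = " " if i == len(lines) - 1 else "-"
        out.append(f"{code}{sep}{text}\r\n")
    return "".join(out).encode("ascii")
```

For example, multiline_reply(220, ["mail.example.com ESMTP", "Unsolicited
bulk mail is not accepted here", "Ready"]) yields two "220-" lines followed
by a final "220 Ready" line.  A compliant client waits for that last line
before sending HELO; a dumb streaming client does not.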

Have spammers learned?

Well, I thought they would!  But I continue to see it.  I think we are
giving them more credit than they deserve.  But someone brought up a good
point: once everyone applies this idea, then maybe spammers will adjust.

I 100% agree, but isn't this good? We want spammers to change!  Once they
see what is going on,  some will begin to make the effort to comply with
other SPAM related efforts. Currently, they don't have the incentive to
change.

And the final SMTP improvement comes straight from RFC 2821 - "Section 3.3
Mail Transactions" should not be ignored.

I believe one of the absent MARID parameters is RCPT TO.  RCPT TO
validation should be performed by SMTP.  This saves us another 30-35% of
DNS lookup requirements, and it makes sense too!

MARID is really only for anonymous sender transactions for final
destination mail.

By traditional SMTP standards, routing (relaying) is allowed only for
authenticated senders using traditional SMTP methods (SMTP AUTH, IP allow
relay tables, POPB4SMTP).

So for routed mail (RCPT TO is not local),  authentication or trust is
already required by SMTP  thus nullifying any need for LMAP or DNS related
sender validation.
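
The decision described above can be sketched in a few lines of Python.  This
is only my reading of the logic, not our shipping code; the Session fields
and the returned action strings are hypothetical names chosen for the
example.

```python
from dataclasses import dataclass


@dataclass
class Session:
    authenticated: bool  # SMTP AUTH, IP allow relay table, or POPB4SMTP
    rcpt_is_local: bool  # RCPT TO domain is hosted on this server
    rcpt_exists: bool    # the local mailbox actually exists


def next_step(s: Session) -> str:
    """Decide what to do at RCPT TO before spending any DNS lookups."""
    if s.rcpt_is_local:
        if not s.rcpt_exists:
            return "reject-unknown-user"  # no sender lookup needed
        if s.authenticated:
            return "accept"               # trusted sender, no lookup
        return "sender-lookup"            # anonymous, final destination
    # Routed mail: trust is already required by SMTP itself.
    return "accept" if s.authenticated else "reject-relay"
```

Only one of the five outcomes reaches the MARID/DNS lookup at all: an
anonymous sender writing to an existing local mailbox.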

So doing more at the SMTP level will clearly reduce many of the MARID
overhead issues.  Simply by applying the SMTP-level checks above, you can
eliminate at least 70-75% of your MARID lookup needs.  You can't ignore this
or try to solve it using an open-ended MARID/DNS lookup.

As far as malicious SPF abuse goes, there is not much you can do other than
follow the specs.  Of course, your implementation needs to be free of
buffer exploits.  But I have not seen any exploitation in this area, at
least not malicious.  Ironically, I did see problems from "real systems" in
the form of improper setups or just big records.  I have seen one policy
like so:

        v=spf1 x.y.0.0/32 +mx -all

where the x.y is literally there.
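
A record like that is easy to flag before evaluating it.  Here is a rough
Python sketch that only checks whether each term names a known SPF
mechanism or has the name=value shape of a modifier; it does not validate
the arguments themselves.  The function name unknown_terms is hypothetical.

```python
import re

# Mechanism names defined by SPF version 1.
MECHANISMS = {"all", "include", "a", "mx", "ptr", "ip4", "ip6", "exists"}
# Anything shaped like name=value is treated as a modifier and skipped.
MODIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_.-]*=")


def unknown_terms(record: str) -> list:
    """Return the terms of an SPF record that name no known mechanism."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    unknown = []
    for term in terms[1:]:
        if MODIFIER.match(term):
            continue
        # Drop an optional qualifier, then cut at the first ':' or '/'.
        body = term[1:] if term[0] in "+-~?" else term
        name = re.split(r"[:/]", body, maxsplit=1)[0].lower()
        if name not in MECHANISMS:
            unknown.append(term)
    return unknown
```

Run against the record above, it returns the single term "x.y.0.0/32",
which has no mechanism name such as ip4: in front of it.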

Overall, I have two concerns with SPF:

o Macro Expansion:

If there is one SPF feature that stands out, it is the macro expansion.
This was the last part we implemented.  It could be another argument
against SPF, since it keeps SPF from being a straightforward, easy
implementation; i.e., new implementers may need to use a 3rd party solution
as opposed to keeping it in house.  So if there is one area that might not
be done correctly when done in-house, it is the macro expansion.

o Softfail/Neutral

This is what I see happening, however, not at high levels.

If I were the author, I would have made it very clear in the SPF
specification that the SoftFail/Neutral relaxed fallback should be viewed
as a temporary option with an inherent expiration or time limit (e.g., 6
months?).

In my opinion, this will probably be the one area where an SPF-spoofing
spammer can make itself look compliant while using a relaxed result policy.
It also lets spammers seek out other SPF domains with relaxed policies.

My recommendation is that an SPF client seeing a SoftFail/Neutral should
record the first-time usage of this record and then place a time limit on
its continued usage.
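
A minimal Python sketch of that bookkeeping follows.  It is an illustration
only: the 180-day grace period is just the "6 months?" floated above, the
class and method names are hypothetical, and a real client would persist
the first-seen dates rather than keep them in memory.

```python
from datetime import datetime, timedelta

# Assumption: roughly six months, as suggested in the text above.
GRACE = timedelta(days=180)


class RelaxedPolicyTracker:
    """Remember when a domain was first seen returning softfail/neutral
    and treat that result as a fail once the grace period has run out."""

    def __init__(self, grace: timedelta = GRACE):
        self.grace = grace
        self.first_seen = {}  # domain -> datetime of first relaxed result

    def effective_result(self, domain: str, result: str, now: datetime) -> str:
        result = result.lower()
        if result not in ("softfail", "neutral"):
            return result
        first = self.first_seen.setdefault(domain.lower(), now)
        if now - first > self.grace:
            return "fail"
        return result
```

The receiving side applies the time limit locally, so it works even if the
domain owner never tightens the published policy.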

I see this softfail/neutral concept as something that starts a new
specification off with a loophole spammers can use.  I understand the
reasons for it, but it should be coupled with some enforced expiration or
time limit.  AOL.COM should not be using it forever.  At some point, they
need to change that to a FAIL.

-- 
Hector Santos, Santronics Software, Inc.
http://www.santronics.com