spf-discuss

Need for Complexity in SPF Records

2005-03-27 14:36:51
Radu, I wrote this response yesterday, then today decided it doesn't sound quite right. I'm really not as sure of what I'm saying as it sounds. Show me I'm wrong, and I'll re-double my efforts to find solutions that don't abandon what is already in SPF, solutions like your mask modifier. Examples are the best way to do that. Your example.com below is almost there, but it still doesn't tell me why we really need exists and redirect.

At 07:21 PM 3/26/2005 -0500, Radu wrote:
David MacQuigg wrote:
At 04:06 PM 3/26/2005 -0500, Radu wrote:
David MacQuigg wrote:

Now I'm confused. If the reason for masks is *not* to avoid sending multiple packets, and *only* to avoid processing mechanisms that require another lookup, why do we need these lookups on the client side? Why can't the compiler do whatever lookups the client would do, and make the client's job as simple as possible?

Sorry for creating confusion.

Say that you have a policy that compiles to 1500 bytes.

The compiler will split it into 4 records of about 400 bytes each.

example.com     IN TXT \
     "v=spf1 exists:{i}.{d} ip4:... redirect=_s1.{d2} m=-65/8 m=24/8"
_s1.example.com IN TXT "v=spf1 ip4:.... .... ....  redirect=_s2.{d2}"
_s2.example.com IN TXT "v=spf1 ip4:.... .... ....  redirect=_s3.{d2}"
_s3.example.com IN TXT "v=spf1 ip4:.... .... ....  -all"

We want the mask to be applied after the exists:{i}.{d}. Since that mechanism was in the initial query and cannot be expanded to a list of IPs, the mask cannot possibly apply to it.

I think what you are saying is that the compiler can't get this down to a simple list of IPs, because we need redirects containing macros that depend on information only the client has. So if we are to put the burden of complex SPF evaluations on the server side, where it belongs, it seems we have to pass all the necessary information to the server in the initial query. We already pass the domain name. Adding the IP address should not be a big burden, and it would have some other benefits we discussed.

If you can find a way to do that and still keep the query cacheable, let me know. If it is compatible with the way DNS works currently, I'll even listen and pay attention. ;)

That 1 UDP packet might not seem like a lot. But currently it is cacheable and most of the time is not even seen on the internet. Making it uncacheable would be a many-fold burden on bandwidth. That's exactly why caching and the TTL mechanism were invented, and now you suggest we give it up?

No, I see your point. If we truly need %{i} macros, and we evaluate them on the server side, that would produce a different response record for every IP address, and it might not make sense to cache such records. Responses for SPF records with no %{i} macros would cache as always. The %{d} macros would not impair caching. Even the %{i} responses might be worth caching for a few minutes, if you are getting hammered by one IP.
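To make that concrete (the addresses here are made up): with a record like

   example.com  IN TXT  "v=spf1 exists:%{i}.%{d} -all"

a message from 192.0.2.1 ends up asking about the name 192.0.2.1.example.com, while one from 198.51.100.7 asks about 198.51.100.7.example.com. Every sending IP produces a different name, and therefore a different answer, so nothing useful accumulates in anyone's cache; only the TXT record holding the macro itself gets any benefit from caching.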

Whether the loss of caching on a few records is too high a price depends on the severity of the threatened abuse. Should we tolerate a small increase in DNS load for the normal flow of email, to limit the worst-case abuse of the %{i} macro? I don't know.

What I *would* do is discourage the widespread use of macros, redirects, and includes, and state in the standard that processing of records with these features SHOULD be lower priority than processing simple records. That may help to implement a defense mode if these features are abused.

Maybe I'm just not seeing the necessity of setups like the above example.com. I'm sure someone could come up with a scenario where it would be real nice if all SPF checkers could run a Perl script embedded in an SPF record, but we have to ask, is that really necessary to verify a domain name?

The "..." imply a list of ip4: mechanism that is 400-bytes long. That's why the chaining is necessary. ebay.com has something like that. hotmail.com uses something similar too. When you have lots of outgoing servers, you need more space to list them, no?

Why can't they make each group of servers a sub-domain with its own simple DNS records, as rr.com has done with its subdomains? _s3.example.com can have as many servers as can be listed in a 400 byte SPF record, and that includes some racks with hundreds of servers listed in one 20 byte piece of the 400 byte record. With normal clustering of addresses, I would think you could list thousands of servers in each subdomain, with nothing but ip4's in the SPF record.
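Something like this sketch is all I have in mind (addresses made up for illustration):

   example.com      IN TXT "v=spf1 ip4:192.0.2.0/25 -all"
   _s1.example.com  IN TXT "v=spf1 ip4:192.0.2.128/25 -all"
   _s2.example.com  IN TXT "v=spf1 ip4:198.51.100.0/24 -all"
   _s3.example.com  IN TXT "v=spf1 ip4:203.0.113.0/24 -all"

Each record is flat, nothing but ip4's and -all, no macros, no lookups, and DNS itself carries the hierarchy.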

As I understand it, users sending mail from _s3.example.com will still see 'example.com' in their headers, but the envelope address will be the real one, _s3.example.com. That's the one that needs to authenticate, and the one that will inherit its reputation from example.com.

Seems to me this is using DNS exactly the way it was intended, distributing the data out to the lowest levels, and avoiding the need to construct hierarchies within the SPF records. Sure, it can be done, but what is the advantage over just putting simple records at the lowest levels, and letting DNS take care of the hierarchy? Why does ebay.com need four levels of hierarchy in its SPF records?

If we simply can't sell SPF without all these whiz-bang features, I would say put it *all* on the server side. All the client should have to do is ask - "Hey <domain> is this <ip> OK?" We dropped that idea because it doesn't allow caching on the client side, but with a simple PASS/FAIL response, the cost of no caching is only one UDP round trip per email. This seems like small change compared to worries about runaway redirects, malicious macros, etc.
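For the sake of argument, the wire format could be as simple as a TXT lookup with the IP folded into the query name (the _spfq label is made up here, not anything in the current spec):

   ;; query, for a connection from 192.0.2.1 claiming example.com
   192.0.2.1._spfq.example.com.  IN TXT  ?
   ;; answer, generated however example.com likes
   192.0.2.1._spfq.example.com.  IN TXT  "PASS"

One UDP round trip, a one-word verdict, and all the complexity stays behind example.com's name servers.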

I'll humour you:

This server-side processing would not be happening on a caching server, correct? That would not save anything. I hope you agree.

If the caching server were in the domain which created the expensive SPF record, then it would save traffic to and from the client, at the expense of traffic within the domain that deserves it. If example.com needs 100 queries within their network to answer my simple query "Is this <ip> OK?", then they need to think about how to better organize their records. All I need is a simple PASS/FAIL, or preferably a list of IP blocks that I can cache to avoid future queries. (This should be the server's choice.)

What I *don't* want in answer to my simple query, is a complex script to run every time I have a similar query. That seems to be the fundamental source of our problem. SPF needs to focus on its core mission, authenticating domain names, and doing just that as efficiently and securely as possible. All these complex features seem to be motivated by a desire to provide functionality beyond the core mission - constructing DNS block lists, etc. Now we are finding that the complex features are not only slowing us down, but have opened up some unnecessary vulnerabilities to abuse.

So the only place where it might make a difference is if the evaluation was run on the authoritative server for the domain.

The problem with that is that authoritative servers are designed with performance and reliability in mind (as opposed to caching servers, which care more about cost savings). As such, the auth servers *do not* do recursive queries, as an SPF record evaluation might require. They also do not do any caching. They respond to every query with a piece of data they already have in memory or on disk. If they don't have that piece of information, they return an empty response or "it doesn't exist" (NXDOMAIN). They never look for it anywhere else. That's why they are authoritative. If they don't know about it, it doesn't exist.

Now, the spf compiler only makes sense if it is running on a master server. The master for a zone is itself authoritative. The above authoritative servers are slaves. They take the information from the master server and disseminate it as if it were their own. It is the administrator of the master zone server that allows them to do so. No other server can respond authoritatively to queries for the zone in question.

So, the only place the spf compiler makes sense is on the master server, because ultimately, it is the only one who really knows the facts. When the facts change, the master informs the slaves, which do a zone transfer in order to update their databases. So the truth propagates nearly instantly from the master to the slaves, and as such the slaves can be seen as clones of the master, identical in every way, except for the 3 second delay it takes them to transfer the zone files.
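As a sketch of what that looks like (the serial number is made up): the compiler writes the flattened records into the master's zone file and bumps the SOA serial, and the ordinary NOTIFY and zone-transfer machinery carries the change to the slaves:

   example.com.     IN SOA ns1.example.com. hostmaster.example.com. (
                           2005032701  ; serial, bumped after each compile
                           3600 900 604800 3600 )
   example.com.     IN TXT "v=spf1 ip4:192.0.2.0/24 redirect=_s1.example.com"
   _s1.example.com. IN TXT "v=spf1 ip4:198.51.100.0/24 -all"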

You cannot run the compiler on the slaves, because they might each evaluate the record differently, as they are coping with different network conditions (such as lost packets). They would each tell a different "truth" from each other and from the master server, and would no longer be authoritative.

Now, having the master zone server respond to queries that require it to do calculations of any kind is an absolute no-no. That is because no matter how big the zone is (yahoo, rr, whatever), there is only one master. Ok, there may be a couple, but their job is not to respond to queries, but to 'hold the authority'. The slaves are for responding to queries.

I would also say the slaves are the right machines on which to do whatever complex lookups are needed to answer a query. The owners of those machines are the only ones who will make the tradeoff of cost vs desired complexity.

So doing what you propose would require the DNS system to be turned upside down. The justification of SPF is just not good enough.

I don't see how this turns anything upside down. DNS is supposed to be decentralized. If complex lookups are necessary, having a bunch of slave servers do the work on behalf of a master server is consistent with decentralization.

Let's estimate the worst-case load on DNS if we say "no lookups, one packet only in any response". I'm guessing 90% of domains will provide a simple, compiled, cacheable list of IP blocks. This is as good as it gets, with the possible exception of a fallback to TCP if the packet is too long. The 10% with really complex policies may have a big burden from queries and computations within their own network, but what goes across the Internet is a simple UDP packet with a PASS or FAIL.

That response is not cacheable, but let's compare the added load to some other things that happen with each email. Setting up a TCP connection is a minimum of three packets. SMTP takes two packets for the HELO and response. MAIL FROM is another two. Then we need two for the authentication. At that point we can send a reject (one packet) and terminate the connection (4 packets).
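Adding those up:

   TCP handshake            3 packets
   HELO + response          2
   MAIL FROM + response     2
   SPF query + response     2
   reject                   1
   connection teardown      4
   -------------------------------
   total                   14 packets

so the two-packet SPF exchange is roughly one seventh of even that minimal conversation.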

Looks to me like the additional load on DNS is insignificant for normal mail, and only a small fraction of the minimum traffic per email in a DoS storm. Also, the additional load is primarily on the domain with the expensive SPF records, where it should be. Even if this were a spammer domain, and they weren't *really* doing any internal lookups, the load on their DNS server is two packets for every additional two-packet load on the victims. No amplification factor here.

How about this: All SPF records SHOULD be compiled down to a list of IPs. If you need more than that, then do as much as you like, but give the client a simple PASS or FAIL. Most domains will then say "Here is our list of IPs. Don't ask again for X hours." Only a few will say "Our policy is so complex, you can't possibly understand it. Send us every IP you want checked."
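For most domains the compiled answer could be as plain as this (addresses made up for illustration):

   example.com  IN TXT  "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.0/24 -all"

Nothing for the client to evaluate beyond matching the connecting IP against two blocks, and the whole record caches for its TTL. The few domains with policies too complex to compile would instead be handed each IP and answer on the fly.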

That's exactly what the exists:{i}.domain does. It tells the domain's server every IP the client wants checked, and the server checks it. Unfortunately, it is extremely expensive because it's AGAU.

If I were writing an SPF-doom virus, this is where I would start.

I need to get back to designing ICs. :>)

Nah... you've got some great ideas and I value your contribution and feedback.

And I appreciate your time in getting me up to speed on these problems. I hope one day I can return the favor.

-- Dave
************************************************************     *
* David MacQuigg, PhD      email:  dmquigg-spf at yahoo.com      *  *
* IC Design Engineer            phone:  USA 520-721-4583      *  *  *
* Analog Design Methodologies                                 *  *  *
*                                   9320 East Mikelyn Lane     * * *
* VRS Consulting, P.C.              Tucson, Arizona 85710        *
************************************************************ *

