Need for Complexity in SPF Records
2005-03-27 14:36:51
Radu, I wrote this response yesterday, then today decided it doesn't sound
quite right. I'm really not as sure of what I'm saying as it sounds. Show
me I'm wrong, and I'll re-double my efforts to find solutions that don't
abandon what is already in SPF, solutions like your mask
modifier. Examples are the best way to do that. Your example.com below is
almost there, but it still doesn't tell me why we really need exists and
redirect.
At 07:21 PM 3/26/2005 -0500, Radu wrote:
David MacQuigg wrote:
At 04:06 PM 3/26/2005 -0500, Radu wrote:
David MacQuigg wrote:
Now I'm confused. If the reason for masks is *not* to avoid sending
multiple packets, and *only* to avoid processing mechanisms that
require another lookup, why do we need these lookups on the client
side? Why can't the compiler do whatever lookups the client would do,
and make the client's job as simple as possible?
Sorry for creating confusion.
Say that you have a policy that compiles to 1500 bytes.
The compiler will split it into four records of about 400 bytes each.
example.com IN TXT \
"v=spf1 exists:{i}.{d} ip4:... redirect=_s1.{d2} m=-65/8 m=24/8"
_s1.example.com IN TXT "v=spf1 ip4:.... .... .... redirect=_s2.{d2}"
_s2.example.com IN TXT "v=spf1 ip4:.... .... .... redirect=_s3.{d2}"
_s3.example.com IN TXT "v=spf1 ip4:.... .... .... -all"
We want the mask to be applied after the exists:{i}.{d}. Since that
mechanism was in the initial query and cannot be expanded to a list of
IPs, the mask cannot possibly apply to it.
I think what you are saying is that the compiler can't get this down to a
simple list of IPs, because we need redirects containing macros that
depend on information only the client has. So if we are to put the
burden of complex SPF evaluations on the server side, where it belongs,
it seems we have to pass all the necessary information to the server in
the initial query. We already pass the domain name. Adding the IP
address should not be a big burden, and it would have some other benefits
we discussed.
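For reference, the macro expansion at issue works roughly as below. This is a minimal sketch covering only the %{i} (connecting IP) and %{d} (current domain) macros; real SPF macro syntax has more letters and transformers. It illustrates why the client's IP would have to travel with the query:

```python
# Minimal sketch of SPF macro expansion for %{i} and %{d}.
# Only these two macros are handled; real SPF defines many more.
def expand(spec, ip, domain):
    return spec.replace("%{i}", ip).replace("%{d}", domain)

# exists:%{i}.%{d} turns into a per-IP DNS lookup name:
name = expand("%{i}.%{d}", "192.0.2.10", "example.com")
# → "192.0.2.10.example.com"
```

Because the expanded name differs for every connecting IP, a response that depends on it is effectively unique per sender.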
If you can find a way to do that and still keep the query cacheable, let
me know. If it is compatible with the way DNS works currently, I'll even
listen and pay attention. ;)
That 1 UDP packet might not seem like a lot. But currently it is cacheable,
and most of the time it is not even seen on the Internet. Making it
uncacheable would multiply the bandwidth burden. That's exactly
why caching and the TTL mechanism were invented, and now you suggest we
give that up?
No, I see your point. If we truly need %{i} macros, and we evaluate them on
the server side, that would produce a different response record for every
IP address, and it might not make sense to cache such records. Responses
for SPF records with no %{i} macros would cache as always. The %{d} macros
would not impair caching. Even the %{i} responses might be worth caching
for a few minutes, if you are getting hammered by one IP.
Whether the loss of caching on a few records is too high a price depends on
the severity of the threatened abuse. Should we tolerate a small increase
in DNS load for the normal flow of email, to limit the worst-case abuse of
the %{i} macro? I don't know.
What I *would* do is discourage the widespread use of macros, redirects,
and includes, and state in the standard that processing of records with
these features SHOULD be lower priority than processing simple
records. That may help to implement a defense mode if these features are
abused.
Maybe I'm just not seeing the necessity of setups like the above
example.com. I'm sure someone could come up with a scenario where it
would be real nice if all SPF checkers could run a Perl script embedded
in an SPF record, but we have to ask, is that really necessary to verify
a domain name?
The "..." implies a list of ip4: mechanisms about 400 bytes long. That's
why the chaining is necessary. ebay.com has something like that.
hotmail.com uses something similar too. When you have lots of outgoing
servers, you need more space to list them, no?
Why can't they make each group of servers a sub-domain with its own simple
DNS records, as rr.com has done with its subdomains? _s3.example.com can
have as many servers as can be listed in a 400 byte SPF record, and that
includes some racks with hundreds of servers listed in one 20 byte piece of
the 400 byte record. With normal clustering of addresses, I would think
you could list thousands of servers in each subdomain, with nothing but
ip4's in the SPF record.
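Rough arithmetic supports this claim. With CIDR notation, a ~20-byte mechanism covers a whole /24, so one record holds thousands of addresses (the 400-byte figure is the one used above; the rest are my estimates):

```python
# Back-of-the-envelope: how many addresses fit in one 400-byte SPF
# record if servers are clustered into /24 CIDR blocks (estimates).
record_budget = 400 - len("v=spf1 ") - len(" -all")
mech = "ip4:203.0.113.0/24"     # ~18 bytes + 1 separator, covers 256 hosts
mechs_per_record = record_budget // (len(mech) + 1)
addresses = mechs_per_record * 256
print(mechs_per_record, addresses)   # → 20 5120
```

So a single flat record per subdomain already covers on the order of five thousand addresses, with no redirects at all.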
As I understand it, users sending mail from _s3.example.com will still see
'example.com' in their headers, but the envelope address will be the real
one, _s3.example.com. That's the one that needs to authenticate, and the
one that will inherit its reputation from example.com.
Seems to me this is using DNS exactly the way it was intended, distributing
the data out to the lowest levels, and avoiding the need to construct
hierarchies within the SPF records. Sure, it can be done, but what is the
advantage over just putting simple records at the lowest levels, and
letting DNS take care of the hierarchy? Why does ebay.com need four levels
of hierarchy in its SPF records?
If we simply can't sell SPF without all these whiz-bang features, I would
say put it *all* on the server side. All the client should have to do is
ask - "Hey <domain> is this <ip> OK?" We dropped that idea because it
doesn't allow caching on the client side, but with a simple PASS/FAIL
response, the cost of no caching is only one UDP round trip per
email. This seems like small change compared to worries about runaway
redirects, malicious macros, etc.
I'll humour you:
This server-side processing would not be happening on a caching server,
correct? That would not save anything. I hope you agree.
If the caching server were in the domain which created the expensive SPF
record, then it would save traffic to and from the client, at the expense
of traffic within the domain that deserves it. If example.com needs 100
queries within their network to answer my simple query "Is this <ip> OK?",
then they need to think about how to better organize their records. All I
need is a simple PASS/FAIL, or preferably a list of IP blocks that I can
cache to avoid future queries. (This should be the server's choice.)
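The "server's choice" idea could be sketched as a decision between two response types. This is entirely hypothetical; no such response format exists in SPF, and the function and constant names are mine:

```python
# Hypothetical "server's choice" responder: answer a simple query with
# either a cacheable IP-block list or a per-IP PASS/FAIL verdict.
import ipaddress

KNOWN_BLOCKS = [ipaddress.ip_network("192.0.2.0/24")]

def answer(query_ip, policy_is_simple=True):
    if policy_is_simple:
        # Cacheable: hand back the whole list with a long TTL.
        return ("BLOCKS", [str(n) for n in KNOWN_BLOCKS], 3600)
    # Policy too complex to publish: evaluate just this IP, zero TTL.
    ip = ipaddress.ip_address(query_ip)
    verdict = "PASS" if any(ip in n for n in KNOWN_BLOCKS) else "FAIL"
    return ("VERDICT", verdict, 0)
```

Most domains would take the first branch; only the few with genuinely complex policies would pay the uncacheable per-IP cost.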
What I *don't* want in answer to my simple query, is a complex script to
run every time I have a similar query. That seems to be the fundamental
source of our problem. SPF needs to focus on its core mission,
authenticating domain names, and doing just that as efficiently and
securely as possible. All these complex features seem to be motivated by a
desire to provide functionality beyond the core mission - constructing DNS
block lists, etc. Now we are finding that the complex features are not
only slowing us down, but have opened up some unnecessary vulnerabilities
to abuse.
So the only place where it might make a difference is if the evaluation
was run on the authoritative server for the domain.
The problem with that is that authoritative servers are designed with
performance and reliability in mind (as opposed to caching servers, which
care more about cost savings). As such, the auth servers *do not* do
recursive queries, as an SPF record evaluation might require. They also do not
do any caching. They respond to every query with a piece of data they
already have in memory or on disk. If they don't have that piece of
information, they return an empty response or "it doesn't exist"
(NXDOMAIN). They never look for it anywhere else. That's why they are
authoritative. If they don't know about it, it doesn't exist.
Now, the SPF compiler only makes sense if it is running on a master server.
The master for a zone is itself authoritative. The authoritative
servers described above are slaves. They take the information from the
master server and disseminate it as if it were their own. It is the
administrator of the master zone server that allows them to do so. No
other server can respond
authoritatively to queries for the zone in question.
So, the only place the spf compiler makes sense is on the master server,
because ultimately, it is the only one who really knows the facts. When
the facts change, the master informs the slaves, which do a zone transfer
in order to update their databases. So the truth propagates nearly
instantly from the master to the slaves, and as such the slaves can be
seen as clones of the master, identical in every way, except for the 3
second delay it takes them to transfer the zone files.
You cannot run the compiler on the slaves, because they might each
evaluate the record differently, as they are coping with different network
conditions (such as lost packets). They would then each tell a different
"truth" from each other and from the master server, and would no longer
be authoritative.
Now, having the master zone server respond to queries that require it to
do calculations of any kind is an absolute no-no. That is because no
matter how big the zone is (yahoo, rr, whatever), there is only one
master. Ok, there may be a couple, but their job is not to respond to
queries, but to 'hold the authority'. The slaves are for responding to queries.
I would also say the slaves are the right machines on which to do whatever
complex lookups are needed to answer a query. The owners of those machines
are the only ones who will make the tradeoff of cost vs desired complexity.
So doing what you propose would require the DNS system to be turned upside
down. The justification of SPF is just not good enough.
I don't see how this turns anything upside down. DNS is supposed to be
decentralized. If complex lookups are necessary, having a bunch of slave
servers do the work on behalf of a master server is consistent with
decentralization.
Let's estimate the worst-case load on DNS if we say "no lookups, one packet
only in any response". I'm guessing 90% of domains will provide a simple,
compiled, cacheable list of IP blocks. This is as good as it gets, with
the possible exception of a fallback to TCP if the packet is too long. The
10% with really complex policies may have a big burden from queries and
computations within their own network, but what goes across the Internet is
a simple UDP packet with a PASS or FAIL.
That response is not cacheable, but let's compare the added load to some
other things that happen with each email. Setting up a TCP connection is a
minimum of three packets. SMTP takes two packets for the HELO and
response. MAIL FROM is another two. Then we need two for the
authentication. At that point we can send a reject (one packet) and
terminate the connection (4 packets).
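Totaling the packet counts above (taking the per-step figures in the text at face value):

```python
# Packet count per rejected email, using the estimates in the text.
tcp_setup  = 3   # SYN, SYN-ACK, ACK
helo       = 2   # HELO + response
mail_from  = 2   # MAIL FROM + response
auth_check = 2   # one DNS query + response for the SPF check
reject     = 1   # SMTP rejection
teardown   = 4   # connection close
total = tcp_setup + helo + mail_from + auth_check + reject + teardown
print(total)     # → 14
```

The uncacheable DNS pair is 2 of those 14 packets, or about 14% of the minimum traffic for a rejected message.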
Looks to me like the additional load on DNS is insignificant for normal
mail, and only a few percent of the minimum traffic per email in a DoS
storm. Also, the additional load is primarily on the domain with the
expensive SPF records, where it should be. Even if this were a spammer
domain, and they weren't *really* doing any internal lookups, the load on
their DNS server is two packets for every additional two-packet load on the
victims. No amplification factor here.
How about this: All SPF records SHOULD be compiled down to a list of
IPs. If you need more than that, then do as much as you like, but give
the client a simple PASS or FAIL. Most domains will then say "Here is
our list of IPs. Don't ask again for X hours." Only a few will say "Our
policy is so complex, you can't possibly understand it. Send us every IP
you want checked."
That's exactly what the exists:{i}.domain does. It tells the domain about
every IP it wants checked, and the server checks it. Unfortunately, it is
extremely expensive because it's AGAU.
If I were writing an SPF-doom virus, this is where I would start.
I need to get back to designing ICs. :>)
Nah... you've got some great ideas and I value your contribution and feedback.
And I appreciate your time in getting me up to speed on these problems. I
hope one day I can return the favor.
-- Dave
************************************************************
* David MacQuigg, PhD         email: dmquigg-spf at yahoo.com
* IC Design Engineer          phone: USA 520-721-4583
* Analog Design Methodologies
* VRS Consulting, P.C.        9320 East Mikelyn Lane
*                             Tucson, Arizona 85710
************************************************************