spf-discuss

Re: Re: DNS load research

2005-03-21 12:01:49
Andy Bakun wrote:
Have we fully explored different weights for each mechanism based on
what kind of DNS load they exhibit?

Not by a longshot.

We are kind of at a stalemate here.  Radu wants everyone to have a
"zero load SPF record", which is actually a valid goal (and good
buzz-phrase).  Others don't want to suggest that people not use the
features of SPF that actually make it SPF because those features have
legit uses (zero load SPF records are just RMX in disguise).  There is a
middle ground.

I would say 'low load' or 'minimum load'. I understand that zero is not feasible.

These numbers below are just an example.  I chose these weights based on
my understanding of how expensive each one is to DNS, how likely it is
that the MTA would do that lookup anyway, and whether the result is
logically cacheable; my understanding may be flawed.

        mechanism/modifier  |  weight
                all         |    0
                a           |    2
                mx          |    1
                ptr         |    2
                ip4         |    0
                ip6         |    0
              include       |    1
              exists        |    3
              redirect      |    2

        The baseline is 2.  I've given mx 1 because the MTA needs to
        look this up anyway, so it's a lookup, but it's cheaper than
        other queries that the MTA might not need to do without SPF
        (although, many MTAs do a and ptr lookups, but that's not
        required to accept mail, and may be less required when SPF sees
        significant deployment).
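As a sketch, here is how such a weighting might be applied to a record. The weight table matches the proposal above; the parser and function name are purely illustrative, not part of any SPF specification:

```python
# Hypothetical per-term DNS "weight" table from the proposal above.
# These values are the discussion's example, not a standard.
WEIGHTS = {
    "all": 0, "a": 2, "mx": 1, "ptr": 2,
    "ip4": 0, "ip6": 0, "include": 1, "exists": 3, "redirect": 2,
}

def record_weight(spf_record: str) -> int:
    """Sum the weights of all mechanisms/modifiers in an SPF record."""
    total = 0
    for term in spf_record.split()[1:]:      # skip the "v=spf1" version tag
        term = term.lstrip("+-~?")           # drop any qualifier prefix
        # strip ":value" (mechanisms) or "=value" (modifiers)
        name = term.split(":", 1)[0].split("=", 1)[0].lower()
        total += WEIGHTS.get(name, 0)
    return total

print(record_weight("v=spf1 a mx ip4:192.0.2.0/24 include:example.net -all"))
# a(2) + mx(1) + ip4(0) + include(1) + all(0) = 4
```

A checker could then reject records whose total weight exceeds some agreed limit, analogous to the current mechanism-count limit.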

This is not quite correct. When you get incoming spam, you have to look up MX records for domains that you would otherwise have no reason to look up, since you don't correspond with them.

The MX is an indirect mech. Every time you see MX, be ready for at least two queries. One to get the list of MX mailers, and at least one to get the A record of the first mailer. Also, when you see MX, you have no idea how many lookups it will take to get to the bottom of it. You have to do one query to find out.

For these reasons, the MX mechanism is at least twice as expensive as an A mechanism.
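The indirection can be modeled with a toy query counter. This is an assumption for illustration, not a real resolver: one query fetches the MX host list, then each listed host needs its own A lookup before any IP can be compared against the sender:

```python
# Toy model of mx-mechanism cost: 1 MX query plus one A query per
# MX host not already in the resolver cache. Illustrative only.
def mx_query_count(mx_hosts: list[str], cached_a: set[str] = frozenset()) -> int:
    queries = 1                      # the MX query itself
    for host in mx_hosts:
        if host not in cached_a:     # each uncached host costs an A query
            queries += 1
    return queries

print(mx_query_count(["mx1.example.com"]))                     # 2: minimum cost
print(mx_query_count(["mx1.example.com", "mx2.example.com"]))  # 3
```

Even in the best case (a single, uncached MX host) the count is 2, versus 1 for a plain "a" mechanism, which is the "at least twice as expensive" point above.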

Also, MX mechanisms really are worthless, but expensive:

When you list an MX mechanism, there can be two possible scenarios:

1. You control that MX mechanism.
   So you know all the mailers, and you should list them (by IP :) )

2. You don't control it, it's in someone else's domain.
   So you don't know the mailers, and you're guessing.
   You've a better chance of guessing wrong than right, as in many
   installations, outgoing mail goes through different servers than
   incoming (which is what MX is for).
   We used t-online.de as an example of this guess-work.

I'm not sure what the total allowed weight should be before returning
PermError, but I don't see any problem with using Wayne's current
values.  Again, larger limits make the hard things (complex setups)
possible.  But people need to realize that their setup is complex, as a
way to drive change.  Weighing the SPF record like this could be a step
in that direction.

I believe it's not the mechanisms themselves that have the highest load potential. We'd have to look at what is cacheable and what isn't to get a picture of where the load really is.

In the following, I will keep in mind a scenario with 50 zombies, each sending me mail forged to show 200 different users (randomly generated names, like sldkjsfoiu@yahoo.com) @ the same set of 50 different domains (4 random users per domain). (I.e., the zombies all have the same mailing list to send to.) Assume that each of the 50 domains uses a mechanism like shown below; assume that the random algorithm is the same on all zombies, and it generates the same sequence of 200 unique usernames.

This scenario is one I see daily in my server logs. Fortunately not everyone is publishing SPF yet. If/when everyone does publish SPF, the costs below become much closer to reality than they are today:

Mechanisms like A:domain.com will cost 1 query across the internet, and 49 hits to the cache. Grand total: 50 queries to the internet, 50*(50*200 - 1) to the cache.

A mechanism that uses the %{d} (domain name) macro will be easily cacheable, so that will also cost 1 query on the internet and 49 to the cache. Grand total: 50 queries to the internet, 50*(50*200 - 1) to the cache.

A mech that uses %{i} (IP address) will not be so easily cacheable, and the grand total cost = number of domains * number of IPs (50*50) queries to the internet, and 50*50*(200-1) to the cache.

A mech that uses %{l} (user name of sender) will also not be easily cacheable, and will result in 50*200 queries to the internet, and 50*200*(50-1) queries to the cache.

A mech that uses both the %{i} and the %{l} or %{s} macros (like altavista.com - +exists:CL.%{i}.FR.%{s}.HE.%{h}.null.spf.altavista.com) will cost a full 50*50*200 queries to the internet. Pray that the zombies don't forge altavista, or you'll get acquainted with the wrath of SPF.
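The four cases above follow one pattern: every distinct (domain, macro expansion) pair costs one real DNS query, and all repeats are cache hits. A short sketch of that arithmetic, using the scenario's numbers (the function name is mine, purely illustrative):

```python
# Query-count arithmetic for the zombie scenario described above:
# 50 zombies x 50 forged domains x 200 unique local parts.
ZOMBIES, DOMAINS, USERS = 50, 50, 200
TOTAL = DOMAINS * ZOMBIES * USERS        # one SPF lookup per message per domain

def cost(distinct_keys_per_domain: int) -> tuple[int, int]:
    """Return (internet queries, cache hits): each distinct cache key per
    domain costs one real query; everything else is served from cache."""
    internet = DOMAINS * distinct_keys_per_domain
    return internet, TOTAL - internet

print(cost(1))                # a:domain.com or %{d}:  (50, 499950)
print(cost(ZOMBIES))          # %{i} macro:            (2500, 497500)
print(cost(USERS))            # %{l} macro:            (10000, 490000)
print(cost(ZOMBIES * USERS))  # %{i} and %{l} both:    (500000, 0)
```

The last case shows why combined macros defeat caching entirely: every single message generates a fresh query to the publisher's DNS.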

I have not mentioned this before, because these macros are the "complex setups" that make hard things possible. This was one of the fundamental reasons for SPF's existence, so I'll let it be.

The DNS limit however acts as an amplifier to the costs of the macros. So it should be kept as low as possible.

If the goal is to have everyone publish SPF, we must deal with that scenario. How much will it cost our DNS infrastructure if everyone publishes SPF? Currently only a small percentage do, and looking at the traffic numbers I don't like where they are headed.

I think you ask a great question, but to answer it well it would require a lot more research and thought.

Regards,
Radu.




-------
Sender Policy Framework: http://spf.pobox.com/
Archives at http://archives.listbox.com/spf-discuss/current/
Read the whitepaper!  http://spf.pobox.com/whitepaper.pdf
To unsubscribe, change your address, or temporarily deactivate your subscription, please go to http://v2.listbox.com/member/?listname=spf-discuss@v2.listbox.com

Attachment: radu.vcf
Description: Vcard
