Re: What to do about redirect= and NXDOMAIN?

IMHO, when include: or redirect= points at a "problem" domain (a
nonexistent domain, or an existing domain with no spf record),
compliant receivers should do whatever follows from the result that an
ordinary spf evaluation of that same domain would give.

This is in accordance with the principle of least surprise.  

(Isn't it convenient when you can make a hand-waving argument based on
an abstract principle that happens to agree with what you think makes
sense?  :-)  )

Anyway, by this argument the result matrix should be:

             |     non-existent domain      | domain w/o SPF rec
  -----------+------------------------------+----------------------------------
   include:  | according to NXDOMAIN result | according to No-SPF-record result
   redirect= | according to NXDOMAIN result | according to No-SPF-record result

Since by definition:

  NXDOMAIN result:  spf(domain that doesn't exist) == None
    No SPF record:  spf(existing domain w/o spf record) == None

The above chart would translate to:

             | non-existent domain | domain w/o SPF record
  -----------+---------------------+-----------------------
   include:  | None->not match     | None->not match
   redirect= | None                | None
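
To make the chart concrete, here is a minimal Python sketch of the
mapping it describes.  The names (Result, SpfError, include_matches,
redirect_result) are made up for illustration and come from no
particular SPF library.  The None handling is the proposal argued in
this message, not necessarily what any given draft says.

```python
from enum import Enum


class Result(Enum):
    NONE = "None"
    NEUTRAL = "Neutral"
    PASS = "Pass"
    FAIL = "Fail"
    SOFTFAIL = "SoftFail"
    TEMPERROR = "TempError"
    PERMERROR = "PermError"


class SpfError(Exception):
    """Hypothetical exception used here to propagate error results."""


def include_matches(nested: Result) -> bool:
    """Decide whether include: matches, given the nested evaluation result."""
    if nested in (Result.TEMPERROR, Result.PERMERROR):
        # Error results from the included domain propagate upward.
        raise SpfError(nested.value)
    # Proposal in this message: None (NXDOMAIN or no SPF record) is simply
    # "not match", just like Fail, SoftFail and Neutral.
    return nested is Result.PASS


def redirect_result(nested: Result) -> Result:
    """redirect= hands back whatever the target domain evaluates to."""
    return nested
```

With these definitions, include_matches(Result.NONE) is False and
redirect_result(Result.NONE) is Result.NONE, which is exactly the
bottom two rows of the chart.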

However, there's one minor annoyance with the above scheme which I think
can be entirely answered:

There's always been a bit of a disagreement as to what would be the best
definition of spf(non-existent domain).

The current defined result is "None".  The logic behind that
decision/compromise as I understand it is that:

  One camp:  Some want FAIL or PermError to be the defined
             result so they can immediately reject.

  Another camp:  Some want it to be defined to be "None",
                 since there really isn't an spf policy
                 published.

  Compromise:  Since most people already reject on a
                nonexistent mailfrom domain anyway, the
                issues the first camp cares about are
                already taken care of, so the technically
                pure answer of the second camp is still a
                safe answer for the first.  This means that
                defining the result the way the second camp
                wants still placates both groups.

That works well for top-level spf evaluations, but the compromise isn't
quite as clear for recursive-type evaluations, because receivers aren't
already handling the issue in a way that's conveniently compatible with
current/future spf practice.

Now, at this point I have to point out that I'm of the strong opinion
that spf records that are unquestionably "bad" in some way (syntax
errors, requiring too many queries, or the like) should return
PermError, *and* that PermError should result in a 5xx reject.

So I can understand the logic in saying that perhaps including a
nonexistent domain should cause a PermError, thus potentially causing
all mail sent with a mailfrom in that domain to be rejected, especially
given that in most cases this sort of thing is an error on the part of
the writer of the spf record.

However, it's entirely possible that a domain owner could legitimately
include a nonexistent domain.  Say they're switching their email setup
around, and to be safe they keep include:-ing domains in a top-level spf
record for slightly longer than those domains themselves exist.

Should we cause the domain owners to suffer if they do that, based on a
desire to make sure the typos other domain owners make get fixed fast?

As much as I'm normally in favor of errors showing up early and loudly,
I don't think that logic holds when there is any justification for
having a seemingly-bad record.

So in my mind, as long as an spf evaluation of a nonexistent domain
results in "none", then the table should still be:

             | non-existent domain | domain w/o SPF record
  -----------+---------------------+-----------------------
   include:  |     not match       | not match
   redirect= |     None            | None

The downside here is that folks who enter records such as:

 v=spf1 a mx include:nonexistent-domain -all
 or
 v=spf1 a mx redirect=nonexistent-domain

will just have to deal with the more difficult troubleshooting caused by
their include mechanisms or redirect modifiers going "invisible" if the referred-to
domain's records disappear.
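
Here is a rough toy evaluator in Python showing what "invisible" means
under the table above.  Everything in it is an assumption for
illustration: the RECORDS dict stands in for DNS, the domain names are
made up, and it uses ip4: with exact string comparison in place of a
and mx so that no real lookups are needed.  It is nowhere near a
complete SPF implementation.

```python
# Toy zone data: domain -> SPF record text.  A domain missing from this
# dict stands in for both NXDOMAIN and "exists but has no SPF record".
RECORDS = {
    "example.org": "v=spf1 ip4:192.0.2.1 include:gone.example -all",
    "example.net": "v=spf1 ip4:192.0.2.1 redirect=gone.example",
}

QUALIFIERS = {"+": "Pass", "-": "Fail", "~": "SoftFail", "?": "Neutral"}


def check(ip, domain, records=RECORDS):
    """Return a result name for ip against domain's toy SPF record."""
    record = records.get(domain)
    if record is None:
        return "None"  # nonexistent domain or no record published

    redirect = None
    for term in record.split()[1:]:  # skip the "v=spf1" version tag
        if term.startswith("redirect="):
            redirect = term[len("redirect="):]
            continue
        qualifier = "+"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        if term == "all":
            matched = True
        elif term.startswith("ip4:"):
            matched = term[len("ip4:"):] == ip  # exact match only
        elif term.startswith("include:"):
            # A nested None is just "not match", per the table above.
            matched = check(ip, term[len("include:"):], records) == "Pass"
        else:
            matched = False  # anything else never matches in this toy
        if matched:
            return QUALIFIERS[qualifier]

    if redirect is not None:
        # redirect= is only consulted when no mechanism matched.
        return check(ip, redirect, records)
    return "Neutral"
```

For a sender at 198.51.100.7, check("198.51.100.7", "example.org")
gives "Fail", the same answer as if the include: were not there at all,
and check("198.51.100.7", "example.net") gives "None".  Nothing in
either result hints that gone.example is missing, which is the
troubleshooting cost described above.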

There is an advantage in this:  People in a large organization that's
putting together spf records can "include:" other records ahead of time,
instead of having to be *VERY* careful to fill out
leaves-before-branches, so to speak.

The fact that there's a deployment advantage to doing something that is
internally consistent to begin with is IMHO just another reason to use
that internal consistency argument.

(And as a side note, I think this leaves PermErrors showing up only for
actual syntax errors and unknown mechanisms, which are far more
reject-able than the not-necessarily-an-error cases above.)

-- 
Mark Shewmaker
mark(_at_)primefactor(_dot_)com
770-933-3250