Re: The (almost) final SPFv1 spec: draft-schlitt-spf-classic-01pre5


Note:  I'm not trimming comments very much in this reply.  My comments
are mostly short, but I think the context for the comments is
important and it was too much work to correctly paraphrase stuff.


In <200505060424(_dot_)05104(_dot_)bulk(_at_)mehnle(_dot_)net> Julian Mehnle 
<bulk(_at_)mehnle(_dot_)net> writes:

Wayne Schlitt wrote:

[snip]


-------------------------------------------------------------------------
--- draft-schlitt-spf-classic-01pre1.xml         Fri Mar 04 22:38:02 2005
+++ draft-schlitt-spf-classic-01pre1+mehnle.xml  Sat Mar 05 01:58:39 2005
@@ -1218,5 +1232,5 @@
       </t>
       <t>
-        Unrecognized modifiers SHOULD be ignored no matter where in
+        Unrecognized modifiers MUST be ignored no matter where in
         a record, nor how often.  This allows implementations of
         this document to gracefully handle records with modifiers that are
#
# What would be the effect of having "SHOULD" here instead of "MUST"?
# What happens if unrecognized modifiers are not ignored?  Should
# implementations really be allowed to die on unrecognized modifiers?
#
-------------------------------------------------------------------------

Your first reaction was:

As stated in the quoted paragraph, the purpose of the "SHOULD" is to
allow for implementations to define semantics for modifiers that are
not defined in this spec.  If this was changed to "MUST", then that
would rule out the creation of things like "ses=" and "accred=".

rejected.


For the record, this was my reply:
| Then perhaps the desired semantics need to be expressed more clearly, but
| in any event, the current text does not accurately describe the desired
| semantics.  It falsely implies that receivers MAY die on unrecognized
| modifiers.

Not that I had a problem with the fact that you changed your mind and 
applied the change, but why did you?


Because Frank complained about the same thing and I gave up fighting
the point.
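
For illustration only, the "ignore unrecognized modifiers" behavior
discussed above could be sketched as follows.  This is a minimal sketch,
not taken from any SPF implementation: the function name, the set of
known modifiers, and the simplified name=value pattern are assumptions
made for the example.

```python
import re

# Modifiers this hypothetical implementation understands (assumption).
KNOWN_MODIFIERS = {"redirect", "exp"}

# Simplified pattern for name "=" value; the name must start with a letter.
MODIFIER_RE = re.compile(r"^([A-Za-z][A-Za-z0-9._-]*)=(.*)$")


def split_terms(record):
    """Split a record into directives, known modifiers, and ignored modifiers."""
    directives, known, ignored = [], {}, []
    for term in record.split()[1:]:  # skip the leading version term
        match = MODIFIER_RE.match(term)
        if match is None:
            directives.append(term)
        elif match.group(1).lower() in KNOWN_MODIFIERS:
            known[match.group(1).lower()] = match.group(2)
        else:
            # Unrecognized modifier: skip it, wherever and however often it occurs.
            ignored.append(term)
    return directives, known, ignored
```

With a record such as "v=spf1 mx ses=foo -all accred=bar", the two
unknown modifiers end up in the ignored list and evaluation of the
directives is unaffected.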

=========================================================================

Based on several reviewers' comments (Julian, Frank, and indirectly,
Andy), the most controversial item left appears to be what to do when
a non-existent domain is encountered.  Part of the problem is that
this subject has been controversial for a very long time and various
previous SPF specifications have said to do different things.
Moreover, different SPF implementations have done different things and
different versions of the same SPF implementation have done different
things.

Currently, in spf-classic-01pre4, non-existent and malformed domains
used to fetch SPF records are specified to cause a result of "none"
(Section 4.3), except in the case of the include: mechanism, in which
case it is specified to cause a result of PermError (Section 5.2).

In draft-mengwong-spf-01, a domain that does not exist must return
"unknown" (now called PermError) in section 2.2.2, and for the
include:, it also says to return "unknown".

In draft-mengwong-spf-00, a domain that does not exist can return
either "none" or "fail" (section 2.2.2), and for the include:, it must
return "unknown" (now called PermError) in section 4.2.
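
The three behaviors summarized above can be put side by side as a small
lookup table.  This is only a restatement of the summary in this
message; the dictionary layout and the helper name are made up for the
example.

```python
# Results each draft prescribes, as summarized above: one entry for the
# checked domain itself and one for the target of an include: mechanism.
NONEXISTENT_DOMAIN_RESULTS = {
    "draft-mengwong-spf-00": {"domain": {"none", "fail"}, "include": {"unknown"}},
    "draft-mengwong-spf-01": {"domain": {"unknown"}, "include": {"unknown"}},
    "spf-classic-01pre4": {"domain": {"none"}, "include": {"PermError"}},
}


def allowed_results(spec, context):
    """Return the allowed results, with "unknown" mapped to its newer name."""
    raw = NONEXISTENT_DOMAIN_RESULTS[spec][context]
    return {"PermError" if result == "unknown" else result for result in raw}
```

Seen this way, all three drafts agree on the include: case once
"unknown" is read as PermError, and differ only on the checked domain
itself.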

I think that the current spf-classic-01pre4 makes the best choice among
the available options, which are incompatible and inconsistent.  While
this is one area where I'm most open to changes, I think it is important not to
just give an off-the-top-of-my-head answer.  Please investigate
previous discussions on this subject, previous SPF specs, and
available SPF implementations.


Wayne, thanks a lot for researching the issue in this detail.

IMO, the draft-mengwong-spf-01 behavior, i.e. SPF(non-existent-domain) == 
"PermError", makes the most sense.  This was my suggestion, based on 
-01pre1:

-------------------------------------------------------------------------
--- draft-schlitt-spf-classic-01pre1.xml         Fri Mar 04 22:38:02 2005
+++ draft-schlitt-spf-classic-01pre1+mehnle.xml  Sat Mar 05 01:58:39 2005
@@ -255,4 +260,18 @@
         </t>
         <t>
+          SPF checks MUST NOT be applied to invalid identities, that
+          is, domains that do not exist.  The result of such a check
+          is an error.  Many mail receivers already discriminate
+          against messages that use identities with non-existent
+          domains, and while there are no strict requirements for the
+          identity used in the "HELO" SMTP command,
+          <xref target="rfc2821"/>, Section 3.6, requires the identity
+          used in the "EHLO" command to either be a valid host name or
+          an IP address literal.  So even if a mail receiver would not
+          normally validate the "HELO" identity, it must do so before
+          subjecting it to an SPF check, or expect the check to result
+          in an error.
+        </t>
+        <t>
           It is possible that mail receivers will use the SPF check as
           part of a larger set of tests on incoming mail.  The results
#
# Do not allow SPF to be applied to non-existent domains.
# This effectively redefines the result of SPF(non-existent-domain)
# from "None" to "PermError".  Also see below.
#
@@ -288,14 +307,4 @@
         </t>
         <t>
-          Note that the &lt;domain&gt; argument may not be a well
-          formed domain name.  For example, if the reverse-path was
-          null, then the EHLO or HELO domain is used.  In a valid SMTP
-          session, this can be an address literal or entirely
-          malformed.  In these cases, check_host() is defined in <xref
-          target="initial" /> to return a "None" result.
-          <!-- FIXME: should this be "None"? lots of conflict between -->
-          <!-- PermError in the various specs and implementations -->  
-        </t>
-        <t>
           Care must be taken to correctly extract the &lt;domain&gt;
           from the &lt;sender&gt; as many MTAs will still accept such
#
# Paragraph got replaced by the above.  Also see below.
#
@@ -661,5 +670,5 @@
           If the &lt;domain&gt; is malformed or is not a fully
           qualified domain name, check_host() immediately returns the
-          result "None".
+          result "PermError".
         </t>
         <t>
#
# Do not allow SPF to be applied to non-existent domains.
# This effectively redefines the result of SPF(non-existent-domain)
# from "None" to "PermError".  Also see above and below.
#
@@ -681,4 +690,9 @@
           "TempError"
         </t>
+        <t>
+          If the DNS lookup returns a Name Error (RCODE 3), the
+          &lt;domain&gt; does not exist, and check_host() exits
+          immediately with the result "PermError".
+        </t>
       </section>
       <section title="Selecting Records" anchor="version">
#
# Do not allow SPF to be applied to non-existent domains.
# This effectively redefines the result of SPF(non-existent-domain)
# from "None" to "PermError".  Also see above.
#
-------------------------------------------------------------------------

You rejected the change with an explanation (basically the same you gave 
above), plus you said:

Also, remember that lots of email from legitimate sources use invalid
HELO domains, and that trying to mandate the rejection of invalid
domains will not be practical.


Such a thing is _not_ implied by my change.  Perhaps I should have worded
it a _bit_ differently.  What I mean is that receivers should not apply
SPF to domains that are (or easily can be) known not to exist.

Receivers may do whatever they want with messages that use such obviously
non-existent ("invalid") identities: regularly accept them, accept but
mark them, reject them, delete them, whatever -- my change does not make
any statements on that.  But receivers should not apply SPF to such
invalid identities, and SPF implementations should not treat such invalid
identities as if they were valid (and simply lacking an SPF record).

Also, the proposed semantics are consistent with what section 3.1.5 says:

| 3.1.5  Wildcard Records
|
|    [...]
|
|    Use of wildcards is discouraged in general as they cause every name
|    under the domain to exist and queries against arbitrary names will
|    never return RCODE 3 (Name Error).

Instead, I've added some language that stresses that, outside of SPF,
MTAs should reject email from invalid domains.


You probably mean this addition of yours:

|    While invalid, malformed, or non-existent domains cause SPF checks
|    to return "none" because no SPF record can be found, it has long
|    been the policy of many MTAs to reject email from such domains.  It
|    is RECOMMENDED that email from invalid domains be rejected, in order
|    to prevent the circumvention of SPF records.

While this recommendation, if adhered to, has roughly the same effect as
my proposed change (even though the semantics are obviously not
identical), it implies recommending to reject on invalid HELO, not just
on invalid EHLO, and while I _personally_ agree with that (this is what I
as an admin would do), I don't think this is generally a good
recommendation from a standards engineering standpoint.

Formally, your approach is outside the scope of this specification.  It
might even be considered to require us to write "Updates: 2821" in the 
header of our upcoming SPF "Internet Draft".

I'd like to add:  It is clear that the primary objective of the upcoming 
Internet Draft should be to document existing practice, and that ensuring 
perfectly sensible behavior is secondary to that.  But here we have the 
problem that there is no consistent existing practice, so the question 
arises how to "best" codify it.

So, what implications would changing the result of SPF(non-existent-domain) 
from ("None"|"Fail") to "PermError" have?

Compared to the earlier original result of "Fail", "PermError" would be no 
significant change as receiver policies most probably don't differ much 
between "Fail" and "PermError".

Compared to the result of "None", "PermError" would be stricter, probably 
implying a rejection where one wouldn't have happened otherwise.

But considering that, like you said, most receivers probably reject 
messages with non-existent sender domains right away anyway, without 
subjecting them to SPF checks, I think defining SPF(non-existent-domain) 
== "PermError" is unproblematic and is logically the right thing to do.

People just shouldn't expect SPF to work on invalid input data.
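
The two candidate behaviors for the initial lookup differ only in what
RCODE 3 maps to, which a short sketch makes concrete.  The function and
parameter names are hypothetical, result names are written in lower
case, and the handling of other non-zero RCODEs as "temperror" is an
assumption based on the "TempError" context visible in the diff above.

```python
NOERROR = 0   # DNS RCODE 0
NXDOMAIN = 3  # DNS RCODE 3 (Name Error): the domain does not exist


def initial_result(rcode, nxdomain_result):
    """Return an early check_host() result, or None to go on to record selection.

    nxdomain_result is "none" for the -01pre4 wording and "permerror"
    for the proposed change.
    """
    if rcode == NOERROR:
        return None  # lookup succeeded; continue with record selection
    if rcode == NXDOMAIN:
        return nxdomain_result
    return "temperror"  # assumption: every other RCODE is a temporary error
```

Calling initial_result(NXDOMAIN, "none") models the current draft, and
initial_result(NXDOMAIN, "permerror") models the proposal; every other
input behaves the same under both.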

=========================================================================


I'm interested in hearing what others have to say on this issue.
Right now, I'm not convinced that the draft should be changed.  Please
feel free to bring this issue up for a vote on the council.

-------------------------------------------------------------------------
--- draft-schlitt-spf-classic-01pre1.xml         Fri Mar 04 22:38:02 2005
+++ draft-schlitt-spf-classic-01pre1+mehnle.xml  Sat Mar 05 01:58:39 2005
@@ -1299,5 +1313,5 @@
         <t>
           If &lt;domain-spec&gt; is empty, or there are any processing
-          errors (any RCODE other than 0), or if no records are
+          errors (any RCODE other than 0 or 3), or if no records are
           returned, or if more than one record is returned, then
           proceed as if no exp modifier was given.
#
# Fix semantics.  Note that this is essentially unrelated to the
# "SPF(non-existent-domain) == PermError" series of changes.
#
-------------------------------------------------------------------------

You rejected it.  Could you please elaborate why?


A DNS RCODE of 3 means the domain doesn't exist.  Why shouldn't we go
ahead and use the default exp processing?  What are we supposed to do
with it?
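
One way to read the two wordings of the exp paragraph is sketched below.
This is an interpretation for illustration, not text from the draft: the
function name and the flag are invented, and the sketch assumes that a
lookup answered with RCODE 3 returns no records.

```python
def explanation_record(domain_spec, rcode, records, rcode3_is_error=True):
    """Return the single exp TXT record to use, or None for the default.

    rcode3_is_error=True models "any RCODE other than 0";
    rcode3_is_error=False models "any RCODE other than 0 or 3".
    """
    if not domain_spec:
        return None  # empty <domain-spec>
    acceptable = (0,) if rcode3_is_error else (0, 3)
    if rcode not in acceptable:
        return None  # processing error
    if len(records) != 1:
        return None  # no records, or more than one record
    return records[0]
```

Under this reading, a lookup that ends in RCODE 3 falls back to the
default explanation with either wording, because it also returns no
records.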

=========================================================================

Also, I suggested renaming "prefix" to "sign" in the grammar:

-------------------------------------------------------------------------
--- draft-schlitt-spf-classic-01pre1.xml         Fri Mar 04 22:38:02 2005
+++ draft-schlitt-spf-classic-01pre1+mehnle.xml  Sat Mar 05 01:58:39 2005
@@ -744,6 +758,6 @@
 terms            = *( 1*SP ( directive / modifier ) )
 
-directive        = [ prefix ] mechanism
-prefix           = "+" / "-" / "?" / "~"
+directive        = [ sign ] mechanism
+sign             = "+" / "-" / "?" / "~"
 mechanism        = ( all / include
                    / A / MX / PTR / IP4 / IP6 / exists )
#
# Is "prefix" a highly traditional term?  If not, I'd strongly
# prefer the term "sign", which is more descriptive.
#
[...more related diff hunks...]
-------------------------------------------------------------------------

You rejected the change:

Prefix is a fairly traditional term.  I really don't think "sign"
makes much sense, because neither "?" nor "~" are signs.


Not in a one-dimensional continuum, agreed.  But there's more than just a
single "pass-fail" dimension in SPF results.

In any case, "prefix" is not very descriptive.  It describes the
"prefix"'s syntactical function but not its semantical one.  This is bad
style in a formal grammar.


I disagree.  Feel free to bring this up for a council vote.  I'm
interested in hearing other people's opinions on this though.
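
Whatever the rule ends up being called, its role in the grammar is
small, as a sketch of splitting a directive shows.  The names here are
invented for the example; the result names and the default of "+" when
the character is omitted follow the SPF drafts as generally understood
and are not stated in this thread.

```python
# The four characters allowed by the grammar rule under discussion,
# mapped to the result each one selects when its mechanism matches.
QUALIFIERS = {"+": "Pass", "-": "Fail", "?": "Neutral", "~": "SoftFail"}


def parse_directive(term):
    """Split a directive into (qualifier character, mechanism text)."""
    if term and term[0] in QUALIFIERS:
        return term[0], term[1:]
    return "+", term  # assumption: an omitted qualifier means "+"
```

For example, parse_directive("~all") yields ("~", "all"), and
parse_directive("mx") yields ("+", "mx").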

=========================================================================

In -01pre2, you added:

| 8.1  Macro definitions
|
|    [...]
|
|    Note: Care must be taken so that macro expansion for legitimate
|    e-mail does not exceed the 63 character limit on DNS labels.  The
|    localpart of e-mail addresses, in particular, can have more than 63
|    characters between dots.

Good.  But this does not point out the implications for trying to use SRS
(i.e. potentially >=64 chars) localparts in "exists" mechanisms.  With the
63 character limit, the corresponding option 1.ii suggested in section 9.3 is
of questionable use for deployment purposes.  I think a warning statement
should be included there.  Perhaps:

-------------------------------------------------------------------------
--- draft-schlitt-spf-classic-01pre2.xml         Sat Mar 05 16:56:03 2005
+++ draft-schlitt-spf-classic-01pre2+mehnle.xml  Sat Mar 05 17:52:42 2005
@@ -1872,4 +1872,9 @@
                   While this requires an extra DNS lookup, this only
                   happens when the e-mail would otherwise be rejected.
+                  Note that due to the 63 character limit for domain
+                  labels, this approach only works reliably if the
+                  localpart signature scheme is guaranteed either to only
+                  produce localparts with a maximum of 63 characters or
+                  to gracefully handle truncated localparts.
                 </t>
                 <t>
-------------------------------------------------------------------------


Like MarkL, I don't think the RFC should be a "How-To guide".  The
stuff in section 9 is non-normative and is intended to give very brief
suggestions about how to solve certain problems that SPF adoption will
cause.

I'm not convinced that this information is useful enough to justify
being in the spec.  Feel free to bring it up for a vote on the
council.
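
For reference, the 63 character label limit at issue here is easy to
check mechanically.  The sketch below is illustrative only: the helper
name is made up, and the names passed to it are assumed to be already
macro-expanded (no macro expansion is performed).

```python
MAX_LABEL_LENGTH = 63  # DNS limit on the length of a single label


def overlong_labels(name):
    """Return the labels of a dotted name that exceed the DNS label limit."""
    labels = name.rstrip(".").split(".")
    return [label for label in labels if len(label) > MAX_LABEL_LENGTH]
```

A 64 character localpart with no dots, expanded in front of
"._spf.example.com", yields one overlong label, while a 63 character
one yields none.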


-wayne