spf-discuss

Updating SPF type99 and TXT RR's: Simultaneity is not guaranteed.

2005-08-09 22:59:45
Section 4.5. of the spec, "Selecting Records" says that if you're
querying for both SPF and TXT record types that:

|   2.  If there are both SPF and TXT records in the set and if
|       they are not all identical, return a "PermError".
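For concreteness, here is a minimal sketch of that one step in Python.
The function name and the reading of "not all identical" as "the two
sets of record strings differ" are my assumptions, not wording from the
spec, and the other steps of section 4.5 are not modelled.

```python
def check_record_sets(spf_records, txt_records):
    """Hypothetical helper: apply only the quoted step 2.

    Both arguments are lists of record strings. Returns "PermError"
    when both types are present and the sets differ, otherwise None
    (meaning this step raises no objection).
    """
    if spf_records and txt_records and set(spf_records) != set(txt_records):
        return "PermError"
    return None


# Matching records pass this step; differing ones do not.
same = check_record_sets(["v=spf1 mx -all"], ["v=spf1 mx -all"])
differ = check_record_sets(["v=spf1 mx -all"], ["v=spf1 a -all"])
```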

This leads to a problem.

Problem:
--------

Imagine the following situation, starting with the following
assumptions:

1.   sender.example.com has matching spf records of type
     TXT and SPF.
2.   txt-only.example.com understands TXT but not SPF RR
     types.
3.   txt-and-spf.example.com understands both TXT and SPF
     RR types.
4.   both txt-only.example.com and txt-and-spf.example.com
     use isp-nameserver.example.com as their nameserver.

Then suppose the following happens:

5.   user@sender.example.com sends an email to
     friend_one@txt-only.example.com

6.   txt-only.example.com requests sender.example.com's
     TXT-type spf record from isp-nameserver.example.com
 
     The nameserver looks up that record, provides it, and
     caches it.

7.   txt-only.example.com checks the email transaction against
     this spf record and the check passes.

8.   The owner of the sender.example.com domain then updates
     the spf records for that domain, changing both the TXT
     and SPF resource records at the same time.

9.   user@sender.example.com then sends an email to
     friend_two@txt-and-spf.example.com.

10.  txt-and-spf.example.com then queries both types of SPF 
     records from isp-nameserver.example.com.

     The nameserver has the TXT record cached, so it
     returns that without any more querying.  However, it
     doesn't have the SPF records in cache, so it does
     query for that record.

     So the nameserver ends up returning a pair of records
     that don't match.

11.  txt-and-spf.example.com sees that the two records don't
     match, causing the spf test on this mail transaction
     to result in a permerror.
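The sequence above can be reproduced with a toy caching resolver.  This
is only a sketch under assumed values:  the class, the record contents,
the 3600-second ttl and the timestamps are all made up for illustration,
and a real nameserver is far more involved.

```python
class CachingResolver:
    """Toy stand-in for isp-nameserver.example.com (hypothetical)."""

    def __init__(self, authoritative):
        # authoritative: {(name, rtype): (text, ttl_seconds)}
        self.authoritative = authoritative
        self.cache = {}  # {(name, rtype): (text, expires_at)}

    def query(self, name, rtype, now):
        key = (name, rtype)
        cached = self.cache.get(key)
        if cached and cached[1] > now:
            return cached[0]  # still fresh: answer from cache
        text, ttl = self.authoritative[key]
        self.cache[key] = (text, now + ttl)
        return text


def simulate():
    name = "sender.example.com"
    old = "v=spf1 ip4:192.0.2.1 -all"
    new = "v=spf1 ip4:192.0.2.2 -all"
    zone = {(name, "TXT"): (old, 3600), (name, "SPF"): (old, 3600)}
    resolver = CachingResolver(zone)

    # Step 6: the TXT-only receiver causes the TXT record to be cached.
    resolver.query(name, "TXT", now=0)

    # Step 8: the domain owner updates both records together.
    zone[(name, "TXT")] = (new, 3600)
    zone[(name, "SPF")] = (new, 3600)

    # Step 10: the second receiver asks for both types a minute later.
    during = (resolver.query(name, "TXT", now=60),
              resolver.query(name, "SPF", now=60))

    # Once the cached TXT record's ttl has run out, the pair agrees again.
    after = (resolver.query(name, "TXT", now=3600),
             resolver.query(name, "SPF", now=3600))
    return during, after
```

Here `during` holds the stale TXT text next to the fresh SPF text, while
`after` holds two identical strings.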

So basically, even when a domain owner updates both the SPF type99 and
SPF TXT records simultaneously, the mailservers of the recipients may
see different records for a time (until the cached records' ttls run
out).

That's a situation that will resolve itself on its own, hence a
temperror is more appropriate than a permerror.

But if the difference weren't caused by the impossibility of making an
update appear simultaneous to every receiver's mx machine in the world,
and so wouldn't go away after the ttls expire, then a permerror would be
more appropriate.

But it's not possible to programmatically know whether the problem
really will correct itself.  Nor is it possible for a receiver using
type99 and txt records to ever prevent this problem from occurring on an
update to the pair.

I don't know what the best solution would be.

One answer:  temperror
----------------------

One answer would be to give a temperror specifying that the sending
machine isn't to retry the mail transaction for at least the minimum
time remaining on either record.

That would solve the problem, in a
technically-correct-but-probably-completely-impractical way:  It would
mean that every time someone updated the records, outgoing mail
could be on hold for the lesser ttl, which could be weeks!

Any documentation about publishing spf records would have to warn domain
owners of the need to lower the ttls of TXT and SPF records before a
change, similar to what's done when A records are changed.

Hardly anyone will remember to do that.
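To make the retry rule concrete, here is a rough sketch.  It assumes the
receiver can see the remaining ttl of each answer it got;  the function
name and the tuple layout are hypothetical.

```python
def evaluate_mismatch(spf, txt):
    """Hypothetical helper for the temperror proposal.

    spf and txt are (record_text, remaining_ttl_seconds) tuples, or
    None when that type was not returned.  On a mismatch, return
    "TempError" plus the minimum number of seconds the sender should
    wait before retrying, i.e. the lesser remaining ttl.
    """
    if spf and txt and spf[0] != txt[0]:
        return ("TempError", min(spf[1], txt[1]))
    return ("OK", 0)


# A stale TXT record with one week left holds mail for a full week,
# even though the fresh SPF record has only an hour more than that.
week = 7 * 24 * 3600
result = evaluate_mismatch(("v=spf1 a -all", week + 3600),
                           ("v=spf1 mx -all", week))
```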

Another answer:  None
---------------------

Another option is to return a None instead of a permerror.

This keeps people like me who are in the permerrors-should-be-rejected
camp from rejecting this message.

(To be fair--ttls can cause permerrors to go on for a long time too, so
the problem with temperror exists in my reject-if-they-made-a-typo
response too.)

Third answer:  SPF-type99 has priority over TXT
-----------------------------------------------

Another option is for spf implementations to not be required to do the
comparison, perhaps simply giving SPF rr results priority over TXT rr
results.  (But perhaps allow them to send an informational DSN).
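A sketch of what that selection could look like, with no comparison at
all.  The function name is hypothetical, and whether to also emit an
informational DSN on a mismatch is left out.

```python
def select_records(spf_records, txt_records):
    """Hypothetical helper: prefer SPF-type99 answers over TXT answers.

    Both arguments are lists of record strings.  If any type99 records
    came back they are used as-is;  TXT is only a fallback.
    """
    if spf_records:
        return spf_records
    return txt_records


# A stale cached TXT record no longer matters once type99 is present.
chosen = select_records(["v=spf1 ip4:192.0.2.2 -all"],
                        ["v=spf1 ip4:192.0.2.1 -all"])
```

With this rule, the receiver in the scenario above would simply use the
freshly fetched type99 record.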

Conclusion:
-----------

I'm not sure what the best answer is (though I'm obviously leaning
towards giving type99 priority over txt records).

However, the current situation where permerror is required if you query
for both records at the same time seems problematic.

(As a side note, there's another related, but far less serious issue:
If a domain publishes identical txt and spf rr's that use "include:" to
reference a domain that only publishes an SPF-type99 record, then
recipients who check TXT-only will return permerrors, while recipients
who check either both types or even just type99 will have no problems.)

-- 
Mark Shewmaker
mark(_at_)primefactor(_dot_)com