spf-discuss

Re: Mask Syntax

2005-04-01 10:38:10
David MacQuigg wrote:
At 10:57 PM 3/31/2005 -0500, Radu wrote:

David MacQuigg wrote:

Records that are already compiled to a simple one-query list will be processed normally. Records that require multiple queries will not be evaluated at all and will result in a temporary reject. Adding the m= syntax allows us to reach a conclusion in *most* of those multi-query cases, even without the additional queries. In a DoS storm, this could be *almost all* cases.


Actually... 99.98% of records that require multiple queries will be evaluated correctly after only 1 query, since in a DoS storm the email comes from everywhere but the few places that the mask specifies. So the incoming MTAs don't need to change behaviour, as they will only do the 2nd query for those 0.02% of emails that actually come from the neighbourhood of the mask.


0.02% may be a bit optimistic. :>) I think attackers will search the planet for domains that have their authorized senders scattered widely among a bunch of dynamic IP blocks they can hijack. We may still need an ability to block even the initial SYN floods from those blocks. This is outside the scope of SPF, however.

Well, in another week or so, my patch to MyDNS will be complete, and I'll start it up in spfCache mode and ask it about all the domains for which I publish SPF records. Then we'll have a look at the stats to see how the current real-world records fare, and how efficient the compiled masks are.

Since I got involved in this project, I think I've developed a liking for statistics. ;)

              --- Mask Syntax ---

Here is an example of what we could do with a huge domain like rr.com:
m=~24(181ecb,181cc8,181ccc,181eda,185d2f,181909,
411805,185ea6,181d6d,424ba2,181802,412005) ... -all
Existing checkers ignore the m= and process the ... with all the usual redirects and includes. Upgraded checkers will be able to take advantage of the new syntax. The ~ means the mask result is ANDed with the remaining ... result. If it fails to match, you are done.
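
As a rough illustration of what an upgraded checker might do with such a term, here is a minimal Python sketch. It assumes the hypothetical syntax shown above: one operator character after m=, a prefix length, then a parenthesised list of hex strings that each hold the leading bits of a network. The function names parse_hex_mask and mask_matches are made up for this example, and the sketch assumes the prefix length is a multiple of 4 so the hex digits line up.

```python
import ipaddress

def parse_hex_mask(term):
    """Parse a hypothetical 'm=~24(181ecb,...)' term into (operator, networks)."""
    body = term[len("m="):]
    operator, rest = body[0], body[1:]
    prefix_text, _, hex_list = rest.partition("(")
    prefix_len = int(prefix_text)
    networks = []
    for item in hex_list.rstrip(")").split(","):
        item = item.strip()  # tolerate line breaks inside the list
        # Each entry holds only the leading prefix_len bits; pad to 32 bits.
        value = int(item, 16) << (32 - prefix_len)
        networks.append(ipaddress.IPv4Network((value, prefix_len)))
    return operator, networks

def mask_matches(sender_ip, networks):
    """True if the sender's address falls inside any network of the mask."""
    ip = ipaddress.ip_address(sender_ip)
    return any(ip in net for net in networks)

operator, nets = parse_hex_mask("m=~24(181ecb,181cc8,411805)")
print(operator, [str(n) for n in nets])
print(mask_matches("65.24.5.17", nets), mask_matches("10.0.0.1", nets))
```

With this reading, 181ecb decodes to 24.30.203.0/24 and 411805 to 65.24.5.0/24, which agrees with the dotted forms used elsewhere in this message.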


Not quite so. The -all and m=~ are in conflict. The mask result will be the same as the "all" prefix.

Also, the mask is evaluated before the first redirect=, no sooner. So the IP4:'s in the first record are checked against the sender's IP *WITHOUT LOOKING AT THE MASK*. This is because the mask refers *ONLY* to IP4: mechanisms which are not in the current record. It would be a waste of space for the mask to also cover the mechanisms which are clearly visible in the first record.
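
The evaluation order described above can be sketched as follows. This is only an illustration under stated assumptions: the function name evaluate, the follow_redirect callback standing in for the second DNS query, and the miss_result parameter are all hypothetical, and the result returned on a mask miss is left configurable because the thread has not settled it.

```python
import ipaddress

def evaluate(sender_ip, local_nets, mask_nets, follow_redirect, miss_result="fail"):
    """Sketch of the order: local ip4: first, then the mask, then redirect=."""
    ip = ipaddress.ip_address(sender_ip)

    # Step 1: ip4: mechanisms visible in the first record are checked
    # directly, without looking at the mask.
    if any(ip in ipaddress.ip_network(net) for net in local_nets):
        return "pass"

    # Step 2: the mask covers only ip4: mechanisms that live in later
    # records, so it is consulted just before the first redirect=.
    if mask_nets and not any(ip in ipaddress.ip_network(net) for net in mask_nets):
        return miss_result  # assumption: no further query is needed on a miss

    # Step 3: only senders inside the mask's neighbourhood cost a second query.
    return follow_redirect(sender_ip)
```

A caller would pass the networks from the first record as local_nets and the decoded mask as mask_nets; the second query then runs only for senders the mask cannot rule out.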

Also, I would prefer sticking to a mask representation that is easily human readable, so you can look at a record *set* and be able to easily tell if the mask is right or wrong, just by visual inspection and some simple arithmetic. If you can't easily tell, the trust in the mask is undermined (by humans). So when there are problems with the record, the first thing they do is disable mask generation.


Here are some suggestions on mask syntax. I think we should separate the goals of readability and versatility in the source record from compactness and efficiency in the compiled record. This will allow better optimization on both ends. If we really don't want users messing with the compiled record, we should not tempt them by making it easily editable. If you need to see what's in a compiled record, view it in the wizard.

Absolutely correct!

I mentioned this in the m= rules of engagement:

   2. If compiled with cron, or once in a while, -flatten should
   not be used. There will be left-over mechanisms whose
   resulting IP address may change (administrative gap)
   Thus, masks MUST not be added, since while they work initially
   they would break the record when the ISP changes their
   infrastructure.
   3. If compiled with cron, and -flatten not used, but the record
   compiles into a list of IPs anyway (ie, you list no mechanisms
   that lie outside your administrative boundary), then it may
   include masks if useful.

But perhaps it would be even safer if the stand-alone spfcompiler did not allow mask generation, just because of this opportunity to mess with the record. It is tempting to add one more ip, or one more A record at the end of a record chain, if one is allowed to cut and paste it into a zone file.

Instead, libspf2 will include mask functionality, but only the DNS servers that implement compilers should actually use it. I am assuming that such a server will automatically overwrite the compiled record periodically as it falls out of sync with the source SPF record. My patch to MyDNS does this. Also, the user interface makes the compiled records read only.

So I think it would be best if wizards that allow the user to cut and paste a compiled record should stick to a plain compiled record without masks, or otherwise convince the user not to mess with the compiled record. In that case, the more cryptic the compiled record, the better.

Another goal should be to facilitate later migration to a syntax that needs nothing but a mask. Currently, we must make masks ignorable, but when I think of the final standard authentication method that may emerge from a Grand Unification by the IETF, I think of everything needed for authentication in one DNS packet. This would most likely include an IP mask for quick screening at the earliest opportunity, a public key for final validation at the destination, and links to any other authentication records that may be necessary. Compactness of the mask will be an issue, unless they upgrade DNS to use larger UDP packets.

I don't see how it may be possible for a closed-ended representation to represent open-ended information. This goes back to random numbers and chaos theory. Truly random numbers are not compressible because they contain the maximum possible entropy.

So if a domain really does use 5000 outgoing servers, and the IPs are scattered all over, I don't see how all that info can fit in 512 bytes. The optimizer is essentially a specialized lossless compressor. So there will be a limit past which the input can not be compressed any further without losing information.
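
To put rough numbers on that limit, here is a small back-of-the-envelope sketch in Python. It assumes the worst case described above, where the addresses share no common prefix and each must be stored in full; the function name encoded_sizes and the three encodings compared are illustrative choices, and record overhead such as "v=spf1" is ignored.

```python
import math

DNS_UDP_LIMIT = 512  # bytes, the classic DNS-over-UDP message limit

def encoded_sizes(n_entries, prefix_bits=32):
    """Size in bytes of n_entries prefixes under three simple encodings."""
    raw = math.ceil(n_entries * prefix_bits / 8)       # packed binary
    hex_digits = math.ceil(prefix_bits / 4)
    hex_list = n_entries * hex_digits + (n_entries - 1)  # comma-separated hex
    base64_len = 4 * math.ceil(raw / 3)                # base64 of the packed form
    return {"raw": raw, "hex_list": hex_list, "base64": base64_len}

sizes = encoded_sizes(5000)
print(sizes)
print({name: size <= DNS_UDP_LIMIT for name, size in sizes.items()})
```

For 5000 scattered single addresses the packed form alone is 20000 bytes, roughly 39 times the 512-byte limit, before any text encoding is applied.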

Here is a sequence of compactions for the SPF record of a typical large domain, starting with the source record. The ... is the current SPF syntax, and is necessary as long as there are legacy SPF checkers that ignore masks. The operator after the m= could be any visible character. In my hypothetical syntax:
~ means this is a supplemental mask; the original syntax is in the ...
+ means this mask is the whole enchilada; you don't even need a -all.

v=spf1 ip4:24.30.203.0/24 ip4:24.28.200.0/24 ip4:24.28.204.0/24
ip4:24.30.218.0/24 ip4:24.93.47.0/24 ip4:24.25.9.0/24
ip4:65.24.5.0/24 ip4:24.94.166.0/24 ip4:24.29.109.0/24
ip4:66.75.162.0/24 ip4:24.24.2.0/24 ip4:65.32.5.0/24 ... -all

v=spf1 m=~24.30.203/24,24.28.200/24,24.28.204/24,24.30.218/24,
24.93.47/24,24.25.9/24,65.24.5/24,24.94.166/24,24.29.109/24,
66.75.162/24,24.24.2/24,65.32.5/24 ... -all

v=spf1 m=~24(24.30.203,24.28.200,24.28.204,24.30.218,24.93.47,
24.25.9,65.24.5,24.94.166,24.29.109,66.75.162,24.24.2,65.32.5)
 ... -all

v=spf1 m=+24(181ecb,181cc8,181ccc,181eda,185d2f,181909,411805,
185ea6,181d6d,424ba2,181802,412005)

v=spf1 m=+24(Kapi2RPMcR1CxEJdXOkLCFECMQDTO0fzuShRvL8q0m5sitIH)

v=spf1 m=+24,12,a2       |<       ---  36 bytes ---        >|
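
The first few steps of this sequence can be reproduced mechanically. The sketch below, in Python, converts the twelve /24 networks from the source record into the hex form and then packs them into bytes. The 36-byte figure on the last line is consistent with 12 prefixes of 3 bytes each, and base64 of 36 bytes is 48 characters long, the same length as the string in the example. The exact binary layout and the particular base64 string are not specified in this thread, so they are not reproduced here; the helper name hex_prefixes is hypothetical.

```python
import base64
import ipaddress

SOURCE_CIDRS = [
    "24.30.203.0/24", "24.28.200.0/24", "24.28.204.0/24", "24.30.218.0/24",
    "24.93.47.0/24", "24.25.9.0/24", "65.24.5.0/24", "24.94.166.0/24",
    "24.29.109.0/24", "66.75.162.0/24", "24.24.2.0/24", "65.32.5.0/24",
]

def hex_prefixes(cidrs):
    """Return the leading bits of each network as a fixed-width hex string."""
    result = []
    for cidr in cidrs:
        net = ipaddress.ip_network(cidr)
        width = net.prefixlen // 4  # assumes the prefix length is a multiple of 4
        value = int(net.network_address) >> (32 - net.prefixlen)
        result.append(f"{value:0{width}x}")
    return result

hexes = hex_prefixes(SOURCE_CIDRS)
hex_term = "m=+24(" + ",".join(hexes) + ")"
packed = bytes.fromhex("".join(hexes))            # 12 prefixes x 3 bytes
b64 = base64.b64encode(packed).decode("ascii")

print(hex_term)
print(len(packed), len(b64))  # prints: 36 48
```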

I appreciate the amount of work you put into presenting this, and I'm sorry to object, but when there is an "all", the redirect= modifier will be ignored, so you can't have daisy chaining.

I'm not sure I understand what you are proposing. Are these records daisy-chained with redirect=, or are they equivalents of each other?

If you're suggesting that SPF become a binary blob instead of a text string, then it would no longer be compatible with a TXT record. Or at least some resolver libraries might choke on binary data in a TXT record. Some of those resolver libraries live in routers and other appliances that have a long life cycle, and will not be upgraded for SPF. Anyway, it would be a good idea to check with RFC 1035.

Also, I think there's a bit of wisdom that we should not ignore.
Many protocols use human readable commands and terms, so that it's easy to see what the machines are doing. The SMTP conversation is a good example of that. The telnet conversation, likewise.

Since a very long SPF record that spans multiple TXT records is similar to a scripted conversation, I think it may be wise to keep that conversation as human readable as possible. I agree that it should be as compact as possible, but I think readability should not be sacrificed. Efficiency will be a by-product of those two requirements (readability 1st and compactness 2nd).


We don't need to specify all this now. Just make sure there is plenty of flexibility in whatever we deploy early, so we don't have to deprecate stuff later. I think the m=<operator>... general pattern should work for anything I can anticipate.

Sure. We will need to think a bit about how the deprecation might evolve, as we need to specify rules of evaluation for m= now. For instance, m= must be evaluated before redirect=. I don't think there will ever be a redirect= and an "all" in the same record based on the current rules of SPF evaluation.

That kind of change will definitely require the version number to be changed, as it will make new and old implementations incompatible.

But if you see a migration path that uses the mask to get to the 1-query goal, let's look at it now, before we cast in stone the rules of m=

Radu.






