Re: Mask Syntax

David MacQuigg wrote:

At 10:57 PM 3/31/2005 -0500, Radu wrote:
David MacQuigg wrote:
Records that are already compiled to a simple one-query list will be processed normally. Records that require multiple queries will not be evaluated at all and will result in a temporary reject. Adding the m= syntax allows us to reach a conclusion in *most* of those multi-query cases, even without the additional queries. In a DoS storm, this could be *almost all* cases.
Actually... 99.98% of records that require multiple queries will be evaluated correctly after only 1 query, since in a DNS storm the email comes from everywhere but the few places that the mask specifies. So the incoming MTAs don't need to change behaviour, as they will only do the 2nd query for those 0.02% of emails that actually come from the neighbourhood of the mask.
0.02% may be a bit optimistic. :>) I think attackers will search the planet for domains that have their authorized senders scattered widely among a bunch of dynamic IP blocks they can hijack. We may still need an ability to block even the initial SYN floods from those blocks. This is outside the scope of SPF, however.

Well, in another week or so, my patch to MyDNS will be complete, and I'll start it up in spfCache mode, and ask it about all the domains that I publish SPF records. Then we'll have a look at the stats of how the current real-world records fare, and how efficient the compiled masks are.

Since I got involved in this project, I think I've developed a liking for statistics. ;)

              --- Mask Syntax ---
Here is an example of what we could do with a huge domain like rr.com:
m=~24(181ecb,181cc8,181ccc,181eda,185d2f,181909,
411805,185ea6,181d6d,424ba2,181802,412005) ... -all
Existing checkers ignore the m= and process the ... with all the usual redirects and includes. Upgraded checkers will be able to take advantage of the new syntax. The ~ means the mask result is ANDed with the remaining ... result. If it fails to match, you are done.
Not quite so. The -all and m=~ are in conflict. The mask result will be the same as the "all" prefix.
Also, the mask is evaluated before the first redirect=, no sooner. So the IP4:'s in the first record are checked against the sender's IP *WITHOUT LOOKING AT THE MASK*. This is because the mask refers *ONLY* to IP4: mechanisms which are not in the current record. It would be a waste of space for the mask to also cover the mechanisms which are clearly visible in the first record.
Also, I would prefer sticking to a mask representation that is easily human readable, so you can look at a record *set* and be able to easily tell if the mask is right or wrong, just by visual inspection and some simple arithmetic. If you can't easily tell, the trust in the mask is undermined (by humans). So when there are problems with the record, the first thing they do is disable mask generation.
Here are some suggestions on mask syntax. I think we should separate the goals of readability and versatility in the source record from compactness and efficiency in the compiled record. This will allow better optimization on both ends. If we really don't want users messing with the compiled record, we should not tempt them by making it easily editable. If you need to see what's in a compiled record, view it in the wizard.
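The prefix test itself does not depend on the disputed evaluation order. A minimal Python sketch of it follows, assuming the hex form of the proposed m= term shown above; the function names parse_mask and mask_matches are hypothetical, and nothing here is part of any SPF implementation.

```python
import ipaddress
import re


def parse_mask(term):
    """Parse a hypothetical 'm=~24(181ecb,...)' term.

    Returns (operator, prefix_len, set of prefix values as integers).
    """
    compact = "".join(term.split())  # drop spaces and line wraps
    match = re.fullmatch(r"m=([~+])(\d+)\(([0-9a-fA-F,]+)\)", compact)
    if match is None:
        raise ValueError(f"not a mask term: {term!r}")
    operator = match.group(1)
    prefix_len = int(match.group(2))
    prefixes = {int(p, 16) for p in match.group(3).split(",") if p}
    return operator, prefix_len, prefixes


def mask_matches(term, sender_ip):
    """Return True if the sender's leading prefix_len bits appear in the mask."""
    _, prefix_len, prefixes = parse_mask(term)
    ip_int = int(ipaddress.IPv4Address(sender_ip))
    return (ip_int >> (32 - prefix_len)) in prefixes


MASK = """m=~24(181ecb,181cc8,181ccc,181eda,185d2f,181909,
411805,185ea6,181d6d,424ba2,181802,412005)"""
```

With the ~ operator as described, a sender for which mask_matches returns False needs no further queries, while a match only means the rest of the record still has to be evaluated.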


Absolutely correct!

I mentioned this in the m= rules of engagement:

   2. If compiled with cron, or once in a while, -flatten should
   not be used. There will be left-over mechanisms whose
   resulting IP address may change (administrative gap)
   Thus, masks MUST not be added, since while they work initially
   they would break the record when the ISP changes their
   infrastructure.
   3. If compiled with cron, and -flatten not used, but the record
   compiles into a list of IPs anyway (ie, you list no mechanisms
   that lie outside your administrative boundary), then it may
   include masks if useful.

But perhaps it would be even safer if the stand-alone spfcompiler would not allow mask generation, just because of this opportunity to mess with the record. It is tempting to add one more ip, or one more A record at the end of a record chain, if one is allowed to cut and paste it into a zone file.

Instead, libspf2 will include mask functionality, but only the DNS servers that implement compilers should actually use it. I am assuming that such a server will automatically overwrite the compiled record periodically as it falls out of sync with the source SPF record. My patch to MyDNS does this. Also, the user interface makes the compiled records read only.

So I think it would be best if wizards that allow the user to cut and paste a compiled record should stick to a plain compiled record without masks, or otherwise convince the user not to mess with the compiled record. In that case, the more cryptic the compiled record, the better.

Another goal should be to facilitate later migration to a syntax that needs nothing but a mask. Currently, we must make masks ignorable, but when I think of the final standard authentication method that may emerge from a Grand Unification by the IETF, I think of everything needed for authentication in one DNS packet. This would most likely include an IP mask for quick screening at the earliest opportunity, a public key for final validation at the destination, and links to any other authentication records that may be necessary. Compactness of the mask will be an issue, unless they upgrade DNS to use larger UDP packets.

I don't see how it may be possible for a close-ended representation to represent open-ended information. This goes back to random numbers and chaos theory. Truly random numbers are not compressible because they contain the maximum possible entropy.

So if a domain really does use 5000 outgoing servers, and the IPs are scattered all over, I don't see how all that info can fit in 512 bytes. The optimizer is essentially a specialized lossless compressor. So there will be a limit past which the input can not be compressed any further without losing information.
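The limit can be put in numbers with a counting argument. The sketch below is only an illustration in Python, and it assumes the 5000 addresses are arbitrary, with no shared prefixes to exploit: any lossless encoding of such a set needs at least log2 of the number of possible 5000-element subsets of the IPv4 space. The function name min_bits_for_ip_set is hypothetical, and DNS packet overhead is ignored, which only makes the real budget smaller.

```python
import math


def min_bits_for_ip_set(count, space=2**32):
    """Lower bound in bits for encoding an arbitrary set of `count` distinct addresses.

    Computes log2(C(space, count)) as a sum of logs of the product terms
    (space - i) / (i + 1), which avoids building a huge integer.
    """
    return sum(math.log2((space - i) / (i + 1)) for i in range(count))


needed_bits = min_bits_for_ip_set(5000)
budget_bits = 512 * 8  # the 512 bytes mentioned above

print(f"needed: {needed_bits:.0f} bits, budget: {budget_bits} bits")
print(f"shortfall factor: {needed_bits / budget_bits:.1f}x")
```

The bound only bites when the addresses really are scattered. A domain whose servers cluster into a few blocks carries far less information, which is the case where a compiled mask pays off.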

Here is a sequence of compactions for the SPF record of a typical large domain, starting with the source record. The ... is the current SPF syntax, and is necessary as long as there are legacy SPF checkers that ignore masks. The operator after the m= could be any visible character. In my hypothetical syntax, ~ means this is a supplemental mask, the original syntax is in ...
+ means this mask is the whole enchilada, you don't even need a -all.

v=spf1 ip4:24.30.203.0/24 ip4:24.28.200.0/24 ip4:24.28.204.0/24
ip4:24.30.218.0/24 ip4:24.93.47.0/24 ip4:24.25.9.0/24
ip4:65.24.5.0/24 ip4:24.94.166.0/24 ip4:24.29.109.0/24
ip4:66.75.162.0/24 ip4:24.24.2.0/24 ip4:65.32.5.0/24 ... -all

v=spf1 m=~24.30.203/24,24.28.200/24,24.28.204/24,24.30.218/24,
24.93.47/24,24.25.9/24,65.24.5/24,24.94.166/24,24.29.109/24,
66.75.162/24,24.24.2/24,65.32.5/24 ... -all

v=spf1 m=~24(24.30.203,24.28.200,24.28.204,24.30.218,24.93.47,
24.25.9,65.24.5,24.94.166,24.29.109,66.75.162,24.24.2,65.32.5)
 ... -all

v=spf1 m=+24(181ecb,181cc8,181ccc,181eda,185d2f,181909,411805,
185ea6,181d6d,424ba2,181802,412005)

v=spf1 m=+24(Kapi2RPMcR1CxEJdXOkLCFECMQDTO0fzuShRvL8q0m5sitIH)

v=spf1 m=+24,12,a2       |<       ---  36 bytes ---        >|
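The step from the source record to the hex form can be checked mechanically. A minimal Python sketch, assuming only ip4: terms with one fixed prefix length are compacted and every other term is skipped; compact_ip4_terms is a hypothetical name, and the m=+ output is the syntax proposed in this thread, not anything a deployed SPF checker understands.

```python
import ipaddress


def compact_ip4_terms(record, prefix_len=24):
    """Collapse 'ip4:a.b.c.0/24' terms into a hypothetical 'm=+24(hex,...)' term."""
    width = (prefix_len + 3) // 4  # hex digits needed for prefix_len bits
    hex_prefixes = []
    for term in record.split():
        if not term.startswith("ip4:"):
            continue  # v=spf1, '...', -all and other terms are not compacted
        network = ipaddress.IPv4Network(term[4:])
        if network.prefixlen != prefix_len:
            raise ValueError(f"{term} is not a /{prefix_len} network")
        prefix = int(network.network_address) >> (32 - prefix_len)
        hex_prefixes.append(format(prefix, f"0{width}x"))
    return f"m=+{prefix_len}(" + ",".join(hex_prefixes) + ")"


SOURCE = """v=spf1 ip4:24.30.203.0/24 ip4:24.28.200.0/24 ip4:24.28.204.0/24
ip4:24.30.218.0/24 ip4:24.93.47.0/24 ip4:24.25.9.0/24
ip4:65.24.5.0/24 ip4:24.94.166.0/24 ip4:24.29.109.0/24
ip4:66.75.162.0/24 ip4:24.24.2.0/24 ip4:65.32.5.0/24 ... -all"""

print(compact_ip4_terms(SOURCE))
```

Each /24 costs six hex characters plus a comma in this form, against roughly twenty characters for the equivalent ip4: term.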

I appreciate the amount of work you put into presenting this, and I'm sorry to object, but when there is an "all", the redirect= modifier will be ignored, so you can't have daisy chaining.

I'm not sure I understand what you are proposing. Are these records daisy-chained with redirect=, or are they equivalents of each other?

If you're suggesting that SPF become a binary blob instead of a text string, then it would be no longer compatible with a TXT record. Or at least some resolver libraries might choke on binary data in a TXT record. Some of those resolver libraries live in routers and other appliances that have a long life cycle, and will not be upgraded for SPF. Anyway, it would be a good idea to check with RFC 1035.


Also, I think there's a bit of wisdom that we should not ignore.

Many protocols use human readable commands and terms, so that it's easy to see what the machines are doing. The SMTP conversation is a good example of that. The telnet conversation, likewise.

Since a very long SPF record that spans multiple TXT records is similar to a scripted conversation, I think it may be wise to keep that conversation as human readable as possible. I agree that it should be as compact as possible, but I think readability should not be sacrificed. Efficiency will be a by-product of those two requirements (readability 1st and compactness 2nd).

We don't need to specify all this now. Just make sure there is plenty of flexibility in whatever we deploy early, so we don't have to deprecate stuff later. I think the m=<operator>... general pattern should work for anything I can anticipate.

Sure. We will need to think a bit about how the deprecation might evolve, as we need to specify rules of evaluation for m= now. For instance, m= must be evaluated before redirect=. I don't think there will ever be a redirect= and an "all" in the same record based on the current rules of SPF evaluation.

That kind of a change will definitely require the version number to be changed, as it will make new and old implementations incompatible.

But if you see a migration path that uses the mask to get to the 1-query goal, let's look at it now, before we cast in stone the rules of m=


Radu.