Re: Mask Syntax
2005-04-01 10:38:10
David MacQuigg wrote:
At 10:57 PM 3/31/2005 -0500, Radu wrote:
David MacQuigg wrote:
Records that are already compiled to a simple one-query list will be
processed normally. Records that require multiple queries will not
be evaluated at all and will result in a temporary reject. Adding
the m= syntax allows us to reach a conclusion in *most* of those
multi-query cases, even without the additional queries. In a DoS
storm, this could be *almost all* cases.
Actually... 99.98% of records that require multiple queries will be
evaluated correctly after only 1 query, since in a DoS storm the email
comes from everywhere but the few places that the mask specifies. So the
incoming MTAs don't need to change behaviour, as they will only do the
2nd query for those 0.02% of emails that actually come from the
neighbourhood of the mask.
0.02% may be a bit optimistic. :>) I think attackers will search the
planet for domains that have their authorized senders scattered widely
among a bunch of dynamic IP blocks they can hijack. We may still need
an ability to block even the initial SYN floods from those blocks. This
is outside the scope of SPF, however.
Well, in another week or so, my patch to MyDNS will be complete, and
I'll start it up in spfCache mode, and ask it about all the domains that
I publish SPF records. Then we'll have a look at the stats of how the
current real-world records fare, and how efficient the compiled masks are.
Since I got involved in this project, I think I've developed a liking for
statistics. ;)
--- Mask Syntax ---
Here is an example of what we could do with a huge domain like rr.com:
m=~24(181ecb,181cc8,181ccc,181eda,185d2f,181909,
411805,185ea6,181d6d,424ba2,181802,412005) ... -all
Existing checkers ignore the m= and process the ... with all the
usual redirects and includes. Upgraded checkers will be able to take
advantage of the new syntax. The ~ means the mask result is ANDed
with the remaining ... result. If it fails to match, you are done.
Not quite so. The -all and m=~ are in conflict. The mask result will
be the same as the "all" prefix.
Also, the mask is evaluated before the first redirect=, no sooner. So
the IP4:'s in the first record are checked against the sender's IP
*WITHOUT LOOKING AT THE MASK*. This is because the mask refers *ONLY*
to IP4: mechanisms which are not in the current record. It would be a
waste of space for the mask to also cover the mechanisms which are
clearly visible in the first record.
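
A minimal sketch of the evaluation order described above, in Python. The
function name precheck, its arguments, and the "follow" return value are
hypothetical and belong to no existing SPF library. The sketch assumes the
mask lists only IPv4 networks, and that a mask miss yields whatever result
the "all" prefix would give.

```python
from ipaddress import ip_address, ip_network


def precheck(sender, direct_networks, mask_networks, has_redirect,
             default_result="fail"):
    """Decide what can be concluded from the first record alone.

    Returns "pass", default_result, or "follow" (more queries are needed).
    All names here are illustrative assumptions.
    """
    ip = ip_address(sender)

    # Step 1: ip4: mechanisms visible in the first record are checked
    # directly, without looking at the mask.
    if any(ip in ip_network(net) for net in direct_networks):
        return "pass"

    # Step 2: with no redirect= there is nothing beyond this record.
    if not has_redirect:
        return default_result

    # Step 3: the mask is consulted just before following redirect=.
    # It covers only mechanisms that are NOT in the current record.
    if mask_networks and not any(ip in ip_network(net) for net in mask_networks):
        return default_result  # conclusion reached without a 2nd query

    # The sender is in the neighbourhood of the mask: do the extra query.
    return "follow"
```

Only senders that fall inside one of the mask networks reach the "follow"
branch, which is the small fraction of mail that still costs a second query.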
Also, I would prefer sticking to a mask representation that is
easily human readable, so you can look at a record *set* and be able
to easily tell if the mask is right or wrong, just by visual
inspection and some simple arithmetic. If you can't easily tell, the
trust in the mask is undermined (by humans). So when there are
problems with the record, the first thing they do is disable mask
generation.
Here are some suggestions on mask syntax. I think we should separate
the goals of readability and versatility in the source record from
compactness and efficiency in the compiled record. This will allow
better optimization on both ends. If we really don't want users messing
with the compiled record, we should not tempt them by making it easily
editable. If you need to see what's in a compiled record, view it in
the wizard.
Absolutely correct!
I mentioned this in the m= rules of engagement:
2. If compiled with cron, or once in a while, -flatten should
not be used. There will be left-over mechanisms whose
resulting IP address may change (administrative gap)
Thus, masks MUST NOT be added, since while they work initially
they would break the record when the ISP changes their
infrastructure.
3. If compiled with cron, and -flatten not used, but the record
compiles into a list of IPs anyway (ie, you list no mechanisms
that lie outside your administrative boundary), then it may
include masks if useful.
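
Rules 2 and 3 boil down to one test: a periodically compiled record may
carry a mask only if nothing in it still depends on somebody else's DNS.
A rough Python sketch of that test follows. The function names are
hypothetical, and the sketch assumes that anything other than ip4:, ip6:
and "all" counts as a left-over mechanism.

```python
def leftover_mechanisms(compiled_record):
    """Return the terms that still need DNS lookups after compilation."""
    terms = compiled_record.split()[1:]  # skip the v=spf1 version tag
    leftovers = []
    for term in terms:
        bare = term.lstrip("+-~?")  # drop the qualifier, if any
        if bare.startswith(("ip4:", "ip6:")) or bare == "all":
            continue  # fully resolved, cannot change behind our back
        leftovers.append(term)
    return leftovers


def mask_allowed(compiled_record):
    """Rule 2 forbids masks when left-overs exist; rule 3 allows them otherwise."""
    return not leftover_mechanisms(compiled_record)
```

The domain names used to exercise it (for example include:isp.example) are
placeholders, not real records.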
But perhaps it would be even safer if the stand-alone spfcompiler would
not allow mask generation, just because of this opportunity to mess with
the record. It is tempting to add one more ip, or one more A record at
the end of a record chain, if one is allowed to cut and paste it into a
zone file.
Instead, libspf2 will include mask functionality, but only the DNS
servers that implement compilers should actually use it. I am assuming
that such a server will automatically overwrite the compiled record
periodically as it falls out of sync with the source SPF record. My
patch to MyDNS does this. Also, the user interface makes the compiled
records read only.
So I think it would be best if wizards that allow the user to cut and
paste a compiled record should stick to a plain compiled record without
masks, or otherwise convince the user not to mess with the compiled
record. In that case, the more cryptic the compiled record, the better.
Another goal should be to facilitate later migration to a syntax that
needs nothing but a mask. Currently, we must make masks ignorable, but
when I think of the final standard authentication method that may emerge
from a Grand Unification by the IETF, I think of everything needed for
authentication in one DNS packet. This would most likely include an IP
mask for quick screening at the earliest opportunity, a public key for
final validation at the destination, and links to any other
authentication records that may be necessary. Compactness of the mask
will be an issue, unless they upgrade DNS to use larger UDP packets.
I don't see how it may be possible for a close-ended representation to
represent open-ended information. This goes back to random numbers
and chaos theory. Truly random numbers are not compressible because they
contain the maximum possible entropy.
So if a domain really does use 5000 outgoing servers, and the IPs are
scattered all over, I don't see how all that info can fit in 512 bytes.
The optimizer is essentially a specialized lossless compressor. So there
will be a limit past which the input can not be compressed any further
without losing information.
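
To put a number on that limit, here is a back-of-the-envelope Python sketch.
It assumes the worst case: the addresses are scattered with no shared
prefixes, so every set of that size is equally likely. The function name is
made up for illustration.

```python
import math


def min_bytes_for_ip_set(count, space_bits=32):
    """Information-theoretic lower bound, in bytes, for naming `count`
    distinct addresses out of a 2**space_bits address space.

    Assumes no structure at all in the set (worst case for a compressor).
    """
    subsets = math.comb(2 ** space_bits, count)  # number of possible sets
    bits = (subsets - 1).bit_length()            # ceil(log2(subsets))
    return math.ceil(bits / 8)


if __name__ == "__main__":
    print(min_bytes_for_ip_set(5000))
```

For 5000 scattered IPv4 addresses the bound comes out on the order of 13 KB,
far beyond 512 bytes. Real records do better only to the extent that the
addresses cluster into shared prefixes.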
Here is a sequence of compactions for the SPF record of a typical large
domain, starting with the source record. The ... is the current SPF
syntax, and is necessary as long as there are legacy SPF checkers that
ignore masks. The operator after the m= could be any visible
character. In my hypothetical syntax:
~ means this is a supplemental mask; the original syntax is in ...
+ means this mask is the whole enchilada; you don't even need a -all.
v=spf1 ip4:24.30.203.0/24 ip4:24.28.200.0/24 ip4:24.28.204.0/24
ip4:24.30.218.0/24 ip4:24.93.47.0/24 ip4:24.25.9.0/24
ip4:65.24.5.0/24 ip4:24.94.166.0/24 ip4:24.29.109.0/24
ip4:66.75.162.0/24 ip4:24.24.2.0/24 ip4:65.32.5.0/24 ... -all
v=spf1 m=~24.30.203/24,24.28.200/24,24.28.204/24,24.30.218/24,
24.93.47/24,24.25.9/24,65.24.5/24,24.94.166/24,24.29.109/24,
66.75.162/24,24.24.2/24,65.32.5/24 ... -all
v=spf1 m=~24(24.30.203,24.28.200,24.28.204,24.30.218,24.93.47,
24.25.9,65.24.5,24.94.166,24.29.109,66.75.162,24.24.2,65.32.5)
... -all
v=spf1 m=+24(181ecb,181cc8,181ccc,181eda,185d2f,181909,411805,
185ea6,181d6d,424ba2,181802,412005)
v=spf1 m=+24(Kapi2RPMcR1CxEJdXOkLCFECMQDTO0fzuShRvL8q0m5sitIH)
v=spf1 m=+24,12,a2 |< --- 36 bytes --- >|
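
The hex form can be checked against the dotted form mechanically. The
Python sketch below expands each hex entry back into a CIDR block. The
function name is hypothetical, and it assumes byte-aligned prefix lengths
(8, 16, 24), which is all the example above needs.

```python
def decode_hex_mask(prefix_len, entries):
    """Expand entries like '181ecb' into CIDR strings like '24.30.203.0/24'."""
    if prefix_len not in (8, 16, 24, 32):
        raise ValueError("sketch handles byte-aligned prefixes only")
    nbytes = prefix_len // 8
    networks = []
    for entry in entries:
        raw = bytes.fromhex(entry)
        if len(raw) != nbytes:
            raise ValueError("entry %r does not match /%d" % (entry, prefix_len))
        octets = list(raw) + [0] * (4 - nbytes)  # pad the host part with zeros
        networks.append(".".join(str(o) for o in octets) + "/" + str(prefix_len))
    return networks


HEX_ENTRIES = ["181ecb", "181cc8", "181ccc", "181eda", "185d2f", "181909",
               "411805", "185ea6", "181d6d", "424ba2", "181802", "412005"]

if __name__ == "__main__":
    for net in decode_hex_mask(24, HEX_ENTRIES):
        print(net)
    print(len(HEX_ENTRIES) * 3, "bytes when packed")
```

Twelve /24 prefixes at three bytes each give the 36 bytes shown in the last
form, and 36 bytes take 48 characters in a base64-style encoding, which is
the length of the string in the form before it.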
I appreciate the amount of work you put into presenting this, and I'm
sorry to object, but when there is an "all", the redirect= modifier will
be ignored, so you can't have daisy chaining.
I'm not sure I understand what you are proposing. Are these records
daisy-chained with redirect=, or are they equivalents of each other?
If you're suggesting that SPF become a binary blob instead of a text
string, then it would be no longer compatible with a TXT record. Or at
least some resolver libraries might choke on binary data in a TXT
record. Some of those resolver libraries live in routers and other
appliances that have a long life cycle, and will not be upgraded for
SPF. Anyway, it would be a good idea to check with RFC1035.
Also, I think there's a bit of wisdom that we should not ignore.
Many protocols use human readable commands and terms, so that it's easy
to see what the machines are doing. The SMTP conversation is a good
example of that. The telnet conversation, likewise.
Since a very long SPF record that spans multiple TXT records is similar
to a scripted conversation, I think it may be wise to keep that
conversation as human readable as possible. I agree that it should be as
compact as possible, but I think readability should not be sacrificed.
Efficiency will be a by-product of those two requirements, (readability
1st and compactness 2nd).
We don't need to specify all this now. Just make sure there is plenty
of flexibility in whatever we deploy early, so we don't have to
deprecate stuff later. I think the m=<operator>... general pattern
should work for anything I can anticipate.
Sure. We will need to think a bit about how the deprecation might
evolve, as we need to specify rules of evaluation for m= now.
For instance, m= must be evaluated before redirect=. I don't think there
will ever be a redirect= and an "all" in the same record based on the
current rules of SPF evaluation.
That kind of a change will definitely require the version number to be
changed, as it will make new and old implementations incompatible.
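
The point about redirect= and "all" not coexisting is easy to lint for. A
small Python sketch, with a hypothetical function name and placeholder
domains: under the current rules an "all" mechanism always matches, so a
redirect= in the same record is never followed.

```python
def redirect_is_dead(record):
    """True if the record has both an "all" mechanism and a redirect=
    modifier, in which case the redirect can never be followed."""
    terms = record.split()
    has_all = any(t.lstrip("+-~?").lower() == "all" for t in terms)
    has_redirect = any(t.lower().startswith("redirect=") for t in terms)
    return has_all and has_redirect
```

A compiler could run a check like this before emitting a mask, since a mask
that is meant to be consulted ahead of redirect= makes no sense in a record
whose redirect is unreachable.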
But if you see a migration path that uses the mask to get to the 1-query
goal, let's look at it now, before we cast in stone the rules of m=
Radu.