Re: short circuiting evaluation
2005-03-25 11:38:56
Andy Bakun wrote:
This TTL discussion on the A records that exists uses is academic, as
I've changed my position on it absolutely needing to be low.
I think changing the TTL based on load is extremely dangerous. I assume
you mean that the TTL is increased when the load increases.
The TTL I was suggesting for these exists: records is either around zero
or significantly smaller than you'd normally set the TTL to if you
wanted to avoid significant downtime, such that even an increase like 2x
or 3x is still less than 5 or 10 minutes.
If you're getting hammered with queries at x per second with a TTL of 10
seconds, then your load would be x/2 if you increase the TTL to 20
seconds. This isn't significantly different, but when there's an attack
going on, this would reduce your load without significantly hindering
your ability to fail over.
So if
something fails under the heavier load and you have to relocate it,
you'll suffer longer downtime because the TTLs are longer.
There's also nothing keeping it from going the other way -- set the TTL
to 1 hour in the normal case. If it looks like your load is up because
your domain is being forged, decrease it so that in the case you do need
to fail over, your downtime window is decreased. That is, if you've
been serving TTLs of 1 hour, and you change to 30 minutes, then cache
entries will expire in an average of 45 minutes, thereby reducing the
length of your downtime/inaccessibility due to caching.
The instant you publish a 30-minute record, all the records you've
issued thus far have been 1-hour. So all copies out there have 1-hour
expirations. Changing the published TTL does not instantly change the
average TTL out there; it takes 1 hour for all the previously issued
records to expire and be refreshed with records carrying the new TTL.
Incidentally, if you have a spread of 1-hour TTLs out in the field,
their average time-to-refresh (remaining TTL) is 30 minutes. If you
have a spread of 30-minute TTLs, the average time-to-refresh is 15
minutes. So if you do what you said, the average remaining TTL out
there would drop from 30 minutes to 15 minutes over the course of 1 hour.
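This decay can be sketched numerically. A toy model, assuming resolvers re-fetch the instant a cached record expires and are spread uniformly through their refresh cycle (the function name is mine):

```python
# Sketch of the TTL arithmetic above (assumed model: caches are spread
# uniformly through their refresh cycle, so the average remaining TTL
# of cached copies is half the published TTL).
def avg_remaining_ttl(published_ttl_s: int) -> float:
    """Mean remaining TTL across a uniform spread of caches."""
    return published_ttl_s / 2

before = avg_remaining_ttl(3600)   # average while serving 1-hour TTLs
after  = avg_remaining_ttl(1800)   # average once every cache holds 30-min copies
print(before, after)               # 1800.0 900.0 (i.e., 30 min -> 15 min)
# The old 1-hour copies survive up to an hour after the change, so the
# field average drifts from 30 minutes down to 15 minutes over that hour.
```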
I'll think about whether it would actually help in this
scenario, but for a typical case, say the TTL of a web-server address,
it's definitely a bad idea to increase the TTL as you're pushing the
equipment closer to failure.
Sure, except we are specifically not talking about "typical cases" here,
and especially not web-servers. If I send email and your server is
overloaded, I may get a DSN from my server saying it temporarily can't
connect, but assuming your load comes down in a reasonable amount of
time, the mail will go through with no action on any one else's part.
Web servers are really different, in that if your retail website isn't
responding, people will immediately go to your competitors.
In the "typical case", email goes through. The atypical case is the
mythical SPF-doom virus that is pounding on mail servers causing mail
servers to pound on DNS through SPF. I thought we were trying to
optimize for the atypical case here.
Slight correction: The virus _taps_ on MTAs, and MTAs _pound_ on DNS. ;)
You're right, the mail situation is somewhat different. While I consider
a 4-hour delay in email unacceptable, YMMV. 4 hours is the default time
after which sendmail warns that it hasn't been able to deliver. But even
when it gives the warning, it's still unknown how much longer it will
have to keep trying.
But let's not concentrate on that since it takes energy away from the
real discussion. We can start a separate thread for fail-over if you like.
Ok, but let's look at a higher connection rate for a second.
If you get 255 connections from around the world, 1 from each class A
net, you have to do 1 query (TXT) + 243 A queries (for those which are in
different class A nets than ebay's servers) + 8 queries for those that
are in the same class A nets as ebay.
This is a fine thought experiment, but most likely not that realistic.
Chances are, most, if not all, zombied machines are going to come from
some small (and maybe even predictable) set of class A addresses during
any given single attack. It seems most of the single digit class As are
out immediately, for example. I'd think large attacks are going to come
from IP blocks that are hosting connectivity services sold to the
public/consumers.
Very well. I used my 3-month long maillog as a research resource again,
and I found that my mail server's port 25 is accessible from most
corners of the world. Below I show the distribution of incoming
connections from different class A hosts.
My host sees very modest volumes of SMTP activity, so it checks very
few SPF records; a more central mail server would see far more
connections. My total number of connections per day is about 48.
To estimate what a more central mail server might see,
I multiplied the number of connections I see by a factor of 20
(still very modest). Hotmail bounces 2 billion spams a day. A large site
like that would have to do even more exists queries than my setup*20.
They may get much closer to the worst case I described before. But let's
see how close a very modest _realistic_ case would get.
Anyway, I have applied the distribution I saw at my server, but used 20x
the volume, and I will compare how many queries your exists method might
require, vs. how many the mask method might require:
Please note that multiplying the number of connections does not change
the shape of the distribution. For A nets that I saw no connections
from, the multiplication will not create any connections. However, for
nets where I get a small number of daily connects on average, that daily
average will be increased 20x (e.g., from net 212 I see about 0.3
connections per day; after the volume increase, I would see about 6
connections per day - the table rounds 6.7 to 7 for net 212).
Column A is the Class A net.
Column B is the total number of connections from an IP in that net that
I have seen in the last 3 months
Column C is the percentage out of total number of connects.
Column D is the average daily number of connects from that net (col B/90)
Column E is the estimated number of connects for a more central site
(ohmi's average daily connects * 20). I will refer to this as
connections_per_24H.
For the exists column (F), I have used the following formula:
if (connections_per_24H > 24) queries = 24;
else queries = connections_per_24H
Column G is the number of MX queries you'd have to do. I have assumed
that we keep to the ebay example. In that context, your published TXT
records would be "v=spf1 -exists:%{ir1}._spf.%{d} +... the long list of
IPs, spread over the same number of _s extensions as my example". Since
my mask takes 61 bytes and your mechanism takes 24 bytes, perhaps in
your case the ebay record may be shorter by 1 TXT query.
Column H is the number of TXT queries that you'd have to do with the
exists method (= (1+7) × column G = 8 × column G). I gave you the
benefit of the doubt and assumed that your functionally identical record
to mine would be 8 TXT records instead of 9, because of the shorter
mask string.
For column J I used the number of TXT queries that my previous ebay
record would generate (= (1+8) × column G = 9 × column G) if we used the
mask method, with a mask equivalent to your "exists mask":
-m=64/6 m=80 m=194 m=203 m=206 m=209 m=210 m=212 m=216 m=220
I have already pointed out that this mask is very poor. A better mask
would have higher CIDRs and therefore be more narrow. But I want to
compare apples to apples. The information published by the DNS server is
the same in both cases, except it is published in different ways. In my
way, it is published as the above 61 string appended to the the top TXT
record at ebay.com, and in your case, it would be published with the
254-13=243 A records generated by the compiler. Actually in both cases,
the information is generated by the compiler.
Since my mask can be made narrower without adding extra queries, the
numbers in column J should read "<=", i.e., _at most_ 9 queries.
Nonetheless, the total for column J is based on 9 queries. It looks
like in the past 3 months I got no connects from one of the nets that
ebay uses, otherwise the total would have been 9*24.
A B C D E F G H J
65 849 19.6% 9.4 188.7 24 24 192 216
66 562 13.0% 6.2 124.9 24 24 192 216
61 208 4.8% 2.3 46.2 24 0 1 1
24 177 4.1% 2.0 39.3 24 0 1 1
211 177 4.1% 2.0 39.3 24 0 1 1
218 153 3.5% 1.7 34.0 24 0 1 1
200 130 3.0% 1.4 28.9 24 0 1 1
69 112 2.6% 1.2 24.9 24 0 1 1
220 110 2.5% 1.2 24.4 24 0 1 1
82 108 2.5% 1.2 24.0 24 0 1 1
202 101 2.3% 1.1 22.4 22 0 1 1
222 97 2.2% 1.1 21.6 22 0 1 1
68 89 2.1% 1.0 19.8 20 0 1 1
221 85 2.0% 0.9 18.9 19 0 1 1
81 74 1.7% 0.8 16.4 16 0 1 1
207 69 1.6% 0.8 15.3 15 0 1 1
210 66 1.5% 0.7 14.7 15 15 120 135
132 63 1.5% 0.7 14.0 14 0 1 1
219 59 1.4% 0.7 13.1 13 0 1 1
38 57 1.3% 0.6 12.7 13 0 1 1
67 56 1.3% 0.6 12.4 12 12 96 108
80 56 1.3% 0.6 12.4 12 12 96 108
213 56 1.3% 0.6 12.4 12 0 1 1
83 54 1.2% 0.6 12.0 12 0 1 1
4 51 1.2% 0.6 11.3 11 0 1 1
62 49 1.1% 0.5 10.9 11 0 1 1
217 48 1.1% 0.5 10.7 11 0 1 1
64 44 1.0% 0.5 9.8 10 10 80 90
84 43 1.0% 0.5 9.6 10 0 1 1
203 42 1.0% 0.5 9.3 9 9 72 81
216 38 0.9% 0.4 8.4 8 8 64 72
201 36 0.8% 0.4 8.0 8 0 1 1
206 36 0.8% 0.4 8.0 8 8 64 72
209 33 0.8% 0.4 7.3 7 7 56 63
192 32 0.7% 0.4 7.1 7 0 1 1
194 31 0.7% 0.3 6.9 7 7 56 63
212 30 0.7% 0.3 6.7 7 7 56 63
85 22 0.5% 0.2 4.9 5 0 1 1
60 21 0.5% 0.2 4.7 5 0 1 1
12 20 0.5% 0.2 4.4 4 0 1 1
70 19 0.4% 0.2 4.2 4 0 1 1
168 17 0.4% 0.2 3.8 4 0 1 1
63 15 0.3% 0.2 3.3 3 0 1 1
204 13 0.3% 0.1 2.9 3 0 1 1
205 13 0.3% 0.1 2.9 3 0 1 1
59 11 0.3% 0.1 2.4 2 0 1 1
195 10 0.2% 0.1 2.2 2 0 1 1
129 8 0.2% 0.1 1.8 2 0 1 1
198 8 0.2% 0.1 1.8 2 0 1 1
193 7 0.2% 0.1 1.6 2 0 1 1
151 6 0.1% 0.1 1.3 1 0 1 1
163 6 0.1% 0.1 1.3 1 0 1 1
131 5 0.1% 0.1 1.1 1 0 1 1
144 5 0.1% 0.1 1.1 1 0 1 1
208 5 0.1% 0.1 1.1 1 0 1 1
130 4 0.1% 0.0 0.9 1 0 1 1
141 4 0.1% 0.0 0.9 1 0 1 1
71 3 0.1% 0.0 0.7 1 0 1 1
136 3 0.1% 0.0 0.7 1 0 1 1
148 3 0.1% 0.0 0.7 1 0 1 1
161 3 0.1% 0.0 0.7 1 0 1 1
166 3 0.1% 0.0 0.7 1 0 1 1
196 3 0.1% 0.0 0.7 1 0 1 1
128 2 0.0% 0.0 0.4 0 0 1 1
138 2 0.0% 0.0 0.4 0 0 1 1
167 2 0.0% 0.0 0.4 0 0 1 1
43 1 0.0% 0.0 0.2 0 0 1 1
58 1 0.0% 0.0 0.2 0 0 1 1
133 1 0.0% 0.0 0.2 0 0 1 1
142 1 0.0% 0.0 0.2 0 0 1 1
145 1 0.0% 0.0 0.2 0 0 1 1
149 1 0.0% 0.0 0.2 0 0 1 1
150 1 0.0% 0.0 0.2 0 0 1 1
152 1 0.0% 0.0 0.2 0 0 1 1
155 1 0.0% 0.0 0.2 0 0 1 1
157 1 0.0% 0.0 0.2 0 0 1 1
159 1 0.0% 0.0 0.2 0 0 1 1
162 1 0.0% 0.0 0.2 0 0 1 1
164 1 0.0% 0.0 0.2 0 0 1 1
165 1 0.0% 0.0 0.2 0 0 1 1
Total: 4338 48.2 964 625 24 192 216
Column F's total is the sum, because each line in the table
queries a different hostname, and thus the lines are separate cache groups.
The totals for columns G, H, and J are maximums, as all lines query the
same hostnames, and thus they are part of the same cache group. (If you
query all 9 TXT records for a connection from net 210, you will not
query them again for a connection from another net, because all 9
records are already in the local cache.)
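The aggregation rule behind the totals row can be sketched as follows. This is illustrative only: the sample rows are made up, not the real measurements, and the function name is mine.

```python
# Per-net exists lookups use distinct hostnames (separate cache
# entries, so column F's total is a sum), while the TXT/MX lookups
# share one hostname (one cache entry, so the daily total is capped by
# the TTL and the column total is a maximum).
rows = [
    # (class A net, estimated connections/24h, is it a net the sender uses?)
    (65, 189, True),
    (66, 125, True),
    (61,  46, False),
    (24,  39, False),
]

def capped(conns, ttl_hours=1):
    # A 1-hour TTL means a given name is fetched at most 24 times a day.
    return min(conns, 24 // ttl_hours)

col_f = sum(capped(c) for _, c, _ in rows)                          # distinct names: sum
col_g = max((capped(c) for _, c, used in rows if used), default=0)  # shared name: max
print(col_f, col_g)   # 96 24
```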
So in this realistic example, you generate 625+24+192=841 queries, while
I generate fewer than 216. My mask can be improved a lot without adding
extra queries, and that would make my total of 216 even lower. The
better the mask, the lower the query count. In order for your mask to be
more narrow, you have to publish more A records, as you have shown
(GENERATE 1.$ example), which would generate even more queries. So as
the compiler tries to make the record more efficient, in my mask's case
it grows the mask from 61 bytes to something longer and decreases the
number of lookups (probably drastically). In your case, the top record
remains the same length as it changes from %{1ir} to %{2ir}, but it now
requires a query for every unique request from a class B network, which
brings your total for column F toward 625*625, as each line in my table
blows up into 254 lines to accommodate all the class Bs. Maybe less than
625*625, but the query requirement still grows geometrically.
Also note the way the distribution tails off. Because I don't have a very
large volume, some of the nets I see only once every 3 months; others I
see even less frequently. Because this distribution tends asymptotically
to zero, if the volume of connections were higher, I'd see that my host
is accessible by SMTP from even more class A nets.
If the distribution had dropped sharply, such that I saw no fewer than
N>2 connections from any one class A net, I could have _assumed_ that I
had seen connections from all the possible class A networks that a
connection could come from.
This conclusion about distribution works against your exists proposal,
and in favour of my mask proposal, as you can see.
1 for the exists and 1 for the MX (if the entire MX list fits in the
additional portion of the MX response). In any case, this gains back
some of the usefulness of the other mechanisms without having to
recompile (or test for needing to recompile) continually and without
forcing their complex evaluation in all instances. The cache expire
time for the records used in exists should definitely be kept low.
Excellent! So let's look at a 24-hour period. Say that we get 2540
connections per hour, 10 from each class A network. Let's assume a TTL
of 24H for the MX, 1H for the exists records, and 1H for the TXT record.
Recall I explained why the exists records have the same TTL as the TXT
record.
Total traffic with your method:
1*MX + 24*TXT + 24*254*A = 6121 queries during the 24H period.
In total, you called ns_resolv 24*2540*3 times (182880 times). So the
cache saved you traffic 96.6% of the time.
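That tally can be re-derived in a few lines. This is a hedged restatement of the numbers already quoted above; the variable names are mine:

```python
# 24-hour tally for the exists method, per the text: 2540 connections
# per hour, 254 class A exists records, TTLs of 24h for the MX and 1h
# for the TXT and exists records.
HOURS = 24
CONNS_PER_HOUR = 2540

mx_q  = 1                 # 24h TTL: fetched once per day
txt_q = HOURS             # 1h TTL: fetched once per hour
a_q   = HOURS * 254       # 254 exists names, each fetched once per hour
queries = mx_q + txt_q + a_q

lookups = HOURS * CONNS_PER_HOUR * 3   # every connection resolves 3 names
hit_rate = 1 - queries / lookups
print(queries, f"{hit_rate:.2%}")      # 6121 96.65%
```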
This is a very convenient calculation that makes my masking method using
exists look significantly worse. I don't believe it actually needs to
be that bad. You assume that all the queries in exists would need a
short TTL, or even the same TTL, and I initially agreed because of the
failover scenario. One of the advantages of my method, even taking into
account your "I need a short TTL so I can fail over" scenario, is that
all the other SPF mechanisms are usable (as long as they don't cross
administrative boundaries where you don't know how things could change)
without having to recompile the record at all.
I have taken this into account in the analysis above. It was my honest
mistake, and I apologize.
You should keep the TTL for any given A record used in exists low if you
plan on using an address in that class A as part of your failover
plan. Fortunately, most of them won't be used. If you're the kind of
person who is prepared for failover such that the TTL is a concern, you
already know where you are going to failover to (it may even be one of
the addresses that is already listed in the MX). Say my MX is on 1/8
and my failover is at a different ISP (which is otherwise unlisted, not
even as a backup MX) on 2/8. I have these records:
24h IN TXT "v=spf1 -exists:%{ir1}._spf.%{d}"
" +mx -all"
1h IN MX 10 mailhost
$GENERATE 3-254 $._spf 24h IN A 127.0.0.1
2._spf 1h IN A 127.0.0.1
mailhost 1h IN A 1.1.1.1
I would prefer to keep to our ebay example.
But let's look quickly at this new example you propose.
Both the "exists mask" and the "modifier mask" would be inserted by the
compiler, as it sees necessary to save traffic, right?
In that case, the compiler would never insert the exists mechanism above,
since it only makes matters worse.
It would compile that record simply to
1h IN TXT "v=spf1 ip4:1.1.1.1 -all"
This record would be queried at most 24 times in a day, and it preserves
the initial intent that the mailhost may be moved at 1-hour's notice.
Even if it used my mask modifiers, it would not add any, since all the
addresses are visible in the first TXT query, and there would be no
subsequent queries. With no possible savings, there's no need for
masking of any kind. That's why I wanted to stick to the ebay example:
even when it's fully optimized, it doesn't fit in 1 UDP packet, so it
must be broken up into multiple queries. That's where the masks shine -
in avoiding subsequent queries when it is not possible to avoid them by
compiling everything into a list of IPs that fits in a UDP packet.
That is, the TXT and 252 of the class A exists records are cachable for
24 hours, and the ones I need to change if I fail over (two As and the
MX) are 1 hour. At 2540 connections per hour, 10 from each class A,
this design makes
24*MX + 1*TXT + 1*252*A + 24*2A = 325 queries
calling ns_resolv 24*2540*3 (182880) times, with a cache hit percentage
of 99.82%
Correct. 325 queries vs. 24. But I believe you made an error in saying
that any masking is necessary in this simple case. Once you take the
masking away, there's nothing left to compare :)
(like we have previously, I'm again assuming the load of 1*MX includes
the lookup of the resultant As, thus it's fixed).
I have done the same, in the interest of comparing apples to apples. If
we're both right, then great, and if we're both wrong, our calculations
would be off by the same factor, so the comparison is still valid :)
By taking our actual
current and failover network information into account, the number of
queries has been reduced by nearly 95% over that 24-hour period, and
the cache hit rate is significantly better. And the TTLs that should be
longer can still be, without significant hits to our failover plan.
Failover is a fascinating topic in itself, but for our purposes let's
just ack that it exists. It is taken into account when we chose the
1-hour TTLs vs. the 24-hour TTLs.
If I'm more correct about zombie distribution than you are, then the
largest term in the number of queries per day calculation, the 1*252*A,
might be significantly less because of the distribution of zombiable
computers being concentrated on popular class As.
I've shown above the distribution I really experience, not a theoretical
distribution. Hopefully that will settle any claims of which
distribution is more realistic.
I've included your original calculations for your method below for
reference.
With my method, mask included at the end of the top level TXT, total of
9 records with the same TTL of 1H. The records are fully compiled and
contain only IP4 and redirects.
Total traffic with my method:
24*1*TXT = 24 queries, if the mask is top notch.
24*9*TXT = 216 queries, if the mask is useless.
More likely the actual number of queries is between 24 and 216.
In total, I called ns_resolv 24*2540*1=60960 times if the mask was top
notch and 24*2540*9=548640 times if the mask was crap. So the cache
saved me traffic exactly 99.96% of the time, whether the mask was good
or not.
As you can see, there's a huge difference, and most of it is owed to the
fact that the exists are AGAU, even though 96.6% _looks_ like a pretty
high cache efficiency.
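The per-day totals compared in this thread can be re-derived in one sketch. The numbers come from the text above; the variable names and layout labels are mine:

```python
# 24 hours at 2540 connections/hour (10 from each class A net).
# exists method with failover-aware TTLs (the layout quoted above):
failover = 24 * 1 + 1 + 1 * 252 + 24 * 2   # 1h MX + 24h TXT + 252 24h As + two 1h As
# mask method, all 9 TXT records on a 1h TTL:
mask_best  = 24 * 1   # perfect mask: only the top TXT record is ever fetched
mask_worst = 24 * 9   # useless mask: all 9 TXT records fetched every hour

lookups = 24 * 2540 * 9                    # ns_resolv calls in the worst case
hit = 1 - mask_worst / lookups
print(failover, mask_best, mask_worst, f"{hit:.2%}")   # 325 24 216 99.96%
```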
Let's keep in mind that we are not comparing the same exact records, but
it shouldn't matter much. If all of ebay's sending IPs can be encoded
in a single A record, you could substitute the lookup for that A record
for the +mx in my sample record. It would still be the same load.
Well, in my analysis above, I reverted to using the same records and,
all else being equal, looked only at the differences between the masking
schemes. Comparing different records is a waste of our time.
As a comparison, and for the record, here are the numbers for the same
record without using any kind of masking:
24h IN TXT "v=spf1 +mx -all"
1h IN MX 10 mailhost
mailhost 1h IN A 1.1.1.1
That's 24*MX + 1*TXT = 25 queries, and calling ns_resolv 24*2540*2
(121920) times with a cache hit ratio of 99.9795%. Note, again, I
didn't include the A record lookup for mailhost, because it wasn't
included in any of the other calculations.
Short records are not relevant for the masking discussion. If a compiler
is somewhere in the loop, this record would become:
1h IN TXT "v=spf1 ip4:1.1.1.1 -all"
This compiled record results in 24 queries, instead of the initial 25.
BUT!!!
In an everyday case where there is no virus, this 24 vs. 25 query
comparison is valid only if the MTA in question receives a lot of email
from the publishing domains (more than 1 per hour) and/or a lot of
forgeries.
But in the case of an obscure little domain that doesn't get forged
much, and which doesn't send out much email either (say it sends 1 mail
per day to the SPF-checking MTA), the compiled record would generate 1
query per day, while the uncompiled one would generate 2 queries per
day. That's double the traffic. Of course 2 queries per day is not
something to worry about, except that when the domain gets forged once
per day and the forged email gets sent to 20,000 MTAs per day, you now
have 40,000 queries for the uncompiled record, but only 20,000 queries
for the compiled record. That starts to become significant, especially
if you are your-little-obscure-domain-name-requires-query-packets-that-are-quite-big.com,
as your DNS provider may charge per Mbyte. We've seen some
instances where queries over a monthly limit were being charged $5/MB,
IIRC.
But this is a discussion on the value of using a compiler. Let's make a
separate thread of this too, if you wish to continue.
I don't know about your mail reader, but in thunderbird, the subject of
the emails is so far to the right in threaded view mode that it is off
the screen :)
> The remaining mechanisms
would have to be really expensive (in terms of number of queries and
query cachability) to make masking mean something. The typical case
(legit mail) is made worse by planning to be able to handle the atypical
case (SPF-Doom attack!). If the numbers you've been preaching are
correct, masking may be a good trade off for complex, amplifying
records.
I think my preaching is sound! ;)
But please do shut it down if you see holes in it. Otherwise, all this
preaching would be a waste of my and your time, and we all have other
things to do too, I'm sure... like sleep, in your case ;)
I still think this should be evaluated on a case-by-case basis. Masking
using exists or compiling and using a masking directive can make simple
records worse, especially if they would overflow into multiple records
because of include flattening.
That case of overflow is the only case where masking is useful, so those
are the only cases we should use to evaluate the merits of the proposed
masking ideas.
If no compiler is used,
there is no masking, or you'd have to insert it manually,
which is inconvenient and error prone.
elseif a compiler is used:
If compiled with cron, or once in a while, -flatten should
not be used. There will be left-over mechanisms whose
resulting IP addresses may change (administrative gap).
Thus, masks MUST NOT be added, since while they work initially,
they would break the record when the ISP changes their
infrastructure.
If compiled with cron, and -flatten not used, but the record
compiles into a list of IPs anyway (i.e., you list no mechanisms
that lie outside your administrative boundary), then it may
include masks if useful.
Masks can only be reliably inserted when your record is
completely in your administrative control, as above, or if
the compiler runs as part of the DNS server.
In that case, the record can be safely flattened *only if*
the TTLs of all mechanisms are respected,
including those across the domain boundary. In that case, the
record, and implicitly the mask, get regenerated every time
the IP list gets regenerated because of expired TTLs. So
the mask always reflects the current record.
Also, there's an additional condition on inserting masks.
A mask may only be inserted if all mechanisms that cannot be
compiled into an IP list (those that use the %{l} or %{i} macros)
are brought up into the top TXT record.
In other words, a mask may only be inserted if the remaining
mechanisms in subsequent redirects/includes contain only
IP lists.
end if.
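The decision rules above can be condensed into a small predicate. This is a hypothetical sketch; the flag names are mine and not part of any SPF tooling:

```python
def may_insert_masks(has_compiler: bool,
                     crosses_admin_boundary: bool,
                     recompiled_on_ttl_expiry: bool,
                     special_mechs_in_top_record: bool) -> bool:
    """True only when an inserted mask cannot go stale relative to the record."""
    if not has_compiler:
        # No compiler: masks would have to be maintained by hand,
        # which is inconvenient and error-prone.
        return False
    if crosses_admin_boundary and not recompiled_on_ttl_expiry:
        # cron-style compile across an administrative gap: the mask
        # would break when the ISP changes their infrastructure.
        return False
    if not special_mechs_in_top_record:
        # Mechanisms using %{l} or %{i} must be hoisted into the top
        # TXT record so everything below compiles to pure IP lists.
        return False
    return True

# e.g. compiler runs in the DNS server and respects foreign TTLs:
print(may_insert_masks(True, True, True, True))    # True
# e.g. cron-compiled record crossing an administrative boundary:
print(may_insert_masks(True, True, False, True))   # False
```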
I'm going to sleep on this for a little while, and see how the exists:
method can be better than the mask method.
Well, the most obvious way :) it is better is that it is implementable
as soon as yesterday, without having to change the spec, redeploy SPF
evaluators to make them mask-syntax aware, install stunt DNS servers or
upgrade DNS software. In fact, if you are using bind9, the $GENERATE
construct allows easy and quick generation of the necessary class A
records without using an SPF record compiler or outside script.
Both mechanisms are implementable now, because my mask is a modifier
that is neither required to be added, nor required to be used by
checkers. It does not *require* any changes, unless you want to be able
to specify long but convenient SPF records that can be compiled into IP
lists.
Existing SPF checkers would just ignore the mask modifier.
BTW, I've been referring to doing either of our methods as "masking".
My suggestion uses exists to generate the mask; yours uses a new
mechanism (too bad it's order dependent, otherwise it could be a
modifier and thus deployed SPF evaluators would skip over it -- although
redirect= is order dependent, isn't it?)
No, my mask is a modifier, and is not order dependent. In fact, when
masks are checked, all the masks should be compared, and only if *none*
match the incoming IP can the evaluation be aborted. If even one mask
matches the incoming IP, it means that that range is used later in the
record, so the additional queries must be done to find out whether the
IP matches exactly.
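A minimal sketch of that evaluation rule. The function name and the mask list are my assumptions; the 64/6 entry echoes the example mask earlier in the thread:

```python
# Masks are order-independent: check them all, and abort evaluation
# early only if *no* mask covers the connecting IP. If any mask
# matches, the full record must still be evaluated with its usual
# queries, because that range appears somewhere later in the record.
import ipaddress

def may_short_circuit(ip: str, masks: list[str]) -> bool:
    """True if evaluation can stop now: no published mask covers ip."""
    addr = ipaddress.ip_address(ip)
    return not any(addr in ipaddress.ip_network(m) for m in masks)

masks = ["64.0.0.0/6", "80.0.0.0/8", "216.0.0.0/8"]
print(may_short_circuit("24.1.2.3", masks))   # True: fail fast, no more queries
print(may_short_circuit("66.5.4.3", masks))   # False: 66/8 is inside 64/6, evaluate fully
```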
One obvious way is if all
forger traffic came from the same A class net all the time, _AND_ the
specific address was close enough to the servers that the mask would
miss it. [...] It's pretty unrealistic though, given all the
restrictive ifs.
Barring some obscenely large hole on ALL networks, I think past patterns
suggest that those who are most vulnerable, and will remain vulnerable,
are those who sell consumer-oriented services (because consumers tend
not to be security-oriented, and are thus a target for zombies).
Let's not forget the armies of employees who scour their mailboxes first
thing Monday morning in search of jokes (including executable jokes) to
get them past the Monday blues.
My spf-doom virus was coming from an employee of such a company.
But you are right, consumer services are also great candidates as targets.
I think overall, we can only really exclude the secured servers that
provide no user access.
If the mask being ineffective, however it is implemented, is
a concern, avoiding class As that are shared with home subscribers might
be wise. BUT, using either your method or mine, you could implement
even more restrictive masks. This could be as simple as, using my
method:
$GENERATE 2-254 $._spf 24h IN A 127.0.0.1
$GENERATE 2-254 1.$._spf 24h IN A 127.0.0.1
if you want to allow only the 1.1/16 range to be further evaluated. But
at some point, my method hits diminishing returns, because the "normal
case" is not an attack but rather legit email, and things like that only
add to the number of queries performed. So you have to weigh your
chances of getting attacked (and having to deal with the increased
load) against what should be considered "normal operation". It largely
depends on where the attacks are coming from.
So what would the corresponding SPF mask look like?
-exists:%{1ir} -exists:%{2ir} mx ?
So if you happen to be ebay.com, and you send from the 64-67 class A
networks, you'd have to publish that expensive mask to avoid
RoadRunner's cable modem users who are on nets 24, 65 and 66?
Please look at this specific case more closely. Ebay + RoadRunner make a
great study case, I think.
If there is a long-lived, sustained attack, modifying the SPF record to
include masks may be a good short-term solution (until the attacks
subside) as a way to control the load that your SPF record is putting
on receivers' and your own systems.
Masks can only be reliably inserted by a compiler. It's just not
practical to install a new DNS server that does the compiling, and not
even to install a cron job (your system may use the complicated LDAP +
DNS + SQL + YP alphabet soup), when the long-lived attack comes. And it
takes some planning before the master, authoritative DNS server for a
domain will be screwed around with, especially if you are a DNS provider
and lots of domains depend on your services.
I never meant for masks to be a reactive sh*t containment tool, but a
pre-emptive sh*t preventer tool ;)
Whatever we conclude, I really enjoy these thoughtful discussions.
Heh, incidentally, I'm starting to find them tedious, but overall
interesting -- after all, I'm up at 4am (I'm in central time)
responding, so that must mean something. :)
I appreciate your effort and thoughtfulness very much, but you
shouldn't lose sleep over this.
I myself got up early to see if there were any messages :)
Regards,
Radu