Re: DNS Loading Comparison

Todd,

Thanks for your very informative reply, and again, sorry for using rr.com as the hypothetical example. You have one of the best setups I've seen for a large domain, so that is why I keep coming back to it.

My scenarios below assume the perpetrator's objective is spamming, not DoS. A DoS attack might be devilishly different.

The one big departure from your current setup is my Scenario E2, which assumes you are using CSV, not SPF. With CSV you must authorize each and every server with its own SRV record. Hence, my assumption of 1000 unique DNS records filling the cache at each of 150,000 MTAs (including negative responses). I have also assumed that a query for SRV records from a purported server like serv138.austin.rr.com would be handled by a slave at rr.com, or the number of queries would actually be even larger.


Gotta go now.  I'll have more later.

--
Dave
************************************************************     *
* David MacQuigg, PhD      email:  dmquigg-spf at yahoo.com      *  *
* IC Design Engineer            phone:  USA 520-721-4583      *  *  *
* Analog Design Methodologies                                 *  *  *
*                                   9320 East Mikelyn Lane     * * *
* VRS Consulting, P.C.              Tucson, Arizona 85710        *
************************************************************     *

At 07:42 AM 4/13/2005 -0400, Todd Herr wrote:

On Tue, 12 Apr 2005, at 19:36, David MacQuigg wrote:

> It looks like SPF has the potential of being 1000X more efficient than CSV
> in both number of DNS queries and cache sizes.  I tried to bring this up on
> the CSV mailing list, but they don't want to hear it.  So I bring it up
> here, in hopes of getting some check on my assumptions.
>
> Assume:
>        2,000 zombies, widely distributed
>       50,000 emails from each zombie
> 100,000,000 recipient addresses, widely distributed
>      100,000 recipient domains
>            3 hops from sender to receiver
> Then:
>    2000 senders --> 3 hops -->  100,000 receivers
>       approx. 150,000 MTAs needing to authenticate
>
> Scenario E1:  All DNS queries to rr.com
>      Total 150,000 queries, cached for 48 hours
>
> Scenario E2:  DNS queries to 1000 servers, widely distributed
>     Typical server:  serv138.austin.rr.com
>     150,000 MTAs x 1000 servers = 150,000,000 queries !!
>     Client caches are 1000X larger, and 1000X less
>       likely to hit.
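
As a check on the quoted arithmetic, here is a minimal Python sketch that reproduces the E1 and E2 query counts. The figures are the ones given in the quoted assumptions; the variable names are illustrative only, and the model (one query per MTA per distinct record, then cached) is the simplification the scenario itself uses.

```python
# Figures from the quoted scenario; names are illustrative, not from any tool.
zombies = 2_000
emails_per_zombie = 50_000
mtas = 150_000           # approximate MTAs needing to authenticate
distinct_servers = 1_000  # distinct server names assumed in Scenario E2

total_messages = zombies * emails_per_zombie

# Scenario E1: every MTA looks up one record (rr.com) and then caches it.
e1_queries = mtas * 1

# Scenario E2: every MTA looks up one record per distinct server name.
e2_queries = mtas * distinct_servers

print("total messages:", total_messages)
print("E1 queries:", e1_queries)
print("E2 queries:", e2_queries)
print("E2/E1 ratio:", e2_queries // e1_queries)
```

The 1000X ratio falls directly out of the assumed number of distinct server names, so it is only as good as that assumption.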

Since you're using rr.com, I'll respond.

I'm not sure I follow your math here.  While I disagree with your
assumption that there would be 100,000 recipient domains as the
target of these widespread zombies (I think the number would be
much, much smaller, because there's more bang for the buck in
trying to hit 1,000,000 mailboxes at each of 100 large domains
rather than 1,000 mailboxes at each of 100,000 random domains),
I'll go with that bit for now.

First:

> Then:
>    2000 senders --> 3 hops -->  100,000 receivers
>       approx. 150,000 MTAs needing to authenticate

I think this assumption to be wrong; I think the reality would
be:

>    2000 senders -->  100,000 receivers
>      at most 100,000 MTAs needing to check SPF records

Zombies are more likely to attempt direct connections to the
target domains, rather than routing their traffic through the
local SMTP servers.  Were they to all be Road Runner customers,
and were they all to route through our SMTP servers, they'd be
limited to a total of 2,000,000 recipients, as we limit each
residential IP address to sending to 1,000 recipients per day.
(Our outbound servers don't do SPF checking, and they'd only
represent one hop, not three, in this chain, anyway.)
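
The relay cap above can be worked through in a few lines of Python. This is only a sketch using the numbers in this thread (2,000 zombies, 1,000 recipients per residential IP per day, 100,000,000 intended recipients); the variable names are made up for illustration.

```python
# Numbers taken from the thread; names are illustrative only.
zombies = 2_000
recipients_per_ip_per_day = 1_000   # per-residential-IP daily limit
intended_recipients = 100_000_000   # from the original scenario

# Most recipients reachable in one day if every zombie relayed
# through the provider's outbound SMTP servers.
relay_cap = zombies * recipients_per_ip_per_day
share_of_target = relay_cap / intended_recipients

print("daily relay cap:", relay_cap)
print("share of intended recipients:", f"{share_of_target:.0%}")
```

Under those assumptions, relaying through the outbound servers reaches only a small fraction of the intended recipients in a day, which is why direct connections are the more likely path.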

So, we presume that all 100,000 MTAs do SPF checking, and that the
email connections would not be rejected out of hand for reasons
other than SPF records, before the SPF check was even necessary.

For instance, all Road Runner residential customers now have IP
addresses whose PTR record ends in 'res.rr.com'.  This PTR record
indicates that this is residential dynamic space, and while our
AUP doesn't forbid our customers from running servers in our
dynamic space (cable modems provide a persistence of IP address
assignment due to their always-on nature that can last for
months, meaning that 'dynamic' is a relative term) there are
sites that will reject any connection from our residential space.
This rejection would take place before the "MAIL FROM" part of
the SMTP transaction, thus eliminating the need for SPF checking.
Moreover, zombie infected computers tend to end up on public and
private block lists, so even if the target sites don't reject the
connection based solely on its being from dynamic space, we must
presume that some number of these connections would be rejected,
again before the "MAIL FROM" part of the transaction, due to
their being present on one or more block lists.

But let's presume that all 100,000 domains aren't rejecting from
dynamic space, and that they're not using any block lists that
list any Road Runner IPs with zombies.  (We're going to assume
that the three dozen or so Road Runner-managed SMTP servers will
not be infected by zombies, since they don't run the OS that
tends to be infected with zombies.)

Now, there are roughly 75 legitimate DNS domains ending in
rr.com, including the domain rr.com.

In your scenario E1 above, I'm presuming that your assumption is
that each of the 50,000 emails from each zombie would purport to
be from "user@rr.com".  Each of the 100,000 target domains would
issue, on the first message, a query to one of the four
authoritative servers for rr.com (assume 25,000 queries per
server; it represents well under ten seconds' worth of work for
each of them) for the SPF record for rr.com.  This SPF record has
a TTL of one day, so each of the 100,000 target domains would
cache the record for one day, and all remaining SPF queries would
be dispersed to the 100,000 local (sets of) DNS caching servers
that are supporting the inbound mail servers at those domains.
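
The E1 load on the authoritative servers works out as follows. This is a rough Python sketch assuming the queries split evenly across the four servers; the implied query rate is simply what "well under ten seconds" would require, not a measured figure.

```python
# Numbers from the paragraph above; an even split across servers is assumed.
target_domains = 100_000
authoritative_servers = 4
ttl_hours = 24  # SPF record TTL of one day

# One uncached lookup per target domain, spread over the servers.
queries_per_server = target_domains // authoritative_servers

# Rate each server would need to sustain to finish within ten seconds.
implied_min_rate = queries_per_server / 10

print("queries per server:", queries_per_server)
print("implied rate (queries/sec):", implied_min_rate)
print("repeat queries from a warm cache within", ttl_hours, "hours: 0")
```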

In your scenario E2 above, let's presume that each zombie sends
email from 'user@foo.rr.com'.  Now, for those instances of
'foo.rr.com' where 'foo.rr.com' is not a valid DNS domain, I
presume that the SMTP transaction would be terminated by the
receiving host during the MAIL FROM part *before* SPF checking is
done; most modern MTAs have the capability of validating the
sender domain mentioned in the MAIL FROM part of the transaction,
and any that will reject due to its not existing would do this
before checking SPF records, one presumes, or at least as a side
effect of the check.  That is, if a query for the TXT record for
a domain returns NXDOMAIN, the mail would be rejected not on SPF
grounds, but rather on 'domain does not exist' grounds.  (Do MTAs
that check SPF do a separate query for the SPF record at this
point, or how is it typically handled?)

But let's presume that all 50,000 messages from each zombie are
sent with 'user@foo.rr.com' being valid rr.com DNS domains.  The
75 valid domains have their DNS authority spread roughly equally
among nine data centers, eight of which have two authoritative
servers, with the ninth having four, but let's call it two each,
to make the math easy.  Let's presume worst case, with each of
the 100,000 target domains getting email "from" each of the 75
valid domains.  Let's go for the worst case, and say that none of
the 100,000 target domains have any of the 75 SPF records
currently cached.  This means that our authoritative servers
would have to absorb 7.5 million queries (416,667 per each of the
18 servers; still less than one minute's work per server); the
remainder of the SPF queries would then again be dispersed to the
local (sets of) DNS servers supporting the mail servers at the
target domains, because they would be caching the SPF records.
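
The worst-case E2 numbers can be checked the same way. This Python sketch assumes an even split of queries across servers and uses only figures from the paragraph above; dividing 7.5 million queries by 18 servers gives roughly 416,667 per server. The 20-server line uses the actual count described (eight data centers with two servers plus one with four).

```python
# Numbers from the paragraph above; an even split across servers is assumed.
target_domains = 100_000
valid_rr_domains = 75

# Worst case: every target domain looks up every valid domain once, uncached.
total_queries = target_domains * valid_rr_domains

servers_simplified = 9 * 2        # "call it two each"
servers_actual = 8 * 2 + 4        # eight centers with two, one with four

per_server_simplified = total_queries / servers_simplified
per_server_actual = total_queries / servers_actual

# Rate needed per server to clear the simplified load within one minute.
rate_for_one_minute = per_server_simplified / 60

print("total queries:", total_queries)
print("per server (18 servers):", round(per_server_simplified))
print("per server (20 servers):", round(per_server_actual))
print("queries/sec to finish in 60 s:", round(rate_for_one_minute))
```

Whether that load fits within a minute depends on the per-server query rate, which this thread does not state directly.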

Am I missing something fundamental about SPF here?  Isn't SPF
record checking all about the sender domain (i.e., the domain
mentioned in the MAIL FROM portion of the SMTP transaction) and
whether or not the connecting IP is allowed to send email for
that domain?  That is, isn't it the MAIL FROM part of the
transaction that triggers the SPF check, not the EHLO/HELO part?

--
Todd Herr
Senior Security Policy Specialist/Postmaster      V: 703.345.2447
Time Warner Cable IP Security                     M: 571.344.8619
therr(_at_)security(_dot_)rr(_dot_)com                           AIM: RRCorpSecTH