spf-discuss

Re: Re: DNS load research

2005-03-22 13:26:46


Andy Bakun wrote:
> On Tue, 2005-03-22 at 12:46 -0500, Radu Hociung wrote:


>> I like the idea of weights, but it is a purely academic exercise, because at run-time it is difficult or impossible to calculate the real cost of evaluating a record. The checker can try to estimate it, but the estimate will probably not be nearly accurate enough to be useful. This is because of DNS caching.


> I was not suggesting that SPF evaluators determine weights at runtime. I
> was suggesting that the weights be fixed, relative to each other, as
> part of the spec.  The weight of a record would be easily calculable
> without needing to actually evaluate it.  Resolving MXs to IPs takes X
> amount more work than resolving As.  Count resolving MXs, however
> complex they may be (an acceptable average is what would need to be
> determined), as more than resolving As.

That's great, we're on the same page. This idea of weights is a great study tool, as it allows us to compare the relative cost of two queries that otherwise look alike. (a:%{i}.domain.com and a:something.domain.com have very different traffic costs)
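To make that difference concrete, here is a rough sketch in Python (the %{i} expansion is much simplified, and the sender addresses are invented for illustration):

def expanded_query(mechanism, sender_ip):
    """Expand the %{i} macro the way a checker would (much simplified)."""
    return mechanism.split(":", 1)[1].replace("%{i}", sender_ip)

senders = ["192.0.2.1", "192.0.2.2", "198.51.100.7"]

for mech in ("a:something.domain.com", "a:%{i}.domain.com"):
    # Collect the distinct DNS names this mechanism would generate.
    names = {expanded_query(mech, ip) for ip in senders}
    print(mech, "->", len(names), "distinct query name(s) for",
          len(senders), "messages")

The first name can sit in a resolver cache for its whole TTL; the second is a fresh name for every sending IP, so the cache almost never helps.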

>> It is therefore difficult if not impossible for the checking code running on the MTA machines to estimate whether the query it is about to make will result in a packet sent to the backbone or whether it will be served from the cache.

> Load is the reason that DNS caching and expirations exist.  You've just
> shot down your own position of needing a significantly lower limit.  If
> you're going to be processing a lot of email, figure out how to make
> your DNS cache larger to avoid forced expirations.

I wish I had shot down my own argument, because then I'd be happy to allow a limit of 40. But I don't think I have. Here's why:

In the case I showed, the DNS cluster internal to the ISP's network will still have to be upgraded by a factor of 2 (if my theory on the incremental load due to SPF checking proves true). This is a real cost. Also, the number of query packets sent to the Internet will grow by the same factor (the ratio of Internet traffic to internal traffic stays constant; that is what caching does). So both the backbone traffic and the size of the DNS cluster would grow together.

That was the case of a large ISP, which realizes great benefits from a local DNS cache. The ISP sees millions of forgeries from the same domain, so caching yields real savings. But even with all this efficiency, its costs will still double.

But for small companies and sites, who see perhaps 2-3 forgeries/domain, caching does not have the same ROI (return-on-investment) as it does for the ISPs. Thus the incremental cost for small companies is much larger than the incremental cost for ISPs (comparing percentage cost increase).

I'm taking the 2x factor based on my observations, at a time when SPF publishers are still very rare, and SPF checkers are rarer still. I think more work is needed to calculate the realistic incremental cost. Perhaps someone else can take up that task? I don't think it's work that can be done by committee, though the results can be analyzed and commented on by the community.
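To give that work a starting point, here is the back-of-the-envelope shape of the calculation as I see it (the five-queries-per-check figure and the message counts are assumptions for illustration, not measurements):

def upstream_queries(messages, forgeries_per_domain, queries_per_check=5):
    """Queries that escape the local cache: roughly one full SPF
    evaluation per distinct forged domain, instead of one per message."""
    distinct_domains = float(messages) / forgeries_per_domain
    return distinct_domains * queries_per_check

# Large ISP: millions of forgeries share each domain, so the cache amortizes.
print(upstream_queries(1000000, 1000.0))   # 5000 upstream queries per 1M messages
# Small site: 2-3 forgeries per domain, so almost nothing is amortized.
print(upstream_queries(100, 2.5))          # 200 upstream queries per 100 messages

The absolute numbers mean nothing; the point is that the ISP amortizes each record over many messages while the small site pays close to the full per-message cost, which is why its percentage increase is larger.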

>> The average cost of the backbone traffic is proportional to the complexity we allow SPF to have. It can be estimated using weights for the different queries, but this is a design-time estimate, as at run-time it cannot be done reliably.


> Again, I was not suggesting that the "estimate" happen at run-time.

Perfect.

>> I still think the macros can cause far more backbone traffic than the mechanisms themselves, because macros are much less likely to be cacheable.


> This is why I gave exists: a higher weight.  There is greater potential,
> as you've already demonstrated, that macros will explode into a large
> number of (one-shot) queries that might force expiration of more useful
> queries from your DNS cache.  But what I forgot is that someone who
> wants to do harm can use macros in any of the mechanisms, thereby
> causing the same load.  And that is one of the reasons why I've changed
> my mind about using non-count-of-query weights.  Although, I suppose if
> you counted the use of a macro in a query as an additive to the weight,
> that would help.

> But this makes the limiting formula more complex, which we've already
> reached consensus that we don't want :)

Correct, we don't want a complex formula, but the attack strategy I proposed deserves some thought, I think.
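For reference, the additive idea Andy mentions above could be sketched roughly like this at design time (the weights, the surcharge, and the example records are placeholders, not values anyone has proposed):

BASE_WEIGHT = {"ip4": 0, "ip6": 0, "a": 1, "mx": 3, "ptr": 4,
               "exists": 2, "include": 2}
MACRO_SURCHARGE = 3   # placeholder: macros defeat caching, so they cost extra

def record_weight(record):
    """Design-time weight of an SPF record: fixed cost per mechanism,
    plus a surcharge for every mechanism that carries a macro."""
    total = 0
    for term in record.split()[1:]:            # skip the v=spf1 version tag
        mech = term.lstrip("+-~?").split(":", 1)[0].split("/", 1)[0]
        if mech in BASE_WEIGHT:
            total += BASE_WEIGHT[mech]
            if "%{" in term:
                total += MACRO_SURCHARGE
    return total

print(record_weight("v=spf1 a mx include:partner.example -all"))    # 6
print(record_weight("v=spf1 exists:%{i}.rbl.example mx/24 -all"))   # 8

It is still just a count made when the record is examined, not a run-time estimate.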

History has proven that if an opportunity for abuse exists, it will be used.

What will we do if it becomes a fad to list the domains we don't like in our SPF record, with a - prefix? It could be a geeky way to advertise our beliefs.

Or it may be seen as a patriotic act, like "list all those Chinese domains that bombard us with viruses".

Would this not degenerate into WWWWI? (World-Wide-Web-War-I)

I can and do wish it will not happen, but wishing is not a method. Let's see whether it's possible, whether it's likely, and what the possible outcome would be.

In fact, we also need to discuss the various ways that SPF and SPF-related applications can be abused. Perhaps in a different thread?
I think the current draft is very thin on this topic.


Radu.

