ietf-mxcomp

Some thoughts on the costs of Block vs Factored queries

2004-03-10 21:08:25

On Wed, Mar 10, 2004 at 08:34:57PM -0600, Gordon Fecyk wrote:
| 
| What of the practicality of IP+domain queries, where each e-mail causes a
| query, vs domain-only queries where maybe the domain's queried once in a
| while with larger responses?

Block records require more parsing, but subsequent lookups suffer zero
marginal DNS cost.  Factored records need slightly less parsing, but
each new negative means a new DNS lookup.

Today, a single spam run of a million messages may come from ten
thousand hosts and may forge ten thousand domains.

If we focus on the "ten thousand hosts", block records look less expensive.

If we focus on the "ten thousand domains", factored records look less expensive.

It's all a matter of perception :)

But we should keep in mind that either way the extra DNS traffic will be
cheaper:

- for recipients, than receiving the spam, and
- for forged sender domains, than dealing with the callback
  verifications and bounce messages.

There are a few ways the costs of DNS lookups can be classified:

 - initial vs marginal
 - cached vs uncached
 - positive vs negative lookup result

The first time a domain sends mail, the domain-specific record is
fetched. This is the INITIAL DOMAIN COST, which is paid once and then
amortized by resolver caching.

The next time the domain sends mail (legitimately), the cached record is
used to obtain a positive result. In most cases no additional lookup
needs to be done, so there is zero POSITIVE CACHED COST.

Suppose a forged message comes in. The lookup will be negative.

With factored records, negatives always cost one additional DNS lookup
per new negative IP, thus the NEGATIVE UNCACHED LOOKUP COST is deemed
"high".

With block records, negatives don't cost anything because the entire
positive space has been described up front.  This is a big win, at the
expense of per-user and per-IP granularity.

With a combination of block and factored records, negatives usually
don't cost anything because positives are described up front, unless the
domain has used a macro to set up a per-user or per-IP exemption.
In that case each new negative costs one additional DNS lookup. Most
domains are not expected to do this, so the cost is variable.
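
To make the three cases concrete, here is a rough Python sketch that
counts uncached DNS lookups for a stream of (client IP, sender domain)
pairs. It is a toy model, not any proposal's actual algorithm. The
function name count_lookups, the scheme labels, and the macro_domains
parameter are made up for illustration. It assumes a resolver cache that
never expires, that a block record is one lookup per domain, and that a
factored query is one lookup per new (IP, domain) pair.

```python
def count_lookups(messages, scheme, macro_domains=frozenset()):
    """Count uncached DNS lookups for (client_ip, sender_domain) pairs.

    scheme is "block", "factored", or "combined" (hypothetical labels).
    macro_domains is the set of domains assumed to publish a per-IP
    macro; it only matters for the "combined" scheme.
    Assumption: cache entries never expire.
    """
    cache = set()
    lookups = 0
    for ip, domain in messages:
        if scheme == "block":
            # One record per domain describes the whole positive space.
            keys = [domain]
        elif scheme == "factored":
            # Every new (IP, domain) pair needs its own query.
            keys = [(ip, domain)]
        elif scheme == "combined":
            # The domain record is always fetched; a per-IP query is
            # added only when the domain uses a macro.
            keys = [domain]
            if domain in macro_domains:
                keys.append((ip, domain))
        else:
            raise ValueError("unknown scheme: " + scheme)
        for key in keys:
            if key not in cache:
                cache.add(key)
                lookups += 1
    return lookups


# Twelve messages from four hosts forging three domains, arranged so
# that every (IP, domain) pair is distinct.
messages = [("192.0.2.%d" % (i % 4), "d%d.example" % (i % 3))
            for i in range(12)]

print(count_lookups(messages, "block"))       # 3
print(count_lookups(messages, "factored"))    # 12
print(count_lookups(messages, "combined",
                    macro_domains={"d0.example"}))  # 7
```

In this toy run the block scheme pays once per domain, the factored
scheme pays once per new pair, and the combined scheme sits in between
depending on how many domains opt into the per-IP macro.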