ietf-mxcomp

Re: Forging (was Re: Differences between CSV and Sender-ID )

2004-07-07 18:20:27

On Wed, 2004-07-07 at 12:31, Alan DeKok wrote:
Douglas Otis <dotis(_at_)mail-abuse(_dot_)org> wrote:
Could you define what you view to be forgery?

  http://www.ietf.org/internet-drafts/draft-irtf-asrg-lmap-discussion-01.txt

...
2.1. Unauthorized use of a domain name as "forgery"

   In the context of LMAP, SMTP "forgery" is defined as:

        SMTP Forgery: Use of a domain name in the argument fields of
        SMTP EHLO/HELO and/or SMTP MAIL FROM, by an SMTP client, when
        the owner of the domain name did not consent to that use of
        their name.
...

From this definition of forgery, it would appear you are not referring
to Sender-ID.  This is important, as Sender-ID changes the scope of
this protection.  The ASRG definition was based on an effort to reduce
mail traffic by examining RFC 2821 information.  Sender-ID uses RFC 2822
information and, as such, will not reduce mail traffic.  It also
currently allows extensions beyond the current definitions, which bears
on the concern raised regarding DNS overhead.  This change in the
definition of forgery changes the benefit equation, as there will be no
relief from accepting the mail to offset the load placed upon the DNS
server.  You should consider revising this draft before using it as a
reference.

Listing services are often more robust than a typical DNS server.

  If a DNS server holding MARID information isn't robust, then it will
most likely also fail for MX records, in addition to MARID records.

The rate at which a Sender-ID SMTP receiver queries DNS and the rate at
which an SMTP sender queries it to find an MX server differ
significantly, in both the number of queries and the serial sequence
required.  The loads and delays are not comparable, so this analogy
cannot dismiss these concerns.

  I don't see how "robustness" matters more for DNS when it contains
MARID records than when it doesn't.  Sure, more records are being
looked up in DNS, but many MTAs already do MX lookups when receiving
mail "from" a domain, in an attempt to implicitly discover the domain
owner's intent, even when there's no standard saying that they should
do this.

Scale and Scope

The complex linked nature of these records is important because it
indicates the limitations Sender-ID has with respect to scale.  An MX
record only refers to the set of hosts that "receive" mail for a domain.
As SMTP allows this mail to be relayed, there may be many times this
number of hosts that "send" mail for the same domain.  In addition, all
other domains that may originate mail on behalf of this domain are to be
expressed by this Sender-ID record set.  These records also reflect
other domains that may share these hosts.  An MX record is never
expected to be so expansive in scope, nor is it comparable to the
potential size of such a response.  This still excludes the "added"
features.
 

The nature of a listing service returns a single record in response
to a single query. Do you see this model being changed with
Sender-ID?

  This is explained in:

http://www.ietf.org/internet-drafts/draft-ietf-marid-core-01.txt

Pg. 18:

5.4 Recursion Limitations

   Evaluation of many of the mechanisms in section 5.1 will require
   additional DNS lookups. To avoid infinite recursion, and to avoid
   certain denial of service attacks, an MTA or other processor SHOULD
   limit the total number of DNS lookups that it is willing to perform
   in the course of a single authentication.  Such a limit SHOULD allow
   for at least 20 lookups.  If such a limit is exceeded, the result of
   authentication MUST be "hardError".

   MTAs or other processors MAY also impose a limit on the maximum
   amount of elapsed time to perform an authentication.  Such a limit
   SHOULD allow at least 10 seconds.  If such a limit is exceeded, the
   result of authentication SHOULD be "transientError".

   Domains publishing records SHOULD keep the number of DNS lookups to
   less than 20.  Domains publishing records that are intended for use
   as the target of "indirect" elements SHOULD keep the number of DNS
   lookups to less than 10.

It would appear that a dropped query may result in a "Transient Error"
after the message has been fully received.  It would also appear that
delegation to other domains may easily result in a "Hard Error" because
someone somewhere added another delegating indirection.  What average
number of indirections per record is required before hard errors are
produced?
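
To make the limits quoted above concrete, here is a minimal sketch of
how a receiver might enforce them.  It is an illustration only, not
code from the draft: the `records` mapping stands in for real DNS, and
the function name and the "completed" result label are hypothetical.
Only the 20-lookup and 10-second figures and the "hardError" and
"transientError" results come from section 5.4.

```python
import time

LOOKUP_LIMIT = 20      # draft: limit "SHOULD allow for at least 20 lookups"
TIME_LIMIT_S = 10.0    # draft: limit "SHOULD allow at least 10 seconds"


def evaluate(domain, records, clock=time.monotonic):
    """Walk a toy delegation graph and return (result, lookups_used).

    `records` maps a domain to the domains its record delegates to.
    It is a hypothetical stand-in for DNS, not a real resolver.
    """
    start = clock()
    lookups = 0
    pending = [domain]
    while pending:
        current = pending.pop()
        lookups += 1                       # one DNS query per record fetched
        if lookups > LOOKUP_LIMIT:
            return "hardError", lookups    # budget exhausted, or a loop
        if clock() - start > TIME_LIMIT_S:
            return "transientError", lookups
        pending.extend(records.get(current, []))
    return "completed", lookups            # hypothetical label, not from the draft
```

Under these assumptions, a chain of more than 20 delegations, or any
delegation loop, ends in "hardError" no matter which party added the
last indirection.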

  My reading indicates that multiple DNS queries may be performed to
discover the location of one record, but the intention of the draft
appears to be that records should often be obtained via one query.

What leads you to this conclusion?  I see the complexity of these
records growing as a defense against abusers once records initially
expressed as "open" lists are closed.  Closing a list requires
information that must be comprehensive.  The end result is difficult to
assess while many records are simply left "open" initially.

  The information in the record may indicate that other DNS queries
may be performed (e.g. MX).  Again, do you read the draft as saying
otherwise?

There are many possible indirections that may be asserted by the
Sender-ID record, each of which may result in a sequential query.

  Many MTAs already do multiple DNSBL lookups.  Do you see that
multiple DNSBL lookups by an MTA are substantially different/better
than one MARID record, possibly requiring multiple lookups?  Do you
see that MX lookups by existing MTAs are substantially
different/better than one MARID record, possibly requiring multiple
lookups?

Multiple RBL lookups are often done in parallel, and each results in a
single response.  Many MTAs wait for a portion of these RBL queries to
return before proceeding, and do not expect all to return.  Again, these
RBL services are likely much more robust than the typical DNS server.
As I said, there is no reason to expect the lookup of an MX record to
entail anywhere near the same amount of information as that returned by
the Sender-ID record set.
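
The difference between parallel and sequential lookups can be sketched
with a toy latency model.  The round-trip times and the quorum of three
are invented for illustration, and the function names are hypothetical.
The sketch only assumes that parallel queries finish when enough of them
have answered, while chained queries must wait for each answer in turn.

```python
def parallel_wait(rtts_ms, quorum):
    """Time until `quorum` of the parallel queries have answered."""
    return sorted(rtts_ms)[quorum - 1]


def serial_wait(rtts_ms):
    """Time for a chain where each query depends on the previous answer."""
    return sum(rtts_ms)


rtts = [40, 55, 60, 300]               # hypothetical round trips, milliseconds
print(parallel_wait(rtts, quorum=3))   # 60
print(serial_wait(rtts))               # 455
```

With these made-up numbers, one slow server costs the parallel case
nothing, but it dominates the sequential chain.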

DNS routing information is normally obtained at a connection rate as are
queries to listing services.

  I'm not sure what you mean by that.  I'm not even sure I can parse
that sentence properly.

  To take a wild guess, MARID lookups happen only when there are SMTP
connections, therefore any lookups happen at a similar "connection
rate" as queries to listing services.  The constant may be different,
but the dependency on connections is the same.

Sender-ID record sets are examined for every message, whereas a listing
service lookup happens once per connection.  Much abuse happens over
long-lived connections that send a large series of small messages.
In such a case, the DNS query rate for Sender-ID could be many orders of
magnitude higher than the rate for finding MX records or obtaining a
listing service record.
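
As a rough worked example of the per-message versus per-connection
difference, the sketch below counts queries under stated assumptions:
one listing service query per connection, and a fixed number of uncached
DNS lookups per message for Sender-ID.  The function names and the
figures of 1000 messages and 10 lookups are hypothetical, and caching is
deliberately ignored.

```python
def listing_service_queries(connections):
    """One listing service query per SMTP connection."""
    return connections


def sender_id_queries(connections, messages_per_connection, lookups_per_message):
    """Uncached worst case: every message triggers its own set of lookups."""
    return connections * messages_per_connection * lookups_per_message


print(listing_service_queries(1))        # 1
print(sender_id_queries(1, 1000, 10))    # 10000
```

How close a real receiver comes to this worst case depends on how
effective caching is for the domains involved.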

  Or, do you see Sender-ID as having an amplification problem?  e.g.
If MARID queries increased super-linearly with the number of
connections. If so, it would be a serious flaw in the proposal.

Serious indeed.

  Since the distribution of SMTP traffic to/from most sites is
non-linear across domains, and DNS information is cached, I would
expect that MARID queries would increase linearly with the number of
unique domains used in SMTP conversations, but (often) sub-linearly
with the total number of SMTP connections.

But you are not limited to a single domain with abusive mail.  Often
abusive mail uses a wildcard technique to obfuscate the sender's
identity.  As a wildcard record is superseded by real records, caching
the wildcard would be dangerous.  Our clever abuser may use a random
fictitious sub-domain.  Disaster!  Even worse, suppose an innocent
domain publishes an open list using a wildcard.  For the abuser, this
would be better than using someone else's credit card.  The hapless
domain may wonder what just happened to its network traffic.
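
The effect of fictitious sub-domains on caching can be sketched with a
toy cache keyed by the exact name queried.  This is a simplification:
it ignores TTL expiry and assumes each distinct name is a cache miss.
The function name and the example.com names are hypothetical.

```python
def upstream_queries(names):
    """Count the queries that miss a simple name-keyed cache."""
    cache = set()
    misses = 0
    for name in names:
        if name not in cache:
            misses += 1          # this query reaches the publishing domain
            cache.add(name)
    return misses


same_domain = ["example.com"] * 1000
fictitious = [f"x{i}.example.com" for i in range(1000)]  # stand-in for random labels

print(upstream_queries(same_domain))   # 1
print(upstream_queries(fictitious))    # 1000
```

Under this model the cache absorbs repeated use of one name, but it
gives no relief when every message names a new sub-domain.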

 Isn't the information for Sender-ID obtained at a much higher than
these routing functions you compare it to?

  ... much higher... what?  There's a word missing.

  I *think* you're saying that Sender-ID requires more lookups than
are currently required.  This isn't news.  For details as to the cost
of these lookups, see the list archives.  They contain posts from
others with quantitative summaries, describing the extra DNS costs of
MARID, and concluding that those costs are minimal.

These conclusions vary widely.  There also seems to be a reluctance to
review the issue of suitable scale.


On Wed, 2004-07-07 at 2:37, Alan DeKok wrote:

wayne <wayne(_at_)midwestcs(_dot_)com> wrote:
Nothing proposed on this WG is going to stop spam.

  No one is claiming it will.  Anyone claiming that MARID will stop
spam should be educated as to how it works.
<snip>

I will say that Sender-ID will not affect the amount of spam that people
must sort and delete.  Authorization and authentication of the sending
SMTP client, used together with accreditation, will affect the amount of
spam seen.  That combination will also enable enforcement, whereas
Sender-ID will not.

-Doug