ietf-mxcomp

Re: Differences between CSV and Sender-ID

2004-06-30 12:48:20

On Wed, 2004-06-30 at 10:15, Greg Connor wrote:
Thanks Andrew.  Let me try to answer by way of summarizing what I think the 
current areas of disagreement are.  I think that the differences are minor 
and that we will all be able to come to common ground soon.

--Andrew Newton <andy(_at_)hxr(_dot_)us> wrote:
    Do you appreciate the difference between a HELO check vs. a
MAILFROM/PRA/SUBMITTER check?

My understanding of the issue:

A HELO check can catch some really obvious bad cases (like spam or viruses 
using the receiver's own name) and some obvious good cases (like people we 
want to whitelist).  Also, some folks would like to protect their own name 
from being used in HELO by other MTAs (so that it won't appear in Received: 
lines and cause them to get misdirected abuse reports) -- I think there is 
general agreement that this doesn't happen often and probably isn't core to 
MARID, but it's certainly something that some domain owners want.

There is no means of stopping the problem without first identifying the
problem.  Any entity can be held accountable provided there are accurate
means available.

CSV also uses HELO to tie a reputation to the sending MTA.  This seems to 
be based on the assumption that good mail comes from good MTAs, and bad 
mail comes from bad MTAs, which some have suggested is not well-supported.

Through use of dynamic lists identifying abusive MTA servers (this does
not include major ISP servers), more than 80% of the abuse is blocked. 
From this, I must suspect this suggestion is in error or misunderstood. 
If this list could be fully vetted, it could be made available.  Hence
the efforts directed at CSV. : <

My opinion:

I think checking the HELO *alone* is not an adequate solution to the 
problem set.  I don't think CSV alone is enough to be effective against 
spam coming from big ISPs, where good and bad mail may flow from the same 
MTA.

This underestimates the potential enabled by CSV.  If CSV is acted upon,
the next battle will be to get ISPs to utilize SAP.  MAILFROM/PRA/SUBMITTER
offers NO help whatsoever in this area, however!  An accurate
accreditation system would help convince ISPs that it is in their
interest to implement SAP systems.  With SAP, the abuser can be
identified and disabled, letting ISPs protect their accreditations.

If the main thing we want out of MARID is to stop people forging mail 
apparently-from and bouncing-to our own domains, a MAILFROM/PRA check is 
going to be required.

First, you need something that can scale.  MAILFROM/PRA/SUBMITTER does
not scale.  Fenton's "Identified Mail" does.  If ensuring mail identity
were critical, then an end-to-end solution that can reasonably ensure
the identity of the sender is needed.  MAILFROM/PRA/SUBMITTER will be
defeated because SMTP security is generally lacking and policy
enforcement is too weak, which makes such assurance irresponsible.  CSV
would provide a measure of the level of SMTP security, which cannot be
judged without accurately identifying the sending SMTP server.

Do you understand the semantic differences between a CSV check on
HELO and an SPF/Sender-ID check on HELO?

My understanding of the issue:

Mechanically, CSV and SPF are both capable of checking HELO.

Because the data behind these checks is based upon wholly different
paradigms, the checks do not provide comparable results.  CSV answers:
"This domain is administratively accountable for this SMTP outbound
server."  SPF answers: "This matrix of domains employs this server."  The
administratively accountable domain remains unknown with SPF.
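
To make the difference in labels concrete, here is a minimal Python sketch
of the DNS question each check would ask.  It assumes the "_client._smtp."
SRV prefix from the CSV drafts and a TXT lookup at the mail domain for SPF;
the function names and hostnames are hypothetical, chosen only for
illustration.

```python
from typing import NamedTuple


class DnsQuestion(NamedTuple):
    name: str    # owner name to query
    rrtype: str  # DNS record type


def csv_question(helo_name: str) -> DnsQuestion:
    """CSV asks about the HELO/EHLO name only.

    Assumption: the SRV record lives under a `_client._smtp.` prefix.
    """
    host = helo_name.rstrip(".").lower()
    return DnsQuestion(f"_client._smtp.{host}", "SRV")


def spf_question(mail_from: str) -> DnsQuestion:
    """SPF asks about the domain part of a mailbox identity (TXT record)."""
    domain = mail_from.rsplit("@", 1)[-1]
    return DnsQuestion(domain.rstrip(".").lower(), "TXT")
```

The two questions use different labels and different record types even
when the same server is involved, which is the sense in which the results
are not comparable.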

In the current core Sender-ID draft the only mention of EHLO is as
follows:
: Is an SMTP client authorized to use a particular domain name in its
: SMTP EHLO command?  [CSV] attempts to answer this question.  It
: suffers from the fact that the EHLO name has a tenuous relationship,
: at best, with the contents of any mail message.
 
Speaking only about the mechanisms themselves, here is a quick overview:
 - If the HELO name is different from the base domain people use for email
   addresses:
    - you need an SPF record for each MTA name (HELO name) in addition to
      the mail domain's SPF record.

Why add this back into the "core" proposal?  CSV and Sender-ID are
totally independent functions that need totally different information.

 - CSV is similar - you need a SRV record for each full name used in 
HELO - though since it doesn't check any other identities, there won't be a 
CSV record for the mail domain (unless it is a negative record saying 
"nobody may use this domain for HELO")

 - If the MTA name is also used as a HELO name for one of the MTAs:
    - In most cases the existing SPF record should be sufficient, since
      it probably includes that MTA.

The fact that the MTA may exist as a subset of the SPF information
overlooks that this information is obtained using different labels and
that being within a subset of the SPF list offers nothing useful for
accreditation. Keep CSV "as is" and keep this check excluded from
SPF/Sender-ID.  Treat these functions as orthogonal.

(If not, for example if the machine calling itself "example.com" is a
web server and not one of the MX mailers for the domain, then the
bounces or notifications coming from this machine are already blocked
by SPF).

If the SMTP server does not belong to the domain, then NO mail will be
accepted per CSV.  Accreditation could offer extended permissions to
enable a greater diversity of mail to emerge, but as SPF/Sender-ID does
not scale, a comprehensive list of allowable domains over any SMTP
server may require some other out-of-band system, and not DNS.  A
conventional method would be to publish a _service._tcp.domain SRV
record that would locate such an out-of-band service.
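
As a small illustration of that naming convention, the sketch below builds
the owner name of such an SRV record in Python.  The helper name and the
"accredit" service label are hypothetical; no such service name is defined
by CSV or by this message.

```python
def srv_owner_name(service: str, proto: str, domain: str) -> str:
    """Build the `_service._proto.domain` owner name for an SRV lookup."""
    labels = [
        "_" + service.lstrip("_").lower(),
        "_" + proto.lstrip("_").lower(),
        domain.rstrip(".").lower(),
    ]
    return ".".join(labels)


# Hypothetical service label, used only to show the shape of the name.
example_name = srv_owner_name("accredit", "tcp", "example.com")
```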

 - If the same name is used for both HELO and email(_at_)domain, the SPF 
record should be a union of the two policies.

This overlooks the purpose of the CSV mechanism entirely and overlooks
the protections offered by a much simpler record.  Again, these two
schemes do not offer the same information.  Even if CSV may be seen as
a subset of SPF, its record is located with a different label and must
contain different information (less, far less information).

All machines able to use this name in HELO and all machines able to
send outgoing mail from this domain should be listed.  It has been
proposed that SPF add a macro that sites could use to post a different
HELO policy than its PRA policy, but in practice most users will just
go with the merged policy (if they are different at all)

Why preface the use of a simple mechanism with the adoption of an
incredibly complex scheme that is yet to be fully documented in any
Internet Draft?  Now you wish to have a macro to post this information?
Why add such a significant amount of time to an issue that can and
should be addressed separately?

Semantically, there is some difference in understanding between what
the CSV check means and what the SPF+HELO check means.  The big difference
is that CSV's supporters have attached a large amount of significance to
the record.  Since it is ONLY used for HELO names, it seems a lot clearer
that it means "These are the servers I control and I'm the best domain to
address your problems to".  This is more about what you read into the
result when you do the check, and not really a factor of the mechanism.

It would seem you have completely overlooked the need for a valid
identity for accreditation.

CSV doesn't have an "unknown" mode, and I think some CSV supporters have 
said this is a good thing.

CSV does allow "no statement" to be made regarding the sending SMTP
server.  How this gets interpreted is left to those that make policy.

My opinion on this:

In the case where you want to permit users to send outgoing mail from
another domain, but you don't control their servers and you don't
completely trust them, I would strongly recommend that admins use the
"unknown" status (like ?include or ?ptr).  If you add to your SPF record
using +include:comcast.net, you are in effect saying "Anyone at
comcast.net can claim to be me and the mail is guaranteed not forged".

It would be better to use ?include:comcast.net or ?ptr:comcast.net. That 
way the mail from those domains is still allowed, but not "guaranteed" to 
be from you.  If the result comes back unknown, you can't attach reputation 
or whitelisting to that transaction, you just have to proceed in "legacy" 
mode.
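
The effect of the ? and + prefixes can be sketched in a few lines of
Python.  This is only an illustration: the result names follow the later
published SPF specification, where "?" yields "neutral" (the "unknown"
status described above), and the helper names are hypothetical.

```python
from typing import Tuple

# Result produced when a term with the given qualifier matches.
QUALIFIER_RESULT = {
    "+": "pass",
    "-": "fail",
    "~": "softfail",
    "?": "neutral",  # the "unknown" status discussed in the text
}


def split_qualifier(term: str) -> Tuple[str, str]:
    """Split an SPF term into (qualifier, mechanism); default qualifier is '+'."""
    if term and term[0] in QUALIFIER_RESULT:
        return term[0], term[1:]
    return "+", term


def result_if_matched(term: str) -> str:
    """Result the check returns if this term is the one that matches."""
    qualifier, _ = split_qualifier(term)
    return QUALIFIER_RESULT[qualifier]


def may_attach_reputation(term: str) -> bool:
    """Only a 'pass' lets a receiver tie reputation to the domain."""
    return result_if_matched(term) == "pass"
```

Under these assumptions, "include:comcast.net" and "+include:comcast.net"
both produce a pass, while "?include:comcast.net" lets the mail through
without vouching for it.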

That is an important part of why I believe most domains can use the same 
SPF record for HELO purposes and for PRA/MAIL FROM.  I guess trying to 
mingle the two modes without explaining the ? vs. + is probably confusing 
to people :) but I still believe that the same tool can be used for both 
jobs.

Why mingle two records that are naturally accessed using different
labels anyway?  You have not justified this merger.  Saying the SPF
syntax can become more obtuse to _help_ differentiate the "group" that
_may_ be accountable does not improve the use of CSV.  CSV is designed
to be resilient and lightweight.  

If people are really worried about others forging their name in HELO, but 
don't want to apply stronger protections to their PRA/MAIL FROM with the 
same name, there will be a macro-based way to give a stronger HELO policy 
than PRA/MAIL FROM policy, but I think HELO forgery is not enough of a 
problem for this to be interesting to most users.

Add another layer to the cake, or is it stuff another nut into the
cheek? : )

Is it clear to you that CSV has definite security advantages
over SPF/Sender-ID?


My understanding of the issue:

The recent discussion over possible DDOS involving SPF doesn't really have
to do with HELO checking specifically.  CSV supporters assert that its lack
of support for redirection, includes, macros or exists mechanisms is a good
thing.  There is general agreement that the smaller problem of HELO
checking can probably be solved with a smaller set of tools, but folks
differ on whether they want two sets of tools, or one tool capable of
both jobs.

Leatherman or hammer? : )

Therefore we should separate the issue of "whether SPF has problems" 
completely from the question of "whether CSV has something SPF doesn't". 
If SPF has problems, they should be addressed.  If CSV has features SPF 
doesn't have, the group needs to decide whether to advance CSV as well, or 
try to adapt SPF to include those features.  There is some advantage to 
end-users if we are able to produce a single RFC that can address both 
problems.

These two problems have significantly different time lines.  There is no
technical merit in combining these two functions when even the current
MARID "core" document has completely excluded consideration of the EHLO
domain.

My opinion:

CSV may have a better security story, but I believe this is a direct result 
of deciding to include fewer features and less flexibility.

Regarding DDOS concerns, I think they can be solved by placing some limits 
on the amount of recursion possible and the total number of queries needed 
per mail message, and that should satisfy most concerns.  We should 
probably also do a bit more testing to see what such an attack might do, at 
the very least because then we can recognize it when it happens and we know 
what to do :)
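
A rough Python sketch of such limits follows.  It walks include: terms in
a toy in-memory record table rather than real DNS, and the limits of 10
queries and depth 5 are arbitrary placeholders, not values taken from any
draft.

```python
from typing import Dict, List


class LookupLimitExceeded(Exception):
    """Raised when evaluating one message would cost too many queries."""


def count_queries(domain: str, records: Dict[str, List[str]],
                  max_queries: int = 10, max_depth: int = 5) -> int:
    """Follow include: terms, enforcing a depth limit and a query budget."""
    used = 0

    def visit(name: str, depth: int) -> None:
        nonlocal used
        if depth > max_depth:
            raise LookupLimitExceeded(f"include depth exceeds {max_depth}")
        used += 1  # one query to fetch the record for `name`
        if used > max_queries:
            raise LookupLimitExceeded(f"more than {max_queries} queries")
        for term in records.get(name, []):
            if term.startswith("include:"):
                visit(term[len("include:"):], depth + 1)

    visit(domain, 0)
    return used
```

With both limits in place, a record that includes itself stops after a
bounded number of queries instead of looping, and a receiver can log the
exception as a sign of a possible attack.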

I think there is enough consensus in the group that we need to protect PRA 
and/or MAIL FROM, and that HELO is of secondary importance.  Not everyone 
agrees with this, but I think a majority of folks think that HELO checking 
by itself is not enough.  So, if we are going forward with PRA/SenderID or 
something like it, it should be easy enough to adapt it to HELO checking as 
well.  I don't believe there is enough support in the group to adopt CSV as 
a second proposal, if SenderID can be easily modified to do both jobs.

If we separate the questions of "SPF is bad because X" and "CSV is great
for Y", we will probably decide that SenderID needs some minor changes to
be fixed, and that CSV does not provide enough incremental value to
proceed as an independent proposal.
<snip>

Here we differ greatly in opinions.  I do not think that CSV by itself
is a complete solution.  CSV with accreditation is a potent solution
that has much greater promise.  I have great concerns regarding
expectations of using DNS to publish a correlation of 2822 identities
with all servers allowed to carry such messages.  DNS was never designed
to have a simple query offer such a comprehensive result.  This is
venturing into regions where there are many looking for a chink in the
armor.  Why expose yourself to harm if it is not needed?  A security
concern offers great justification for keeping these two proposals for
two separate problem sets, separate.

-Doug