ietf-mxcomp

Re: Differences between CSV and Sender-ID

2004-07-02 03:35:36

Hi Dave,

I'm not exactly pleased to see my post carved up, with a number of 1-3 line responses added for every 1-3 lines I wrote. I feel it is a style that encourages fighting, bickering, and disagreement rather than mutual understanding and consensus-building. However, I don't believe at all that you meant to be rude or anything. Perhaps you interpreted my message as an attack against CSV and reacted defensively?

Anyway, you raise some good points here, so I will attempt to reply as concisely as I can. It's going to appear disjointed but I will try not to break things up too much further than they already are.

--Dave Crocker <dhc(_at_)dcrocker(_dot_)net> wrote:


Greg,


GC> A HELO check can catch some really obvious bad cases (like spam or viruses
GC> using the receiver's own name) and some obvious good cases (like people we
GC> want to whitelist).

With respect to CSV, I think this needs to be phrased quite
differently, even though the semantics will probably seem similar, or
at least not contradictory. The reason for this need is that I think it
leads to a very different perspective on the implications of a CSV
mechanism versus an SPF-like mechanism.

HELO makes an assertion about the operation and accountability of the
MTA. There is quite a bit of history and current use of services that
vet sending SMTP clients and their network operators.

A HELO mechanism check can be used to produce a domain-name based
codification of such checking, rather than requiring that white/black
lists be maintained in terms of IP Addresses. The benefit of having a
domain name base involves all of the reasons we all like domain names
better than IP Addresses, for use by humans.


Without commenting on mechanisms, I totally agree with your explanation of HELO and its significance. I was attempting to keep it a bit simple for the readers. The two paragraphs above are a good explanation as to why HELO is significant, and why checking it (by whatever mechanism) is desirable. All is well.


GC> CSV also uses HELO to tie a reputation to the sending MTA.

The concern about accreditation (of which "reputation" is a subset) is
rather interesting, here. What folks seem to be missing is that ALL
mechanisms that involve acceptance or rejection based on a name or
address have an accreditation component. Accreditation is, in fact, the
acceptance or rejection policy engine.

CSV merely specifies two external standards for such a mechanism.

So, yes, a mechanism that only seeks to detect forgeries does not have
an accreditation mechanism.  But forgive me if I am wrong:  I thought
folks were interested in detecting and preventing spam, and spam is
very, very much about accreditation, not forgery.

Forgery is a current symptom, rather than a core aspect of spam and
virus sending.  Eliminate forgery and there will still be masses of
spam and virus sending.

I had rather hoped we were trying to get at core issues of fighting
spam, what with the scale of the problem and the cost and delay
inherent in any standards effort.


My statement wasn't meant as a negative. I actually agree with what you said here. CSV hangs reputation/accreditation on HELO, SPF chooses to hang it on another identity. I certainly didn't mean to imply that accreditation is not important.


GC> This seems to be based on the assumption that good mail comes from good
GC> MTAs, and bad mail comes from bad MTAs, which some have suggested is not
GC> well-supported.

"Some have suggested" is language that applies to any interesting
topic, for every possible point of view on the topic. Hence, it does
not carry any useful information. (Yeah, I am saying that a bit
sharply. It is a pet-peeve of mine about so-called news reporting and
I really hope the content-free utterance does not seep into serious
technical discussions.)

To move this particular point into something that might be productive,
please refer to the thread "who are we accrediting?" and note John
Levine's posting.  I'll post a response to it.


In the part you quoted, I was trying to point out one area of disagreement, without actually taking sides (I explained my own opinion after the "My opinion" tag), so my apologies for the vagueness.

So, to be clear, I am suggesting that this assumption is not valid or useful. I think the reputation of the MTA is often interesting, but certainly not enough in itself to judge the quality of the mail. But, after reading your message I am starting to think that you don't believe this assumption either, that there are only "good" and "bad" MTAs. In that case, this disagreement is not directed at you, but at other CSV supporters (Matthew and Doug) who have been suggesting that checking HELO against a reputation is pretty much all you need and other proposals that check other identities are worthless, doomed to failure, or both.




GC> I think checking the HELO *alone* is not an adequate solution to the
GC> problem set.

I agree.  RFC2822 author/sender based accreditation is also going to
be needed.  The nature and form of that accreditation is a different
question.


Right...  that's what I was trying to say too :)


GC> If the main thing we want out of MARID is to stop people forging mail

I do not have access to the working group charter as I write this, but
I sure hope that forgery is not the primary concern of the working
group.

Otherwise, there is a rather large community of email users and
providers who are going to be rather upset that we spent all this time
and did nothing that is intended to reduce spam.

On the other hand, I could see how "DNS-based MTA authentication" could
cause one to think that forgery is the focus.


Wow, we're interrupting mid-sentence now, I see :) I won't spend much time on this one other than to say: 1. I want to stop spam too, and I think stopping forgery is a necessary but not sufficient step. 2. I honestly believed that stopping forgery was the point of the WG, and that stuff that stops spam by means other than stopping forgery would be ruled out of scope. 3. It looks like you agreed with the important part of my sentence anyway :)


GC> apparently-from and bouncing-to our own domains, a MAILFROM/PRA check is
GC> going to be required.

Some sort of rfc2822 author/sender accreditation is going to be
required... in some cases.


Agreed :)


GC> Mechanically, CSV and SPF are both capable of checking HELO.

Mechanically, CSV and SPF are both fruit. But let me tell you, you do
not want to think about or use durian the same way you think about and
use oranges.

However, your statement highlights a deeper problem in most of the
efforts to discuss CSV and SPF differences:  Such efforts are almost
entirely tied to mechanical and syntactic issues and do not focus on
underlying concepts.


Right... That actually IS what I mean here -- I mean to separate the mechanics of each proposal from the underlying concepts. The assertion I was trying to test is whether the mechanism SPF uses to test PRA, MAIL FROM and HELO is capable of doing the same things the CSV mechanism does.

If it cannot for strictly *mechanical* reasons, I would like to understand what they are. So far the answer whenever I ask this is "Well, you COULD use SPF TXT records instead of SRV records, but why would you want to?" If it's possible to present an end user with one tool that has two applications, that might be a worthwhile goal. Speaking only of the *mechanism* I don't see that SRV records have an inherent advantage over TXT records, or that the underlying concepts of CSV depend on SRV records.


CSV and SPF are fundamentally different paradigms.

    CSV vets an MTA's traffic.

    SPF vets an RFC2822 author/sender's message.

They are orthogonal informational-theoretic areas of consideration.

Where the confusion comes in, of course, is that SPF involves the MTA,
albeit through an indirection.

Let's try for some concise descriptions of the two paradigms:


SPF:

    Per-message MTA path validation, based on Author/Sender
    authorization and accreditations.

CSV:

    MTA traffic validation, based on MTA operator authorization and
    accreditation.


SPF vets an MTA's sending a single message.  It accredits the MTA
based on the RFC2822 author/sender.  While introducing a
path-dependency into the mechanism, it simply defers the hard
question, namely accrediting the author/sender.

CSV vets an entire MTA session.  It accredits the MTA based on the
operator of that MTA.


I don't really agree that CSV and SPF are fundamentally different paradigms. They are different, but I don't think fundamentally so, and I don't think either of them represents a "paradigm" really.

I think of SPF not as a great idea, but as a collection of great ideas. Some of these are:
- A mechanism that maps (domain name, IP) onto (pass, fail, unknown)
- Application of this mechanism to MAIL FROM, to vet a message path (or partial path, when forwarders use SRS)
- Application of this mechanism to PRA, to vet a message path (or partial path, when forwarders use recommended headers)
- Accreditation can be applied to the domain of any ID that returns a pass result.
- An ID that returns a fail result should be treated as highly suspect and probably rejected. An ID that returns an unknown result should not be used to judge a mail as good or bad, and the receiver should fall back to other methods.

CSV is also made of multiple great ideas, such as:
- A mechanism that maps (domain name, IP) onto (allowed, disallowed, no_info)
- Application of this mechanism to HELO, to vet an MTA
- Accreditation can be applied to the MTA based on its name, if the result is allowed.
- An IP address specifically disallowed from using the name claimed in HELO should be treated as MTA-not-grata.
- An MTA for which CSV has no info to check should be rated by other means (e.g. IP) or not at all.
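
The shape shared by the two lists above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `RECORDS` table, the `check` and `check_message` names, and the example domains and addresses are all hypothetical, and nothing here is taken from the SPF or CSV drafts. The only point is that a single function from (domain name, IP) to a three-valued result can be applied to HELO, MAIL FROM and PRA alike, whatever DNS record type actually carries the data.

```python
from enum import Enum
from ipaddress import ip_address, ip_network


class Result(Enum):
    PASS = "pass"        # "allowed" in CSV terms
    FAIL = "fail"        # "disallowed" in CSV terms
    UNKNOWN = "unknown"  # "no_info" in CSV terms


# Hypothetical published data, standing in for whatever DNS records
# (TXT, SRV, ...) a real mechanism would query:
# name -> (authorized networks, whether all other IPs are explicitly denied)
RECORDS = {
    "example.org": (["192.0.2.0/24"], True),
    "mta1.example.org": (["192.0.2.25/32"], True),
    "partial.example.net": (["198.51.100.0/24"], False),
}


def check(domain, ip):
    """Map (domain name, IP) onto pass / fail / unknown."""
    record = RECORDS.get(domain.lower())
    if record is None:
        return Result.UNKNOWN  # nothing published for this name
    networks, deny_others = record
    addr = ip_address(ip)
    if any(addr in ip_network(net) for net in networks):
        return Result.PASS
    return Result.FAIL if deny_others else Result.UNKNOWN


def check_message(helo, mail_from_domain, pra_domain, ip):
    """Apply the same mechanism to each identity; meaning is left to the caller."""
    return {
        "helo": check(helo, ip),
        "mail_from": check(mail_from_domain, ip),
        "pra": check(pra_domain, ip),
    }
```

What each result *means* for each identity, and which reputation is attached to it, stays outside the function. That is the separation of mechanism from semantics being argued for here.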


The point of this exercise is to separate the "mechanism" carrying the message from the content and meaning of the message itself. The SRV record mechanism is clever, but I got the feeling from reading CSV documents and speaking to you and other CSV supporters that it is not the main important thing that CSV does.

By suggesting that the mechanisms *could be* compatible, I don't mean to imply that the two types of checks already mean the same thing. They don't. SPF has a couple of modes where it checks HELO, but it lacks an explanation as to why one might want to do that, what the information means, and how to interpret it and act on it.


Why am I so keen to show that one mechanism could be used for both checks? Well, one of the first things that this WG worked on was deciding which identities to check. My understanding was, at the time, that there was a pretty strong consensus that we should work on both 2821 and 2822 identities, and I *thought* we had also decided that if we tackle one identity first, we would do so in such a way that the other identities could be checked with the same or similar mechanism.




Current whitelist and blacklist services focus on the MTA network, ie,
the operator of the MTA.  So CSV provides a standardizing mechanism
for existing practise.

The limitations of that practise are demonstrated every day, but so
are the benefits.


That is an excellent point, and I agree.



GC>  - If the MTA name is also used as a HELO name for one of the MTAs
GC>       - In most cases the existing SPF record should be sufficient, since
GC>         it probably includes that MTA.

My guess is that you are talking about the narrow case in which the
RFC2822 author/sender has the same domain name as the MTA HELO.  While
a popular scenario, it is a long way from being the ONLY popular
scenario.  And that's the problem. SPF is problematic for a number of
other such popular scenarios.


You are correct, that should have been "sender domain name" not MTA name.

I agree, this is definitely not the majority case. I mention it here because it is really the only case where the SAME name may be used by both email addresses and HELO. If the same name might be used by an MTA and by the RHS of an email address, I think chances are very good that the allowed IPs for both cases will be the same.

Again, this hearkens back to the discussion of which identities we want to be able to validate. At the time, we identified HELO, MAIL FROM and From:/Sender:, and along with the idea that perhaps all three merit checking, we brought up the cases where the same domain name might be used in different contexts. Each context might have wildly different meaning and usage, but where the NAME is exactly the same, the set of authorized IPs would usually be the same or a blend of the two usage sets would be suitable. If I remember correctly, not everyone was convinced at the time that a single set of IPs would always work, so there was some discussion of a "scope tag" of sorts, but I think most of the group agreed at the time that the need for this would be rare.



GC> Semantically, there is some difference in the understanding between what
GC> the CSV check means, and what the SPF+HELO check means.

It is rather more than "some".


Agreed. But if the implication is that they are different enough to *require* different mechanisms, I would not agree with that.

Let me say this again because I think it is important:
THE IDEA OF USING ONE MECHANISM TO VALIDATE DIFFERENT IDENTITIES IS NOT NEW.

As I continue to suggest that CSV *could* be implemented using SPF TXT records, people continue to look at me as if I'm speaking heresy. All I can say is, please review the archives. This same WG agreed that multiple identities are worthy of checking, and if possible they should be checked in the same or similar ways. Did I misunderstand, or have we changed directions on this, or has everyone just forgotten what we talked about for the first month or more?



GC> It would be better to use ?include:comcast.net or ?ptr:comcast.net. That
GC> way the mail from those domains is still allowed, but not "guaranteed" to
GC> be from you.

This begs for an obvious question: What is the benefit to the
anti-spam world of something which offers no guarantees? Is that not
the same as saying "I enforce no anti-spam policies, since anyone can
claim to be part of my domain"? No accountability is a rather serious
deficiency.


To quote Douglas Adams, "We demand rigidly defined areas of doubt and uncertainty!" :)

Seriously though, the "unknown" state was put in there for a reason. In an ideal world, all my users would phone home and submit with SMTP AUTH and all our mail would go out the pre-defined block of IPs. But, some domain owners might want partial coverage, and might need some usage cases to be supported in "legacy mode" for a while. If a domain owner is not 100% sure he has rounded up all the roaming users, he may choose to write ?all at the end - in which case forgeries would not be stopped, but the +entries in the list can still be used to invoke reputation and whitelisting. If all the roaming users happen to be on comcast.net, a record with ?ptr:comcast.net -all is much better than ?all -- possibly enough to make spammers/forgers move on to the next target.
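
To make the difference between the two record endings concrete, here is a deliberately simplified Python sketch of first-match qualifier evaluation. It is not an SPF parser. It handles only hypothetical `ip4:`, `ptr:` and `all` terms, omits the version prefix, uses the result names from this thread, and replaces real PTR lookups with a made-up `REVERSE_DNS` table (a real `ptr` check would also need to confirm that the reverse name resolves forward to the same IP, which is skipped here). The addresses and hostnames are invented for illustration.

```python
from ipaddress import ip_address, ip_network

# Qualifier prefix -> result, using the names from this discussion.
QUALIFIERS = {"+": "pass", "-": "fail", "?": "unknown"}

# Hypothetical reverse-DNS table standing in for real PTR lookups.
REVERSE_DNS = {"203.0.113.7": "c-203-0-113-7.hsd1.comcast.net"}


def matches(mechanism, ip):
    """Return True if the (simplified) mechanism matches the client IP."""
    if mechanism == "all":
        return True
    kind, _, arg = mechanism.partition(":")
    if kind == "ip4":
        return ip_address(ip) in ip_network(arg)
    if kind == "ptr":
        host = REVERSE_DNS.get(ip, "")
        return host == arg or host.endswith("." + arg)
    raise ValueError("mechanism not supported in this sketch: " + mechanism)


def evaluate(record, ip):
    """First matching term wins; a bare term counts as '+'."""
    for term in record.split():
        if term[0] in QUALIFIERS:
            qualifier, mechanism = term[0], term[1:]
        else:
            qualifier, mechanism = "+", term
        if matches(mechanism, ip):
            return QUALIFIERS[qualifier]
    return "unknown"


# Partial coverage: known servers pass, everything else is undecided.
LEGACY = "ip4:192.0.2.0/24 ?all"
# Tighter: roaming users on comcast.net are undecided, everyone else fails.
TIGHTER = "ip4:192.0.2.0/24 ?ptr:comcast.net -all"
```

Under these assumptions, a host at 198.51.100.9 with no matching reverse name gets "unknown" from `LEGACY` but "fail" from `TIGHTER`, while the roaming user at 203.0.113.7 gets "unknown" from both. In both records the `+` entries still return "pass", so reputation and whitelisting can attach to them.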

In other words, the "unknown" state is a feature, not a bug. If you don't agree, fine, don't use the feature. Your characterization of this mode as a "deficiency" is uncharitable and seems to contain a high FUD to fact ratio.

If you had not taken the sentence out of context, my original intent would be a bit clearer -- I was actually responding to some other FUD based on a wrong understanding of SPF (or intentional misreading or other straw man). The example given by both Doug and Matt was "Well what about a domain that publishes include:comcast.net? That means anyone on comcast.net could HELO as my own name!" Yes, and this would be a mistake on the part of the domain owner; they are in effect saying "We trust comcast.net to not forge mail from us or otherwise use our name improperly."



GC> If the result comes back unknown, you can't attach reputation
GC> or whitelisting to that transaction, you just have to proceed in "legacy"
GC> mode.

And the value-add of SPF, in this scenario is what, exactly?

What does the administrator of the domain and/or the operator of the
receiving SMTP client get for their effort?


See above regarding FUD.

I will note also that despite disparagement pointed at the "unknown" mode of SPF, CSV also has a de-facto "unknown" mode - you can just choose not to publish any records at all for that particular name. I would assume that reputation would not attach in this case either.



Is it clear to you that CSV has definite security advantages
over SPF/Sender-ID?


GC> There is general agreement that the smaller problem
GC> of HELO checking

"smaller problem"?

I hope you do not mean that identifying spam spigots is a small
problem or that doing it will be a small benefit.

That is one of the things CSV is useful for, that SPF is not.  Entire
networks of compromised machines can be blocked with a single
accreditation entry, no matter what the domain names they use for their
rfc2822 author/sender.


Actually I was not referring to my own opinion as "general agreement" -- I was referring to the decisions of this WG as to which identities should be checked. I believe it was agreed that 2821.MAIL FROM was most important, followed by 2822.From/Sender, and 2821.HELO was the least important of the three.

I DO agree though; identifying spam spigots is a big problem, and doing it is a big benefit. You have made a good case for HELO checking. I'm not suggesting that HELO checking shouldn't be done, and you're not suggesting that it's a total solution in itself that trumps others. We may not be on the same page but I think it's in the same book :)




GC> CSV may have a better security story, but I believe this is a direct result
GC> of deciding to include fewer features and less flexibility.

Methinks there is a lesson in protocol design, here.


GC> Regarding DDOS concerns, I think they can be solved by placing some limits
GC> on the amount of recursion possible and the total number of queries needed
GC> per mail message, and that should satisfy most concerns.

Offhand, I am not sure what you mean and I am certain I do not
understand how it pertains to protection against DDOS attacks.


Sorry, I didn't mean "[all] DDOS attacks" -- this was a specific reply to a specific concern in SPF.
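
The kind of limit meant here can be sketched as follows. This is a rough Python illustration, not anything from the SPF drafts: the `INCLUDES` table, the `expand` function, the exception name, and the limit values of 5 and 10 are all assumptions chosen for the example. The idea is simply that a receiver following include-style references refuses to go beyond a fixed depth or a fixed number of lookups per message.

```python
class LookupLimitExceeded(Exception):
    """Raised when a record would cost more lookups than the receiver allows."""


# Hypothetical records: domain -> domains its record refers to (include-style).
INCLUDES = {
    "example.org": ["mail.example.net"],
    "mail.example.net": ["relay.example.com"],
    "relay.example.com": [],
    "loop.example": ["loop.example"],  # a record that refers to itself
}


def expand(domain, max_depth=5, max_queries=10):
    """Follow references from `domain`, enforcing both limits.

    Returns the names looked up, in order. The limits are arbitrary
    values picked for this sketch.
    """
    queries = 0
    visited = []

    def walk(name, depth):
        nonlocal queries
        if depth > max_depth:
            raise LookupLimitExceeded("include depth exceeded at " + name)
        queries += 1  # one simulated DNS query per name
        if queries > max_queries:
            raise LookupLimitExceeded("query budget exceeded at " + name)
        visited.append(name)
        for child in INCLUDES.get(name, []):
            walk(child, depth + 1)

    walk(domain, 0)
    return visited
```

With these limits, a well-formed chain such as `example.org` is expanded normally, while the self-referencing `loop.example` record is cut off instead of generating unbounded queries.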

I don't know why Matt and Doug chose to say over and over again how nasty and yucky and vulnerable SPF is, citing this as a reason why CSV is cool and wanted and necessary. I don't think SPF and CSV are mutually exclusive and I don't think the "air of competition" serves any of us well. I think CSV contains great ideas and so does SenderID.



GC> I think there is enough consensus in the group that we need to protect PRA
GC> and/or MAIL FROM,

There is agreement that we need a mechanism that identifies and
accredits rfc2822 author/sender IN SOME CASES.


No comment at this time, Senator. :)


GC> and that HELO is of secondary importance.

I'm not sure whether you noticed, but there is a rather different tone
in the comments about HELO checking now than there was a month or so
ago.


I noticed. CSV has done a lot to bring HELO into the spotlight. This is a good thing. As long as CSV isn't trying to elbow other proposals out of the way, I don't have a problem with it.

In fact, Unified SPF is based pretty strongly on my efforts to get HELO placed more prominently on SPF's radar screen. I have gone from not taking HELO seriously to actively preaching the gospel of HELO to spf-discuss and #spf on IRC.

I don't think HELO has eclipsed PRA/MAIL FROM/SUBMITTER in importance or utility.



GC>   Not everyone
GC> agrees with this, but I think a majority of folks think that HELO checking

I am confused.  Were the chairs asking each of us to perform a rough
consensus assessment of the working group?


I was referring again to the rough consensus reached at our first-phase milestone. I believe it was agreed that 2821.MAIL FROM was most important, followed by 2822.From/Sender, and 2821.HELO was the least important of the three. Perhaps I misremember, but that's what I thought we said.


GC>   So, if we are going forward with PRA/SenderID or
GC> something like it, it should be easy enough to adapt it to HELO checking as
GC> well.

I'm sure we all look forward to the specification that satisfies your
expectation.


Being worked on. I think you will be pleased. As I said before, if Unified SPF borrows some ideas from CSV, please consider that a form of flattery :)

--
Greg Connor <gconnor(_at_)nekodojo(_dot_)org>