Re: Microsoft submitting Caller ID as draft RFC


In <9156B81DAA29204989AD43E88688FAAB01373AD8(_at_)df-lassie(_dot_)dogfood> 
"Harry Katz" <hkatz(_at_)exchange(_dot_)microsoft(_dot_)com> writes:

However, there are important differences too.  

1.  User experience:   SPF checks the 2821 MAIL FROM address, Caller ID
checks a set of 2822 headers.  As I've noted in other posts, Caller-ID
performs validation on headers that are included in the message body
itself, that is, on headers the end user can see.


The 2821 MAIL FROM and HELO strings are visible to end users in the
RFC2822 headers via the Return-Path: and Received: headers.  Granted,
these are not normally displayed, but since they appear near the top
of the headers, they are often found by those that know enough to look
at the headers.
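
As a rough illustration of where the envelope data ends up, here is a
minimal Python sketch that pulls the Return-Path: and Received: headers
out of a delivered message.  The function name and the sample message
are made up for illustration; only the standard library email module is
used.

```python
from email import message_from_string

def envelope_traces(raw_message):
    """Return (return_path, received_list) as recorded in the headers.

    return_path is the raw Return-Path: value (the RFC2821 MAIL FROM as
    written by the final MTA), or None if the header is absent.
    received_list holds every Received: header, topmost first.
    """
    msg = message_from_string(raw_message)
    return_path = msg.get("Return-Path")
    received = msg.get_all("Received") or []
    return return_path, received

# Hypothetical message, for illustration only.
SAMPLE = (
    "Return-Path: <bounce@example.org>\n"
    "Received: from mail.example.org (HELO mail.example.org) by mx.example.net\n"
    "From: Alice <alice@example.com>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
```

Calling envelope_traces(SAMPLE) gives back the Return-Path: value and a
one-element list holding the Received: line with the HELO string in it.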

On the other hand, most MUAs that I'm aware of do not display the
Resent-From: or Resent-Sender: headers, and a lot do not do a good
enough job of displaying the Sender:, From:, and the from address
comments fields to prevent social engineering and confusion.

So I think that RFC2821 data, due to its position in the list of
headers, is very slightly more visible to the end user than the
RFC2822 headers that C-ID uses.

I see three ways to make the relevant headers clearer: change the
MUA to display the critical headers, restrict what you validate to
what MUAs already display, or have the MTA restructure the message to
display the verification results in the body a-la SpamAssassin.

Changing MUAs is a huge task, even for those that can actually cause
the most popular MUAs to be updated.

Restricting the validation to the portions of the RFC2822 headers that
are currently widely displayed makes it much harder to create a
standard that can handle all the special cases.  (By "portions" I
mean the comment fields vs. actual email addresses.)

Doing a SpamAssassin-like repackaging of the email via MIME is
pretty ugly for this kind of thing.


Right now, I tend to think that for RFC2822 validation, it is only
useful to consider the From: header (both the comment field and the
actual email address).  It is the only RFC2822 data that is almost
universally displayed.  I think that any proposal that requires
updating MUAs to display more information is not practical.

To be sure, the Caller ID mechanism could be used to validate the 2821
MAIL FROM. In fact we actually had the 2821 MAIL FROM as the first
identity on our list, ahead of the 2822 headers, in early drafts of
Caller ID.  In the end we dropped it because it either added no useful
information that was not already contained in the 2822 headers or led to
incorrect or misleading conclusions from an end user perspective.  In
other words, 2821 MAIL FROM adds nothing useful so why use it?


I am not convinced that the RFC2821 MAIL FROM adds nothing useful.  In
particular, I think it is very important for email reliability to make
sure a bounce gets generated for email that has been rejected.
Silently dropping email that doesn't pass certain tests is, IMHO, a
very big problem.

On my midwestcs.com system, almost all email either gets rejected
during the SMTP session (via 5xx and 4xx codes), filed in a non-spam
mail folder, or reported to spamcop.net as spam.  I realize that many
ISPs can not afford to run SpamAssassin during the SMTP session like I
do, but that just increases the need for a validated MAIL FROM
address.

2.  Better protection against "phishing":  With SPF, a spammer could
register their own domain name and publish an outbound IP address, yet
still spoof the 2822 From.  With Caller ID, these message headers get
checked.  Furthermore, Caller ID provides an extra measure of protection
for senders who only transmit mail directly to recipients and never to
mailing lists.


I'm probably just not understanding the C-ID spec, but how do you
distinguish between the mailing-list case and the email-forwarder
case?  If I understand the spec correctly, the "directOnly" flag would
cause both to fail.

3.  Fewer false positives:


I would love to see some hard data on the number of FPs on RFC2821 vs
RFC2822 data.  I see significant problems with both FPs and FNs on
both.  My gut feeling says that RFC2821 data will have fewer problems,
but that doesn't count for much.

4.  Ease of adoption:  We believe our proposal will be much easier for a
broader range of senders to adopt, including list servers, electronic
greeting and invitation services, and mail forwarding services.


I could be wrong, but I think there are far more mailing list servers
out there than mail forwarders (excluding purely internal
forwarders).  I think there are far more mail forwarders than
greeting-card type services.  So, if a proposal breaks every single
greeting-card site, while breaking only a very tiny fraction of the
existing mailing lists, it will be better than a proposal that breaks
even a small (but not tiny) number of mailing lists.

RFC2821 already says that mailing lists MUST use the list admin for
the MAIL FROM.  In practice, it appears that most mailing lists do
this.  Hence, RFC2821 checks will likely not affect very many mailing
lists.

It is not so clear to me whether large numbers of mailing lists would not
have to change in order to comply with C-ID.  If I understand the C-ID spec
correctly, the "responsible domain" is found by checking the following
headers, in this order: Resent-Sender:, Resent-From:, Sender:, and
From:.
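
To make that lookup order concrete, here is a minimal Python sketch.
It assumes the order really is Resent-Sender:, Resent-From:, Sender:,
then From:, as described above, and that the first header present wins;
it is not taken from the C-ID spec text, and the function and constant
names are hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr

# Priority order as described in the text above (an assumption, not a
# quotation from the C-ID spec).
HEADER_PRIORITY = ["Resent-Sender", "Resent-From", "Sender", "From"]

def responsible_domain(raw_message):
    """Return (header_name, domain) for the first header, in priority
    order, that carries an address; (None, None) if none does."""
    msg = message_from_string(raw_message)
    for name in HEADER_PRIORITY:
        value = msg.get(name)
        if not value:
            continue
        # parseaddr splits "Display Name <user@host>" into its parts.
        _display, addr = parseaddr(value)
        if "@" in addr:
            return name, addr.rsplit("@", 1)[1].lower()
    return None, None
```

For a list message that sets Sender: to the list's own address, this
picks the list's domain; for one that leaves Sender: unset, it falls
through to the domain in From:, which is the case the survey below
runs into.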

I did a *very* quick survey of the ~30-35 mailing lists I'm on.
*None* of them use the Resent-* headers, which the C-ID spec says that
they SHOULD use.  Many do not set the Sender field (e.g. Yahoo Groups,
bugtraq, etc.) and others appear to set the Sender: to the person who
sent the email to the list (e.g. the debian lists).

So, this very limited amount of data on RFC2822 headers seems to show
that either many mailing lists will have to change, or RFC2822
checking can not depend on much other than the From: header.


Again, I want to be clear that I *REALLY* want to see validation of
the RFC2822 data.  I think it is *VERY* important.  I just do not
think I understand the situation well enough to be comfortable with any
of the proposals, hence my support of doing RFC2821 data first.

                                                                 Many of
these services today do not use their own domain names in the 2821 MAIL
FROM because they do not want to handle bounce messages.


These services can do many things, but they should not be allowed to
set the RFC2821 MAIL FROM to a domain name that they do not
own/control and force others to handle the bounce message instead.

                                                          In order to
comply with Caller ID all they need to do is ensure that at least one of
the body headers we check has their domain name on it.


I may well be confused here, but doesn't the C-ID spec require that
the highest priority header have their domain name rather than at
least one of the headers?

                                                       Under SPF, these
services will have to modify the 2821 MAIL FROM according to the Sender
Rewriting Scheme (SRS) or VERP, plus handle any bounce messages that
come back.  We believe this is a more extensive change and could place a
heavier workload on these services.  To put it another way, SPF really
means adopting TWO new standards.  Caller ID has no such requirement.


For mailing lists, handling bounces is already a requirement.  For
mail forwarders and greeting-card sites, most LMAP proposals do
impose the burden of handling bounces.  On the other hand, this reduces
the burden on other sites that have to handle bogus bounces.
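
For a sense of what that burden looks like in the VERP case, here is a
minimal Python sketch of the usual "+user=domain" encoding, where the
recipient is folded into the envelope sender so that a bounce
identifies the failing address.  The function names and addresses are
hypothetical, and real implementations differ in separators and
escaping.

```python
def verp_encode(bounce_address, recipient):
    """Fold the recipient into the envelope sender (MAIL FROM)."""
    local, domain = bounce_address.rsplit("@", 1)
    rcpt_local, rcpt_domain = recipient.rsplit("@", 1)
    return f"{local}+{rcpt_local}={rcpt_domain}@{domain}"

def verp_decode(envelope_sender):
    """Recover the original recipient from a VERP envelope sender."""
    local, _domain = envelope_sender.rsplit("@", 1)
    # Everything after the first "+" is the encoded recipient.
    _prefix, encoded = local.split("+", 1)
    rcpt_local, rcpt_domain = encoded.rsplit("=", 1)
    return f"{rcpt_local}@{rcpt_domain}"
```

The sending service still has to accept mail at the encoded address
and act on it, which is the part that costs something.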

I guess part of the cost/benefit analysis involves the question "Just
how important are bounces in the modern email world?"

5.  Extensibility:  [snipped]


I agree that extensibility is important.

SPF allows extensibility via modifiers.  I think that most of the
other LMAP proposals either have, or could have, similar extensibility.

[list of examples]


All of these can be done via SPF modifiers.

6.  Potential denial of service attack in SPF macro language.  Using
SPF's macro language, in particular the &t parameter, it is possible for
senders to create a non-cacheable TXT record that forces receivers to do
a DNS lookup on every message.  By creating such a record and sending a
high volume of email to a target system, a malicious sender could
potentially tie up substantial resources on that system.


This is a concern that I take very seriously.  The DoS threats are not
only against those that receive the email and do the validation, but
also against the domains being validated.  I think the current SPF
spec does not have enough limits in place.  I think these problems are
solvable though.  I also see DoS weaknesses in C-ID similar to SPF,
although obviously in areas other than a macro-language.

Oh, for those who are not familiar with the SPF spec, the
"macro-language" is extremely limited.  It is really just variable
substitution, with a very limited set of operations.  The operations
are truncating the variable, reversing the data, and specifying the
characters to truncate/reverse on.  There are no loops, ifs,
assignments, etc., much less anything like JavaScript.  Everything is
very bounded.
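
To show how little is there, here is a minimal Python sketch of that
kind of expansion.  It assumes the %{letter digits r delimiters} form
used in the SPF drafts and handles only a reduced, illustrative set of
letters; the function name and the values dictionary are hypothetical,
and this is not a conforming implementation.

```python
import re

# Reduced letter set, for illustration: s = sender, l = local part,
# o = sender domain, d = current domain, i = client IP.
MACRO_RE = re.compile(r"%\{([slodi])(\d*)(r?)([.\-+,/_=]*)\}")

def expand(template, values):
    """Expand %{...} macros in template using the values dict."""
    def substitute(match):
        letter, digits, reverse, delims = match.groups()
        # Split the variable on the given delimiters (default ".").
        splitter = "[" + re.escape(delims or ".") + "]"
        parts = re.split(splitter, values[letter])
        if reverse:
            parts.reverse()
        if digits:
            # Truncate: keep only the right-most N parts.
            parts = parts[-int(digits):]
        # Parts are always rejoined with ".".
        return ".".join(parts)
    return MACRO_RE.sub(substitute, template)
```

Split, optionally reverse, optionally truncate, rejoin: that is the
whole set of operations, so each expansion does a fixed amount of work
on its input string.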


-wayne