RE: getting 2822 protection as well as 2821 protection

From: Meng Weng Wong
Sent: Monday, April 05, 2004 4:09 PM


Whenever we talk about doing SPF on the RFC2822 "From:", people counter
with the argument about mailing lists, which have different 2821 from
2822.

And it is true that there are many legitimate occasions when the 2822
"From:" differs from the 2821 return-path.

But the interesting thing is, those occasions tend to be recognizable.

PHB has been proposing on the MXCOMP list that if 2821 does not match
2822, the MUA should put up a red flag.

I think this is a brilliant idea, because it gives receivers something
they can comprehend: if it's a mailing list message, they don't mind the
red flag, but if it's claiming to be from eBay, they should be
suspicious.


I've been working on the same problem from the standpoint of Signed Envelope
Sender (SES).  For those who may not be aware, SES is the signing of all
outgoing mail with an address similar or identical to the SRS0 address and
having the receiving site verify the sender by doing a callback verification
(CBV).  During the
CBV, the sending site validates the hash to make sure the message came from
its MTA.  Some people have called this "Universal SRS", but Meng suggested
that this be dubbed Signed Envelope Sender to differentiate it from SRS.
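
For anyone who has not seen a CBV, here is a rough sketch of the receiving
side in Python.  It assumes the MX host has already been looked up, and the
function name and the smtp_factory parameter are my own inventions for
illustration.  The check on the signed address itself happens at the sending
site when it answers RCPT TO.

```python
import smtplib


def callback_verify(mx_host, address, timeout=30.0, smtp_factory=smtplib.SMTP):
    """Ask the sender's MX whether it would accept a bounce to `address`.

    Hypothetical helper: `mx_host` is assumed to be the already-resolved MX
    for the domain of `address`.  Nothing is sent past RCPT TO.
    """
    with smtp_factory(mx_host, 25, timeout=timeout) as smtp:
        smtp.helo()
        # Null reverse-path, as a DSN would use.
        code, _ = smtp.mail("")
        if code != 250:
            return False
        # The sending site validates the hash here and answers accordingly.
        code, _ = smtp.rcpt(address)
        return 200 <= code < 300
```

The connection is closed without ever issuing DATA, so no message is
delivered by the callback.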

The same RFC2822 verification problem exists for SPF and SES.  While the
most common case is for a single From: or Reply-To: address to be
different from the Return-Path:, I'm afraid that the general RFC2822
originator address problem is worse than that.  The "originator fields" in
RFC2822 are From:, Sender: and Reply-To:.  To really guard against forgery,
we need to protect all three of these fields in their most general form in
addition to the return-path.  This means From: with multiple addresses,
Sender: with an address not in From: and Reply-To: with multiple addresses
that may not be included in From:, Sender: or Return-Path:.

I don't think it's a good idea to simply flag a message as suspect if any of
the RFC2822 originator fields differ from the return-path, as there are
legitimate reasons for this to occur.  It's a very real problem that many,
if not most, ISP's in the U.S. don't support SMTP AUTH and are in no hurry
to do so.  Therefore our ability to always send out "first class mail" is
somewhat limited.  The attitude that "it's too bad for you" if you don't
have access to SMTP AUTH is not particularly helpful in solving this
problem.  I think a better solution would be to actually verify all four
originator fields, Return-Path:, From:, Sender: and Reply-To: and solve the
forgery problem unambiguously.  Fortunately, this is within reach.

Let's consider what's involved in protecting the RFC2822 originator fields
when each field is a single address that may or may not be the same as the
return-path and then attempt to generalize the solution from there.  The
default case should be the most common, that is, From:, Reply-To: and
Return-Path: are the same single address.  That's the case that Meng brought
up and the solution is easy.  Here are three cases where a single-address
RFC2822 originator field is legitimately different from the RFC2821
Return-Path:

1) From: and Return-Path: are different due to sending mail from an outside
service that is listed in the SPF record.

2) Reply-To: and Return-Path: are different because you want the reply to go
to another person.  For example, you suggest a meeting schedule and set
replies to your assistant to coordinate everything.

3) Sender: and Return-Path: are different since one person authored the
message and someone else distributed it from an outside service that is
listed in the SPF record.  In this case, From: would also be different from
Return-Path:, but we can ignore that for the moment.

Here is how we could handle the default case and these other three cases
with Signed Envelope Sender:


default case - From: = Reply-To: = Return-Path:
-----------------------------------------------

Here are two possible forms for MAIL FROM: for this case.

*******

SES0=HHHH=TT=local_part(_at_)domain

This format is the simplest SES implementation.  All local outgoing messages
have a MAIL FROM: in this format.  It is strongly recommended that the hash
secret be different for each user; it could simply be their password.  The
return path is not rewritten during transit, so it doesn't require
changes at forwarders and works today.  However, this address can survive
rewriting via SRS at forwarders and still allow the verification at the
recipient.  Any message recipient or forwarder can do a CBV to the MX for
"domain" to verify that the user does indeed have the right to use
"local_part(_at_)domain" as an originating address, in this case, the
return-path.  No keys in PKI, no data required in DNS, it just works for any
sending or receiving MTA's that implement it.
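
To make the format concrete, here is a minimal sketch in Python of how a
site might sign and check such an address.  Nothing above fixes the hash or
timestamp encoding, so everything specific here is an assumption of mine:
HMAC-SHA1 truncated to four base32 characters for HHHH, a day counter modulo
1024 in two base32 characters for TT (borrowed from common SRS practice), a
30-day lifetime, and "@" written out where this message shows "(_at_)".

```python
import base64
import hashlib
import hmac
import time

B32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
SECONDS_PER_DAY = 86400
TS_SLOTS = 1024  # two base32 characters (assumed encoding)


def encode_day(day):
    day %= TS_SLOTS
    return B32[day >> 5] + B32[day & 31]


def decode_day(tt):
    return (B32.index(tt[0]) << 5) | B32.index(tt[1])


def ses_hash(secret, tt, local_part, domain):
    # Assumed construction: HMAC-SHA1 over "TT=local@domain", 4 chars kept.
    msg = f"{tt}={local_part}@{domain}".lower().encode()
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    return base64.b32encode(digest).decode()[:4]


def ses_sign(secret, local_part, domain, now=None):
    day = int((time.time() if now is None else now) // SECONDS_PER_DAY)
    tt = encode_day(day)
    hhhh = ses_hash(secret, tt, local_part, domain)
    return f"SES0={hhhh}={tt}={local_part}@{domain}"


def ses_verify(secret, address, now=None, max_age_days=30):
    local, sep, domain = address.rpartition("@")
    parts = local.split("=", 3)
    if not sep or len(parts) != 4 or parts[0] != "SES0":
        return False
    _, hhhh, tt, local_part = parts
    if len(tt) != 2 or any(c not in B32 for c in tt):
        return False
    today = int((time.time() if now is None else now) // SECONDS_PER_DAY)
    if (today - decode_day(tt)) % TS_SLOTS > max_age_days:
        return False  # signature too old
    expected = ses_hash(secret, tt, local_part, domain)
    return hmac.compare_digest(hhhh.encode(), expected.encode())
```

The sending site would call ses_verify() when it sees RCPT TO during a CBV
or an incoming DSN, using the secret for the user named in the address.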

In this form, we _could_ require that From: and Reply-To: match Return-Path:
and we then have complete forgery protection with validation to the user
level _before_ DATA.  If after receiving the full message, the MTA detects
that any of From:, Sender: or Reply-To: do not match Return-Path:, the MTA
bounces the message to the return-path.  Since we have already done a CBV to
the return-path, we know that the bounce will be accepted.  Similarly, the
originating MSA should reject the message if the return-path is in this
format and any of From:, Sender: or Reply-To: differ from the return-path
address.  Signing all outgoing messages with an address such as this allows
you to reject _all_ bogus bounces, before DATA, regardless of whether or not
other domain owners publish SPF records.
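
The post-DATA comparison could look something like the following Python
sketch.  The function name is hypothetical, and it assumes the plain
"local_part@domain" has already been extracted from the signed return-path.
A field that is absent from the message is treated as not conflicting.

```python
from email import message_from_string
from email.utils import getaddresses

ORIGINATOR_FIELDS = ("From", "Sender", "Reply-To")


def mismatched_originators(plain_return_path, raw_message):
    """Return the originator fields holding an address other than the
    (unsigned) return-path address.  An empty list means all match."""
    msg = message_from_string(raw_message)
    expected = plain_return_path.lower()
    bad = []
    for field in ORIGINATOR_FIELDS:
        values = msg.get_all(field, [])
        addrs = [addr.lower() for _, addr in getaddresses(values)]
        if any(addr != expected for addr in addrs):
            bad.append(field)
    return bad
```

An MTA following the strict policy would bounce or reject whenever the
returned list is non-empty.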

Many people do not like CBV's and some even regard them as a form of abuse.
However, there is a logical reason for putting the computational burden of
validating user credentials on the message sender:  it is a deterrent to
spamming since for the first time, it costs more to send a message (in terms
of CPU cycles) than to receive one.  An additional answer to this objection
is that a site that signs all its outgoing mail with an SES or SRS address
agrees to accept those callbacks in order to unambiguously validate its
outgoing mail as well as giving it the implicit right to do a CBV on any
mail it receives with an SES or SRS return-path.  This is a cooperative
model and it is in all parties' direct interest to cooperate.  Any site that
abuses its use of CBV's to a particular MX could quickly find itself on a
local blacklist, so there are strong incentives to avoid abuse.

*******

SRS0=HHHH=TT=MTA_domain=local_part(_at_)domain

The second form is a variation for compatibility with SRS.  Otherwise, all
its properties are identical to the first form, except that it is longer.

*******

Both of these formats accomplish all the original SPF goals, plus more,
without requiring address rewriting.  Both will survive address rewriting by
SRS, should that occur.  The idea of verifying intermediate forwarders is
very clever, but it is an indirect method for preventing forgeries that is
not as reliable as this direct method.  You can have more confidence in the
result from a CBV to the MX for the user domain than by accepting the end
result of a string of assertions by SPF+SRS sites with varying local
policies and implementation quality.  With SPF+SRS, you have no assurance
that the message author had the right to use "local_part(_at_)domain" as an
originator address, only that the originating gateway MTA had permission to
send mail on behalf of "domain".  Nothing in SPF compels the originating
gateway MTA to check anything further and you have no assurance of the true
authorship of any message, unless they use additional protocols.

To prevent forgeries, we really want to know that the RFC2822 From: and
Reply-To: fields are accurate as well as the RFC2821 Return-Path:, including
all the local parts, not just the domains.  In the case that all four fields
are the same, the above scheme accomplishes the goal regardless of who
carried the message in transit.  It also avoids corrupting the meaning of
the RFC2821 return-path, which SPF does.  That corruption created the need
for SRS, which then broke forwarding and thus created a barrier to adoption.
In addition, the above scheme gives you immediate and complete protection
from forged DSN's, without waiting for domain owners to publish SPF records.


variant case 1 - From: != Return-Path:
--------------------------------------

SES0=HHHH=TT=F=HHHH=TT=from_domain=from_local_part=
    local_part(_at_)domain

This is an SES-only implementation.  The first hash protects the whole
address.  The "F" indicates that it contains validation information for a
From: address which is different from the return-path.  The hash and
timestamp that protect the From: address are provided by the MUA, since only
the user knows the hash secret (preferably their password) for the account
"from_local_part(_at_)from_domain".  The originating MSA should do a CBV to the
MX for "from_domain" to validate the user's right to use
"from_local_part(_at_)from_domain" as an originating address.  If it fails, the
originating MSA rejects the message.  Similarly, if the originating MSA sees
a From: address that differs from the return-path address and the
return-path does _not_ provide From: address validation information, the
originating MSA should also reject the message.

Any message recipient or forwarder can do a CBV to the MX for "domain" to
verify that the sender had the right to use "local_part(_at_)domain" as an
originating address.  Similarly, they can do a CBV to the MX for
"from_domain" to make sure that the user had the right to use
"from_local_part(_at_)from_domain" as an originating address.  If after
receiving the full message, an MTA detects that the From: address does not
match the From: address validated in the return-path, the MTA rejects the
message at the end of DATA, or if that fails, sends a DSN.  Since we have
already done a CBV to the return-path, we know the DSN will be accepted.


variant case 2 - Reply-To: != Return-Path:
------------------------------------------

SES0=HHHH=TT=R=HHHH=TT=from_domain=from_local_part=
    local_part(_at_)domain

This is an SES-only implementation.  The first hash protects the whole
address.  The "R" indicates that it contains validation information for a
Reply-To: address which is different from the return-path.  All the
discussion for variant case 1 above applies.


variant case 3 - Sender: != Return-Path:
----------------------------------------

SES0=HHHH=TT=S=HHHH=TT=from_domain=from_local_part=
    local_part(_at_)domain

This is an SES-only implementation.  The first hash protects the whole
address.  The "S" indicates that it contains validation information for a
Sender: address which is different from the return-path.  All the
discussion for variant case 1 above applies.
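
As an illustration, a receiver could pull the three variant formats apart
with something like this Python sketch.  The names are hypothetical, "@"
stands in for "(_at_)", and it assumes that neither the embedded domain nor
the embedded local part contains "=", which the format as written cannot
otherwise disambiguate.  It only splits the address; verifying the two
hashes is a separate step.

```python
from typing import NamedTuple

# Tag letters from the three variant cases, mapped to the header they cover.
TAG_TO_FIELD = {"F": "From", "R": "Reply-To", "S": "Sender"}


class SesVariant(NamedTuple):
    outer_hash: str
    outer_ts: str
    field: str              # header the inner signature vouches for
    inner_hash: str
    inner_ts: str
    validated_address: str  # address allowed to appear in that header
    return_path: str        # plain local_part@domain


def parse_ses_variant(address):
    local, sep, domain = address.rpartition("@")
    parts = local.split("=", 8)
    if not sep or len(parts) != 9 or parts[0] != "SES0":
        raise ValueError("not an SES0 variant address")
    _, h1, t1, tag, h2, t2, other_domain, other_local, local_part = parts
    if tag not in TAG_TO_FIELD:
        raise ValueError(f"unknown tag {tag!r}")
    return SesVariant(
        outer_hash=h1,
        outer_ts=t1,
        field=TAG_TO_FIELD[tag],
        inner_hash=h2,
        inner_ts=t2,
        validated_address=f"{other_local}@{other_domain}",
        return_path=f"{local_part}@{domain}",
    )
```

The recipient would then do one CBV for return_path and one for
validated_address, and compare validated_address with the named header after
DATA.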



We can use any combination of these mechanisms to validate multiple
originator address discrepancies from Return-Path:, though the local-part of
the return-path will likely go over the 64-byte limit.  That is likely to
happen in SRS anyway, so mail systems will have to be able to handle this
one way or another.  If we do this, the MUA will _never_ see a forgery,
since the MTA will reject it, so the MUA won't need the extra functionality
that Meng suggested.
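
To put a number on the length concern, here is a small Python check against
the 64-octet local-part limit.  The helper names are mine, and the way the
second example chains an "R" segment after an "F" segment is only a guess at
how combined addresses might be laid out.

```python
MAX_LOCAL_PART = 64  # octets, the SMTP limit on the local-part


def local_part_octets(address):
    local, sep, _ = address.rpartition("@")
    if not sep:
        raise ValueError("address has no '@'")
    return len(local.encode("utf-8"))


def fits_local_part_limit(address):
    return local_part_octets(address) <= MAX_LOCAL_PART


one_field = "SES0=ABCD=EF=F=GHIJ=KL=lists.example.org=announce=alice@example.com"
# Hypothetical layout for covering both From: and Reply-To: in one address.
two_fields = ("SES0=ABCD=EF=F=GHIJ=KL=lists.example.org=announce="
              "R=MNOP=QR=example.net=assistant=alice@example.com")

print(local_part_octets(one_field), fits_local_part_limit(one_field))
print(local_part_octets(two_fields), fits_local_part_limit(two_fields))
```

With these made-up names, a single extra field still fits at 55 octets,
while two extra fields come to 87 octets and exceed the limit.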

People might begin to consider that the complicated and hard-to-sell SPF+SRS
does not buy them anything over this simple scheme.  In fact, this scheme
provides better originating address authentication and complete protection
from bogus DSN's.  It is difficult, if not impossible, to come up with a way
to verify all these addresses within the SPF+SRS framework.  Since the above
scheme does everything that SPF does without incurring the breakage of SRS,
I propose that people consider one of these two simple schemes as a suitable
replacement for SPF+SRS, with the pure SES scheme having the advantage of
simplicity and shortest address.

I should also point out that David Woodhouse has advocated using
private/public key authentication to protect the RFC2822 addresses in
addition to using an SRS0 signature on all local outgoing mail.  It
therefore has all the other good properties of an SES scheme and does not
depend on SPF to function.  His scheme puts a burden on the receiving MTA to
do the cryptographic validation of the RFC2822 fields, but it is otherwise
functionally equivalent and it has an advantage in keeping the return-path
shorter.  Since no one can fully validate _any_ mail header before receiving
the actual header, if there is a problem in one of those headers you can
either reject at the end of DATA, which doesn't always work, or send a DSN
in case the end-of-DATA rejection fails.  Fortunately, if the return-path is
SES or SRS signed and we did a CBV, we are assured that the return-path will
accept a DSN, so this poses no real problem.

--

Seth Goodman