ietf-mxcomp

Re: TECH OMISSION: Stronger checks against email forgery

2004-09-07 14:08:04

On Tue, 2004-09-07 at 12:48, Yakov Shafranovich wrote:
Douglas Otis wrote:
On Tue, 2004-09-07 at 09:38, Yakov Shafranovich wrote:
Tony Finch wrote:
On Tue, 7 Sep 2004, Yakov Shafranovich wrote:
Tony Finch wrote:
On Fri, 27 Aug 2004, Jim Lyon wrote:

I continue to disagree.  There are too many scenarios where the bounce
address is uncorrelated with the MTA that's delivering a message; this
means that any scheme that attempts to reject mail based on those two
inputs (bounce address and IP addr of sending MTA) will have too many
false rejections.

What makes the PRA different from the bounce address from this point of
view?

The PRA algorithm tries to guess the most recent "Sender" for the message,
i.e. the one that is being used for this SMTP hop. The bounce address, on the
other hand, originates from the original hop and stays that way throughout
multiple SMTP hops.

This is also true for the PRA, since no existing MTAs add Resent-From:
header fields when alias-forwarding.

I think we are going in circles. Validating the bounce address only 
prevents false bounces but allows phishing; validating "Sender" does not 
stop phishing, allows bounces, and does not take forwarding into 
account; validating the "From" headers stops phishing but allows 
bounces. Which one of these do we really want?

Sender-ID selects a Mailbox Domain from a series of headers
considered related to the most recent MTA outside of the recipient's
administrative region.  Sender-ID header selection places the RFC 2822
From _last_ in the selection process.  This selection process
attempts to reduce rejection rates, but is no better at stopping
phishing as a result.  At least with MAIL FROM, there will be a better
understanding of which header is being inspected.

Aren't we basically trying to deduce the "mailbox of the agent 
responsible for the actual transmission of the message"? Basically, you 
want to know who the original agent or "Sender" was that put that 
email into the email system. To me it sounds like finding the original 
"purported responsible address" is the same as guessing "the agent 
responsible for the actual transmission of the message".

The recipient will be looking at the RFC 2822 From, yet there is a new
expectation that they will see some other header being checked.  The fact
remains that Sender-ID is not an effective tool against phishing.  Sender-ID
is also not an effective tool for stopping other types of abuse.  To
abate abuse, a reliable reputation mechanism must be possible.  There is
no reliable name obtained with Sender-ID.

If so, RFC 2822, section 3.6.2, is very clear that the "Sender" field 
indicates that. That's step 3 in PRA. However, the same RFC states in 
section 3.6.2 "If the originator of the message can be indicated by a 
single mailbox and the author and transmitter are identical, the 
"Sender:" field SHOULD NOT be used". That's step 4 of PRA. The PRA draft 
states explicitly that those 2 steps are taken directly from RFC 2822 
(don't see any IPR).

The EHLO domain tracks the "sender" without the need to sort headers. 
Make this field a visible part of the sender identification.   It does
not require any IPR either and authentication could always result in a
high degree of certainty.

Now for the resent blocks, the same RFC in section 3.6.6 states that the 
"Resent-XXX" fields are simply placeholders for previous versions of the 
same "Sender" and "From" fields. Therefore, there is a preference for 
"Resent-Sender" just like there is a preference for "Sender" described 
above. That's why steps 1-2 come before 3-4, and step 1 comes before 
step 2 (still no IPR).

What is the substantial difference between these headers and the name
contained within the EHLO domain?  There is a tremendous advantage in
using the MTA name to establish MTA to Mailbox Domain relationships for either
MAIL FROM or From.  Why care about these other headers?

Now the interesting part here is deciding how to pick the proper 
"Resent" header. The PRA algorithm picks the first "Resent-Sender", 
unless a "Resent-From" precedes it with a "Return-Path" or "Received" 
header in between. The RFC states that:
A. "Resent fields SHOULD be added to any message that is reintroduced by
   a user into the transport system."
B. "A separate set of resent fields SHOULD be added each time this is
   done."
C. "All of the resent fields corresponding to a particular resending of
   the message SHOULD be together."
D. "Each new set of resent fields is prepended to the message; that is,
   the most recent set of resent fields appear earlier in the message."

Picking the earliest Resent block is based on D and B; picking an 
earlier "Resent-From" if it is separated from the other Resent headers is 
based on C (and I still see no IPR).
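
As a rough illustration of the selection order described above, here is a
minimal Python sketch. It follows only the steps as summarized in this
message, not the normative text of the PRA draft. The function name
select_pra_header, the (name, value) header representation, and returning
None when the choice is ambiguous or no candidate exists are all
assumptions made for the example.

```python
from typing import List, Optional, Tuple

Header = Tuple[str, str]  # (field name, field value), in top-to-bottom order


def select_pra_header(headers: List[Header]) -> Optional[Header]:
    """Pick the header a PRA-style selection would use, or None.

    Hypothetical helper: None stands for "no usable header" (assumption).
    """
    names = [name.lower() for name, _ in headers]

    def first_nonempty(target: str) -> Optional[int]:
        # Index of the first non-empty header with the given name.
        for index, (name, value) in enumerate(headers):
            if name.lower() == target and value.strip():
                return index
        return None

    def all_nonempty(target: str) -> List[Header]:
        return [h for h in headers if h[0].lower() == target and h[1].strip()]

    resent_sender = first_nonempty("resent-sender")
    resent_from = first_nonempty("resent-from")

    # Step 1: first Resent-Sender, unless an earlier Resent-From is
    # separated from it by a Received or Return-Path header.
    if resent_sender is not None:
        split_block = (
            resent_from is not None
            and resent_from < resent_sender
            and any(
                n in ("received", "return-path")
                for n in names[resent_from + 1:resent_sender]
            )
        )
        if not split_block:
            return headers[resent_sender]

    # Step 2: first Resent-From.
    if resent_from is not None:
        return headers[resent_from]

    # Steps 3 and 4: Sender, then From. Treating more than one
    # occurrence as unusable is an assumption of this sketch.
    for target in ("sender", "from"):
        found = all_nonempty(target)
        if len(found) == 1:
            return found[0]
        if len(found) > 1:
            return None
    return None
```

Note that From is only reached when none of the other three fields
yields a result, which is the ordering being debated in this thread.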

Making a Mailbox Domain list that nominally relates to the MTA name will
encompass the vast majority of mail without touching upon the issue of
whether the Sender-ID IPR is obvious.  With MTA authentication made
a separate function, simply marking mail as being outside this MTA
name/Mailbox Domain relationship is little different from selecting an
invisible header and comparing it against an address list unrelated to
the From or MAIL FROM.  Just use the EHLO domain.  If the EHLO domain is
always exposed as visible, the recipient will soon recognize these
From/EHLO names used in tandem.  It is too dangerous to block mail based
upon Sender-ID.  A reputation check based upon the MTA name would,
however, allow a safe method of blocking mail.
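
To make the comparison concrete, here is a minimal Python sketch of
marking mail as inside or outside an MTA name/Mailbox Domain
relationship. It assumes the EHLO domain has already been authenticated
by some separate step. The domain_lists table is a hypothetical stand-in
for however such a list would be published, which is not specified here,
and the function names are invented for the example.

```python
from email.utils import parseaddr
from typing import Dict, Set


def mailbox_domain(address: str) -> str:
    """Return the lowercased domain of an address, or "" if there is none."""
    _, addr = parseaddr(address)
    if "@" not in addr:
        return ""
    return addr.rpartition("@")[2].lower()


def check_relationship(
    ehlo_domain: str,
    mail_from: str,
    from_header: str,
    domain_lists: Dict[str, Set[str]],
) -> Dict[str, bool]:
    """Report whether each identity falls inside the MTA's domain list.

    domain_lists maps an (already authenticated) EHLO domain to the
    Mailbox Domains it nominally relates to. Hypothetical structure.
    """
    allowed = domain_lists.get(ehlo_domain.lower(), set())
    return {
        "MAIL FROM": mailbox_domain(mail_from) in allowed,
        "From": mailbox_domain(from_header) in allowed,
    }


# Example data; every name below is made up for illustration.
lists = {"mail.example.com": {"example.com", "example.org"}}
result = check_relationship(
    "mail.example.com",
    "<bounce@example.com>",
    "Alice <alice@elsewhere.test>",
    lists,
)
```

In this example the MAIL FROM domain is inside the relationship and the
From domain is outside it, so the message would be marked rather than
rejected, with any blocking decision left to a reputation check on the
MTA name.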

If the goal is to help the recipient decide what is valid mail and
what is not, the visible MTA name compared against a nominal or
restrictive list of MAIL FROM or From Mailbox Domains provides a
superior outcome.

-Doug