Re: TECH OMISSION: Stronger checks against email forgery


On Tue, 2004-09-07 at 14:45, Yakov Shafranovich wrote:

Douglas Otis wrote:

Aren't we basically trying to deduce the "mailbox of the agent 
responsible for the actual transmission of the message"? Basically, you 
want to know who the original agent or "Sender" was that put the 
email into the email system. To me it sounds like the original 
"purported responsible address" is the same as guessing "the agent 
responsible for the actual transmission of the message"?


The recipient will be looking at the RFC 2822 From, while there is a new
expectation that they will see some other header being checked.  The
fact remains that Sender-ID is not an effective tool against phishing. 
Sender-ID is also not an effective tool for stopping other types of
abuse.  To abate abuse, a reliable reputation mechanism must be
possible.  There is no reliable name obtained with Sender-ID.


I think the intent was to introduce the entity of "original sender" of 
the message although I would have to agree with you that by itself it 
might not be too useful. Of course, if Outlook 2000 and some other 
MUAs actually display that header, it will have some use for those 
MUAs.

But, what about SUBMITTER or RFC2821 MAIL FROM bounce address? Purely 
from the technical viewpoint, are they reliable? All could be forged 
including EHLO.


It is much more difficult to forge the EHLO name.  Unlike Sender-ID
PRA/Submitter or RFC2821 MAIL FROM, there is no presumption of the
integrity of the mail channel for assessing a particular message based
upon the EHLO name.  The EHLO name, as a host name, must reference a
record that scales to the number of IP addresses of a host, unlike SPF
or Sender-ID that could encompass the entire Internet.  Unlike a record
intended for routing, this MTA record must ensure authentication is
possible.  This record also indicates the host is authorized by the
domain to send mail.

The certainty of the authenticated MTA name allows for asserting
reputation.  An error made establishing an accountable name could be
devastating for a reputation service.  Sender-ID and SPF simply leave
too many avenues open for this name to be spoofed.  In addition, unlike
Sender-ID or SPF, there is no presumption that the MAIL FROM or From
mailbox domain will be constrained.  MTA authentication operates
independently of the mailbox domain relationship.

If so, RFC 2822, section 3.6.2, is very clear that the "Sender" field 
indicates that. That's step 3 in PRA. However, the same RFC states in 
section 3.6.2 "If the originator of the message can be indicated by a 
single mailbox and the author and transmitter are identical, the 
"Sender:" field SHOULD NOT be used". That's step 4 of PRA. The PRA draft 
states explicitly that those 2 steps are taken directly from RFC 2822 
(don't see any IPR).


The EHLO domain tracks the "sender" without the need to sort headers. 
Make this field a visible part of the sender identification.   It does
not require any IPR either and authentication could always result in a
high degree of certainty.


The key difference between EHLO and "Sender" is that EHLO is 
MTA-specific while "Sender" is specific to the message itself.


Perhaps a poor choice of words.  The sender as identified using the PRA,
such as Resent-From, could easily result in any of the same names as
found in the EHLO name.  Neither offers a direct relationship to the
author of the message.  The MTA name verifies the administrative domain
accountable for delivered messages.  If the mailbox domain is then
related to MTA names, this arrives at the goal of mailbox domains
authorizing these MTA names.  It is also safe to handle this
relationship as being either nominal or absolute.  This MTA name
authorization can be done within a single DNS lookup without the need
for subsequent lookups.

A single EHLO value may cover messages originating from multiple domains.


There is no limitation on the EHLO domain with respect to the number of
domains claimed.  A service provider may elect to assert an EHLO name to
correspond to the customer being relayed.

Therefore using the "Sender" field or "SUBMITTER" allows for granularity 
where a specific MTA may be authorized to relay email for one domain and 
not another, like an ISP's outgoing MTA. Of course, as someone has 
mentioned before, there are fewer MTAs than domains, and it might be 
useful to work with the MTAs alone.

I agree that the reliability is much less for PRA than EHLO, but on the 
other hand SUBMITTER should be no worse than EHLO. So EHLO would not be 
an option, if the granularity described above is desired.


Submitter is based upon the PRA.  Submitter can be no better than the
PRA as a result.  Yes, there is much less reliability of the PRA as
compared to the EHLO domain.  There is also a greater likelihood there
will be fewer EHLO domains compared to mailbox domains.  A mail provider
that keeps track of their outbound SMTP logs and quickly terminates the
typical spammer will earn a good reputation and will be offering a
valuable service.  There is no need to restrict the RFC2822 From
address, if the provider authenticates their clients using any number of
methods.  How this control is achieved should be left to the mail
provider, and using the EHLO name affords this flexibility.

And I still argue that the entire PRA algorithm is obvious to anyone 
reading the RFC so the IPR issue is probably moot.


I see nothing to justify using the PRA algorithm.  These headers are of
no value when compared to the EHLO domain.

Now for the resent blocks, the same RFC in section 3.6.6 states that the 
"Resent-XXX" fields are simply placeholders for previous versions of the 
same "Sender" and "From" fields. Therefore, there is a preference for 
"Resent-Sender" just like there is a preference for "Sender" described 
above. That's why steps 1-2 come before 3-4, and step 1 comes before 
step 2 (still no IPR).



What is the substantial difference between these headers and the name
contained within the EHLO domain?  There is a tremendous advantage in using
the MTA name to establish MTA to Mailbox Domain relationships for either
MAIL FROM or From.  Why care about these other headers?


See above.


There is nothing that limits the granularity of the EHLO name, but there
are reasons for wanting fewer entities to assess reputations.  If the
EHLO name is compared against a name list referenced by the mailbox
domain, this arrives at the desired goal of assessing the mail channel
against the mailbox domains as a separate and optional operation.

Now the interesting part here is deciding how to pick the proper 
"Resent" header. The PRA algorithm picks the first "Resent-Sender" 
unless there is a "Resent-From" before with "Return-Path" or "Received" 
in between. The RFC states that:
A. "Resent fields SHOULD be added to any message that is reintroduced by 
 a user into the transport system."
B. "A separate set of resent fields SHOULD be added each time this is 
done."
C. "All of the resent fields corresponding to a particular resending of 
the message SHOULD be together."
D. "Each new set of resent fields is prepended to the message; that is, 
the most recent set of resent fields appear earlier in the message."

Picking the earliest Resent block is based on #D and B, picking an 
earlier "Resent-From" if separated from the other Resent headers is 
based on #C (and I still see no IPR).
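
For reference, the selection order described above can be sketched in
Python.  This is only an illustration of the steps as summarized in this
thread, not the normative text of the PRA draft.  The function name
select_pra_header and the list of (name, value) pairs in message order
are assumptions made for the example, and no address parsing or
well-formedness check is attempted.

```python
def select_pra_header(headers):
    """Pick the header the PRA steps would select, or None if ill-formed.

    headers: list of (name, value) tuples, topmost header first.
    Returns a (lowercased name, value) tuple.
    """
    fields = [(name.lower(), value.strip()) for name, value in headers]

    def first_index(field_name):
        # Index of the first non-empty header with this name, if any.
        for i, (name, value) in enumerate(fields):
            if name == field_name and value:
                return i
        return None

    rs = first_index("resent-sender")
    rf = first_index("resent-from")

    # Step 1: first Resent-Sender, unless an earlier Resent-From is
    # separated from it by Received or Return-Path headers.
    if rs is not None:
        separated = (
            rf is not None
            and rf < rs
            and any(name in ("received", "return-path")
                    for name, _ in fields[rf + 1:rs])
        )
        if not separated:
            return fields[rs]

    # Step 2: first Resent-From.
    if rf is not None:
        return fields[rf]

    # Steps 3-4: Sender, then From; more than one of either is ill-formed.
    for field_name in ("sender", "from"):
        matches = [f for f in fields if f[0] == field_name and f[1]]
        if len(matches) == 1:
            return matches[0]
        if len(matches) > 1:
            return None
    return None
```

Note that none of these steps looks at the SMTP session at all; the
result depends only on header lines carried inside the message.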


Making a Mailbox Domain list that nominally relates to the MTA name will
encompass a vast majority of the mail without touching upon issues of
whether the Sender-ID IPR is obvious.  By making the MTA authentication
a separate function, then simply marking mail as being outside this MTA
name/Mailbox Domain relationship is little different than selecting an
invisible header compared against an address list unrelated to the From
or MAIL FROM.  Just use the EHLO domain.  By always exposing the EHLO
domain as visible, the recipient will soon recognize these From/EHLO
names used in tandem.  It is too dangerous to block mail based upon
Sender-ID.  A reputation check based upon the MTA name would allow a
safe method of blocking mail however.

If the goal is to help the recipient decide what is valid mail and
what is not, the visible MTA name compared against a nominal or
restrictive list of MAIL FROM or From Mailbox Domains provides a
superior outcome.


I am not sure which would be superior. EHLO is not being used today 
anywhere, and similarly for phishing the 2821 MAIL FROM bounce address 
is not displayed to the end user. From the viewpoint of phishing, it 
makes perfect sense to work with the message itself, not the channel. I 
don't see how the invisible EHLO is any better than the invisible 
"Sender". If anything, SUBMITTER or 2821 MAIL FROM, or 2822 "from" would 
be superior.


The EHLO name is a stronger identity that can safely support reputation
assertions.  If there is any illegal activity, an authenticated MTA
name also identifies where logs can be found as evidence. 
Authenticating the MTA name allows an extremely simple name list to
express the relationship between the mailbox domain and the MTA name. 
This gets rid of the need for dozens or hundreds of DNS lookups and
complex macro scripts needed within SPF or Sender-ID.

These lists can look something like:

_mp._smtp.my-mailbox-domain.com. PTR  big-isp.com.
                                      PTR  ads-r-us.com.
                                      PTR  webs-r-us.com.
                                      
Where an EHLO domain of mx01.nw.big-isp.com is recognized as sending mail
on behalf of my-mailbox-domain.com.  This relationship can be considered
"required" or "nominal" without creating any exploit risk as with either
SPF or Sender-ID.  If there is abuse found, the reputation would be
against the EHLO name and not the mailbox domain, as this mailbox
identity is simply too weak.  There are also advantages and greater
freedoms afforded by establishing a hierarchy of accountability.
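
A rough Python sketch of the comparison, for illustration only.  The
function names are hypothetical, the DNS query itself is left out (the
caller passes in whatever names the PTR lookup returned), and matching
the EHLO name as equal to or a subdomain of a listed name is an
assumption inferred from the mx01.nw.big-isp.com example above.

```python
def name_list_owner(mailbox_domain):
    """Owner name of the single PTR record set queried for a mailbox domain."""
    return "_mp._smtp." + mailbox_domain.rstrip(".").lower() + "."


def ehlo_matches_list(ehlo_name, listed_names):
    """True if the authenticated EHLO name falls under any listed name.

    listed_names: the PTR targets published by the mailbox domain.
    """
    ehlo = ehlo_name.rstrip(".").lower()
    for listed in listed_names:
        target = listed.rstrip(".").lower()
        # Exact match, or EHLO name is a host beneath the listed domain.
        if ehlo == target or ehlo.endswith("." + target):
            return True
    return False


# Example using the names from the list above.
listed = ["big-isp.com.", "ads-r-us.com.", "webs-r-us.com."]
inside = ehlo_matches_list("mx01.nw.big-isp.com", listed)
```

Mail falling outside the list could then be marked or refused depending
on whether the domain treats the relationship as "nominal" or
"required"; any reputation lookup would still be keyed on the EHLO
name.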

-Doug