ietf-mxcomp

A proposal on identities

2004-04-18 21:57:46
Based on the discussion on this list over the last couple of weeks, I want to 
propose a modified algorithm for selecting and validating email identities.  I 
want to explain the change and the rationale, and get your feedback.  This is a 
lengthy post; please bear with me.   
 
There are a couple of key issues that various members of this list have brought 
up that I think need to be addressed.  Forgive the lack of attributions here - 
you know who you are!  :-)
 
- even though I've babbled on ad nauseam about user experience and validating 
something the end user can see, there are cases in Caller ID where the From 
line never gets checked
- it's been brought to our attention that there are a small but significant 
number of mail list servers that do not insert a Sender header.  It's not clear 
how many sites use such servers, but it's another hurdle, as mail from these 
list servers would appear to be spoofed.  All of them, however, do put the list 
owner address in the RFC2821 MAIL FROM.
- Caller Id proposes the use of Resent-* headers by forwarders.  However, there 
are some folks who have already implemented SRS or VERP, and others who believe 
that forwarders should indeed handle bounce messages.  While I have very deep 
concerns about SRS and VERP, if some organizations want to adopt either one, 
that should be their choice.  
 
To address these issues, I'd like to propose a revised algorithm for selecting 
and validating identities.  It's a two-step test.  
 
1.   Always first perform the spoof check on the RFC2822.From domain (i.e. look 
up the TXT records for this domain and verify the connecting IP address is on 
the list found there).  If this passes, we're done.  (If the RFC2822 From isn't 
a valid email address to begin with then we should probably reject the mail.)  
This is the normal case for most legitimate mail which travels one hop from 
source domain to destination domain.  Of course it still could be a spammer 
with a throwaway domain, but that'll get block listed pretty quickly by other 
means.  
 
At the same time we are looking up the list of authorized outbound MTA IP 
addresses for the domain, we can also look up other policy statements the 
domain may choose to make, for example 
- directOnly: this domain does not knowingly send mail via mailing list or 
forwarding services; further stringent checks of the sender may be required
- alwaysSigned: this domain always digitally signs outbound email  
- accreditedBy: this domain's email behavior & policies are accredited by one 
or more named services  
 
2.  If #1 fails then it tells us there's a high likelihood the mail has 
traveled >1 hop.  In that case, select as the purported responsible domain 
(PRD) the first non-empty identity from this list: Resent-Sender, Resent-From, 
Sender, RFC2821.MAIL FROM.  Perform the spoof check.  If it passes we're good; 
otherwise either it's spoofed or the domain hasn't published.  If the directOnly 
policy designation was made for the RFC2822 From domain in step #1, ensure the 
PRD is on the recipients' trusted senders list.  
 
Here too, we can also take note of policy declarations the PRD domain may have 
made.  One additional possibility relevant here is 
- noVisitors: this domain does not relay mail on behalf of any other domain
 
BTW, it's come to our attention that some MTAs today insert additional headers 
in the message whenever they forward mail.  Postfix and qmail apparently insert 
a header called Delivered-To and Exim inserts an Envelope-To header.  Although 
these headers are not defined in RFC2822, they seem to be in fairly widespread 
use.  We could consider adding them to the list of headers in step 2, after 
Resent-From and before Sender.  If we did this, then a number of well-known 
forwarders, including pobox.com I believe, would be compliant today, without 
the need to add Resent-* headers or implement SRS.  
 
That completes the description of the modified algorithm.
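
For anyone who finds code easier to read than prose, here is a rough Python 
sketch of the two-step test.  It is only an illustration under stated 
assumptions: the Policy record, the lookup callable, the trusted-senders set 
and the "pass"/"fail"/"reject" return values are all made up for the example, 
and nothing here says how the TXT records are actually fetched or parsed.  
Treating a directOnly miss as "fail" is likewise just one possible reading.

```python
from dataclasses import dataclass, field
from email.utils import parseaddr
from typing import Callable, Dict, Optional, Set


@dataclass
class Policy:
    """Hypothetical parsed form of what a domain publishes."""
    outbound_ips: Set[str] = field(default_factory=set)
    direct_only: bool = False


# Hypothetical lookup: domain -> Policy, or None if nothing is published.
Lookup = Callable[[str], Optional[Policy]]

# Step 2 header order from the proposal; RFC2821 MAIL FROM is tried last.
STEP2_HEADERS = ["Resent-Sender", "Resent-From", "Sender"]


def domain_of(address: str) -> Optional[str]:
    """Return the lower-cased domain of an address, or None if unusable."""
    addr = parseaddr(address)[1]
    if "@" not in addr:
        return None
    return addr.rsplit("@", 1)[1].lower() or None


def spoof_check(domain: str, ip: str, lookup: Lookup) -> bool:
    """True if the connecting IP is listed by the domain."""
    policy = lookup(domain)
    return policy is not None and ip in policy.outbound_ips


def validate(headers: Dict[str, str], mail_from: str, ip: str,
             lookup: Lookup, trusted: Set[str]) -> str:
    # Step 1: always check the RFC2822 From domain first.
    from_domain = domain_of(headers.get("From", ""))
    if from_domain is None:
        return "reject"  # From is not a valid address
    from_policy = lookup(from_domain)
    if from_policy is not None and ip in from_policy.outbound_ips:
        return "pass"

    # Step 2: the PRD is the first non-empty identity in the list.
    # A null reverse path ("<>") counts as empty here.
    candidates = [headers.get(h, "") for h in STEP2_HEADERS]
    candidates.append(mail_from.strip().strip("<>"))
    identity = next((c for c in candidates if c.strip()), "")
    prd = domain_of(identity)
    if prd is None or not spoof_check(prd, ip, lookup):
        return "fail"  # spoofed, or the domain hasn't published

    # directOnly on the From domain: PRD must be a trusted sender.
    # Assumption: an untrusted PRD is reported as "fail".
    if from_policy is not None and from_policy.direct_only:
        if prd not in trusted:
            return "fail"
    return "pass"
```

With a dict of Policy objects standing in for DNS, `validate(headers, 
mail_from, ip, records.get, trusted)` returns "pass" for a list server that 
sets only MAIL FROM, as long as the list's domain lists the connecting IP.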
 
Advantages
- we always start by basing validation on a header the end user sees.  All 
other situations are special cases.  In fact, you can think of all the special 
cases in the current Caller ID spec as cases where the From domain is not the 
PRD.  This new algorithm cleanly separates these out into a second test.  
- list servers that don't insert Sender today but still have the list owner's 
address in the RFC2821 MAIL FROM are compliant without changes 
- Other checks of the RFC2821 MAIL FROM can still be performed (e.g. allow/deny 
lists) 
- Organizations that wish to implement SRS or VERP may do so
 
Disadvantages
- this may force us to do a 2nd spoof check in more cases than in the previous 
algorithm, but I think we can optimize this a little.  In Caller ID we're 
already having to do a 2nd lookup of the From domain's TXT records to check for 
the directOnly setting anyway
- a little more complexity in the algorithm. 
 
Answers to Some Anticipated Questions
 
Q1: Why not check RFC2821 MAIL FROM in #1 and leave the RFC2822 headers to step 
2?    
A: Because MAIL FROM by itself doesn't give us actionable information in most 
cases.  If the spoof check passes, it could still be a spammer who's registered 
their own throwaway domain but forged the RFC2822 From line.  We can't just 
accept the message without further spoof checks or we risk misleading the end 
user into thinking the From line has been validated.  If the spoof check of 
MAIL FROM fails, it could just be because a legitimate message has come through 
a forwarding service that hasn't implemented SRS.  In other words, no matter 
what result we get from checking the MAIL FROM, further testing of the RFC2822 
headers will *always* be required.  
 
Q2:  Why not put RFC2821 MAIL FROM as the first identity in step 2?  
A:  Because that would *force* adoption of SRS.  MAIL FROM is only empty in 
cases of bounce messages so we would only ever fall through and check the other 
headers on bounces.   As I said above, if organizations choose to implement 
SRS, that's their business, but it shouldn't be a requirement.  Or in RFC 
lingo, SRS can be a MAY, but not a MUST.   
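
To make the ordering argument concrete, here is a tiny Python sketch.  The 
helper name, the two order lists and the sample addresses are all 
hypothetical; the point is only to show which identity gets picked when MAIL 
FROM comes first versus last.

```python
from typing import Dict, List, Optional


def first_non_empty(order: List[str], identities: Dict[str, str]) -> Optional[str]:
    """Return the name of the first identity in `order` that has a value."""
    for name in order:
        if identities.get(name):
            return name
    return None


# Order proposed in step 2, and the alternative raised in Q2.
PROPOSED = ["Resent-Sender", "Resent-From", "Sender", "MAIL FROM"]
MAIL_FROM_FIRST = ["MAIL FROM", "Resent-Sender", "Resent-From", "Sender"]

# A forwarded message: the forwarder added Resent-From but, with no SRS,
# left the original sender's address in MAIL FROM.
forwarded = {
    "Resent-From": "user@forwarder.example",
    "MAIL FROM": "alice@example.com",
}

# A bounce: MAIL FROM is empty, so even MAIL_FROM_FIRST falls through.
bounce = {
    "MAIL FROM": "",
    "Sender": "mailer-daemon@mx.example",
}

print(first_non_empty(PROPOSED, forwarded))         # Resent-From
print(first_non_empty(MAIL_FROM_FIRST, forwarded))  # MAIL FROM
print(first_non_empty(MAIL_FROM_FIRST, bounce))     # Sender
```

With MAIL FROM first, the forwarded message is checked against 
example.com, whose record won't list the forwarder's IP, so the forwarder 
would have to rewrite MAIL FROM to pass.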
 
Q3:  Spammers can still insert headers pointing to their own throwaway domains 
that will enable them to pass the spoof check, right?  
A:  True, but senders can place the directOnly flag on their EPD to indicate 
that a further validation against the recipient's safe list ought to be 
performed.  
 
Q4:  Why not include the RFC2821 HELO/EHLO domain in this algorithm?
A:  Because verifying HELO/EHLO, if it tells us anything at all, might tell us 
that the MTA is authorized to transmit messages.  However, it tells us nothing 
about whether that MTA is authorized to transmit on behalf of the *specific* 
domain responsible for a given message.  
 
OK, that's it.
 
Fire at will!  :-)