Can we start from the other end ???



Hi,
 I've been watching this list for a while and have been trying to
 understand the issues.  And what I seem to be seeing is a lot of
 intelligent and experienced people getting nowhere (or at least, not
 very far).
 My experience suggests that when people fail to reach any
 understanding, it is often because they are asking the wrong
 question(s).
 So I would like to suggest a different question (or series of
 questions) be addressed.  Maybe we will find more clarity there.

 The question before the list is currently "what identity should we be
 authenticating?".   This is a question about the inputs to the
 process that is being developed.  I think it would be more fruitful
 to start by looking at the outputs, and making sure we agree on them.

 Once the desired outputs are clear and agreed, the necessary inputs
 and appropriate process might become more clear.
 
 So I will start out giving my understanding of the desired outputs,
 or goals.

 
 The overall goal seems to be:
  
  A - reduce junk e-mail

 However, that is much too broad for this list.  We are focussed on a
 specific part of that.  The particular issue that is being
 addressed is that it is currently too easy to forge identities.  We
 would like to know when an identity is authentic.

  B - add authentication to e-mail

 Note that there is no mention in this statement of reducing the
 amount of email, and I think that is significant.  There seems to be
 a lot of discussion on the list about how effective something can be
 at reducing the amount of Spam.  I think that is inappropriate and
 should be ruled out-of-scope.
 What to do with an E-mail once we have assessed the authenticity of
 an item is a separate question.  I know people want to reduce the
 amount of Spam they get *now*, but I think we must make sure that
 that desire doesn't hurt the process of producing a good standard.


 However, even B is somewhat beyond the scope of this working group.
 The "RID" part of "MARID" makes it clear that we are primarily
 concerned with storing information in the DNS that can be used to add
 accountability to Email.

 The "MA" bit is, I think, unfortunate.  It has been said before that
 the word "Authorisation" should possibly be "Authentication".  I
 would also suggest that it isn't primarily MTAs that we are
 interested in, but rather the mail itself. So MA should be
 Mail Authentication rather than MTA Authorisation.  Certainly an
 Authorised MTA can be expected to deliver Authentic Mail in certain
 circumstances, but restricting our attention to Authorising MTAs
 seems too restrictive.

 Anyway, we need to state a goal that reflects the storage of
 information in the DNS, and reflects the desire to see if mail is
 authentic.  It gets a bit wordy but:

  C - Specify how policy information can be stored in the DNS which
      allows the potential recipient of an Email to assess the
      authenticity of part or all of a mail message.

 This is, I think, beginning to get within scope for this working
 group.

 The question now becomes: what sort of authenticity information is
 useful to the receiver?
 This is, I think, what the Working Group's first question should be.
 Not "what identity do we check" but "what do we need to know".

 I think there are two particular pieces of information that are
 directly useful to the receiver.  I will note at the outset that I
 don't think either of these can be fully answered by Records In DNS.
 However in many cases they can, and in cases where they cannot, we
 can find other technologies to help with Authentication. 

 The two pieces of information are answers to the following two
 questions:

  D-1 - Where can I safely (i.e. without annoying an innocent party)
        send an automatic response to this message?
  D-2 - Which individual or agent chose to send this item to these
        recipients?

  D-1 is needed if any sort of automatic processing (with the possibility
  of rejection) is to happen after the SMTP session completes.  It is
  true that it is best to reject as much as possible during the SMTP
  session, but some checks cannot be done at that time.  For example,
  per-user white/black-lists or challenge-response may need access to
  per-user data that isn't available to the receiving MTA.

  D-2 is needed for the "user's experience" and for any sort of white list
  or black list.

  If D-2 is not reliably available, use "unknown" as the answer and
  proceed.  Local policy will decide how mail from "unknown" is dealt
  with, and what is displayed to the user.

  If D-1 is not reliably available, then it is inappropriate to send
  any automatic response to this message.
  I can see only three possible options if D-1 is not available:

    1/ violate relevant RFCs and simply not return any automatic
       response.
    2/ violate relevant RFCs and end the SMTP/DATA session with:
        599 - Mail conditionally accepted for delivery - see
                      http://what.ever/...
    3/ Send a reply if it is appropriate, and risk Spamming an
       innocent party.

  I would really like to go for "2", but I'm not game to try it until
  we have some sort of standards-track RFC in this area.  Until then,
  1 is probably best.
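
  The fallback rules above for D-1 and D-2 can be sketched in a few
  lines of Python.  This is only an illustration of the decision, not
  a proposal: the names NoD1Policy, display_sender and
  auto_response_target are made up for this example, and a missing
  answer is represented as None.

```python
from enum import Enum
from typing import Optional


class NoD1Policy(Enum):
    """The three options listed above for when D-1 has no reliable answer."""
    STAY_SILENT = 1          # option 1: never send an automatic response
    CONDITIONAL_ACCEPT = 2   # option 2: say so at the end of SMTP DATA
    REPLY_ANYWAY = 3         # option 3: reply and risk hitting an innocent party


def display_sender(d2_answer: Optional[str]) -> str:
    """D-2 fallback: no reliable answer becomes the literal "unknown"."""
    return d2_answer if d2_answer else "unknown"


def auto_response_target(d1_answer: Optional[str],
                         claimed_mail_from: Optional[str],
                         policy: NoD1Policy = NoD1Policy.STAY_SILENT) -> Optional[str]:
    """Return the address an automatic response may go to, or None."""
    if d1_answer:
        # D-1 was answered reliably, so this address is safe to use.
        return d1_answer
    if policy is NoD1Policy.REPLY_ANYWAY and claimed_mail_from:
        # Unverified address: this is exactly the risk option 3 accepts.
        return claimed_mail_from
    # Options 1 and 2 both send nothing after the SMTP session has ended.
    return None
```

  The default is STAY_SILENT because option 1 is the one suggested
  above as the best choice for now.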

  If we agree that D-1 and D-2 are what we want, then we can start
  looking at how to get the answers.

  It must be remembered that answers might be available from other
  places than MARID (such as local whitelists for forwarders).
  However the process of trying to find answers from MARID would, I
  think, be 
     a/ enumerate possible candidates
     b/ check each candidate until a suitable one is found.

  The only possible candidate for D-1 would be the MAIL FROM address.
  The candidates for D-2 would be {Resent-,}{From,Sender}: and
  MAIL FROM. (But this can probably be debated).
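
  To make the enumerate-then-check process concrete, here is a rough
  Python sketch.  It assumes the message is already parsed with the
  standard email package and that MAIL FROM is passed in as a bare
  address (an empty string for the null sender).  The header order
  and the check callback are assumptions of mine; how a candidate is
  actually checked is exactly what is up for discussion.

```python
from email import message_from_string
from email.utils import getaddresses

# Header candidates for D-2, i.e. {Resent-,}{From,Sender}.
# The order is an assumption and can be debated.
D2_HEADERS = ["Resent-From", "Resent-Sender", "From", "Sender"]


def d1_candidates(mail_from):
    """The only D-1 candidate is MAIL FROM; a null sender gives none."""
    return [mail_from] if mail_from else []


def d2_candidates(msg, mail_from):
    """Step a/: enumerate candidate addresses, without duplicates."""
    found = []
    for header in D2_HEADERS:
        values = msg.get_all(header, [])
        for _name, addr in getaddresses(values):
            if addr and addr not in found:
                found.append(addr)
    for addr in d1_candidates(mail_from):
        if addr not in found:
            found.append(addr)
    return found


def first_suitable(candidates, check):
    """Step b/: check each candidate until a suitable one is found.

    `check` is a hypothetical callable (address -> bool) standing in
    for whatever verification is eventually agreed on.
    """
    for addr in candidates:
        if check(addr):
            return addr
    return None  # the caller maps this to "unknown" for D-2
```

  For example, with a message parsed by message_from_string(), the
  call first_suitable(d2_candidates(msg, mail_from), check) gives the
  first address that passes, or None if none does.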

  The checking process is clearly up for discussion, and I think the
  discussion would be useful if the goals were clear.

  Some statements that I think it would be useful to be able to store
  in the DNS are:

    E-1:  Any mail from domain X will originate from one of these IP
       addresses:  ......
       (This is essentially what most of SPF is about).
    E-2: You may safely send an automatic reply to any address at
       domain X that you find in a MAIL FROM: line, provided the reply
       is sent from <>.
    E-3: This domain never sends any mail. (Note that this is stronger
       than E-1 with an empty IP address list).
    E-4: Mail that is from this domain is always signed using S-MIME,
       with a key that is available <a>here</a>.

   E-1 is essentially what SPF is trying to say.  Note that I say 
     "originate from", rather than "come from".  Just because it comes
     from somewhere else doesn't mean that it isn't authentic.
   E-2 may be stated by a domain that encapsulates and signs the
        MAIL FROM address in outgoing mail, and that drops any
        automatic replies that aren't correctly signed.  By making this
        statement, it can increase its chance of getting all valid
        bounces.
   E-4 is, I believe, the best long-term answer to phishing.
      It can even be used to answer a stronger question than D-2:
       D-3: is there any solid reason to believe this email item is
             fraudulent?
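
  As a sketch of how a receiver might use statements like E-1 to E-4
  once they have been fetched from the DNS, consider the Python below.
  The record format is deliberately left out, and DomainPolicy, its
  field names and the three result strings are all invented for
  illustration.  Note that origin_ip means the address the mail
  originated from; when the receiver cannot determine that (for
  example after forwarding), it should be left as None.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass
class DomainPolicy:
    """Hypothetical in-memory form of a domain's published statements."""
    origin_networks: Optional[List[str]] = None  # E-1; None = no statement
    safe_to_bounce: bool = False                 # E-2
    never_sends: bool = False                    # E-3
    always_smime: bool = False                   # E-4


def assess(policy: DomainPolicy,
           origin_ip: Optional[str] = None,
           signature_valid: Optional[bool] = None) -> str:
    """Return "authentic", "fraudulent" or "unknown" for one message."""
    if policy.never_sends:
        # E-3: any mail claiming this domain contradicts its policy.
        return "fraudulent"
    if policy.always_smime and signature_valid is not None:
        # E-4: a missing or bad signature is a solid reason for doubt (D-3).
        return "authentic" if signature_valid else "fraudulent"
    if policy.origin_networks is not None and origin_ip is not None:
        # E-1: compare the originating address with the published list.
        ip = ip_address(origin_ip)
        inside = any(ip in ip_network(net) for net in policy.origin_networks)
        return "authentic" if inside else "fraudulent"
    # Nothing published, or not enough known about this message.
    return "unknown"
```

  The safe_to_bounce flag is not used by assess(); it only matters
  for D-1, that is, when deciding whether an automatic reply may be
  sent to the MAIL FROM address.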


  Sorry this has been so long.  But now to the questions for the list.
  
   Would it be useful to clarify our goal first?
   Is "C" a reasonable statement?
   Are D-1, D-2, and D-3 what we want to know?
   Is there anything else that we want to know?

I think we need to agree on these questions before we delve too deeply
into mechanism and identity.

NeilBrown