PRA algorithm and use of non-standard header fields



I realize that new docs are in preparation, but I'd like to raise this
issue anyway...

A few people (myself included) have raised concerns about the
dependence of the PRA algorithm on non-standard header
fields.

Having thought about this a bit more, I'd like to air my specific
concerns about the PRA algorithm as defined in -marid-core-00 and -01:

1) I really don't like the idea that the semantics of a standard IETF
   protocol will depend on the presence and contents of non-standard
   fields that are defined nowhere (other than in specific MTA
   documentation).

2) Because the names of these fields are fairly obvious and natural
   names, and not obviously MTA-specific, it seems entirely plausible
   to me that some lesser known or custom mail systems may use these
   fields in a way that is somewhat different to the ways in which the
   designers of the algorithm envisaged.

2) I liked the original Caller ID PRA algorithm; it was intuitive.
   Ignoring corner cases of ill-formed messages for a moment, it
   corresponds *exactly* to what I have always done to determine who
   sent me a message (i.e. the person who pushed the button on the MUA,
   or the mailing list that redistributed the message).  That is how I
   would read the headers, based on my understanding of RFC822 and
   RFC2822, to determine who or what sent me the message.

   Granted, MARID doesn't want to know who pushed the button, it wants
   to know who initiated the SMTP transaction, and forwarders make a
   difference there. But the Caller ID algorithm is one that everyone
   who knows how to read (2)822 headers is already familiar with; the
   revised algorithm in marid-core is not.

3) I don't think the currently proposed algorithm works.  Tony Finch
   raised a concern about the semantics of this some time ago, but I
   don't think this was ever followed up on.  I'd like to raise some
   more detailed concerns, having given the matter further thought.

   Consider these two scenarios:

   a) A message is sent, and it transits one or more MTAs that add
      Delivered-To.  The initial recipient then resends the message to
      a second recipient using their MUA's resend functionality.

      In this case step 1 of the PRA algorithm will correctly select
      the PRA from the Resent-* headers.

   b) Suppose I (a@a.com) have a message in my inbox.  For the sake of
      argument, assume it currently contains no Resent-* headers,
      Delivered-To, etc.

      Assume I now resend the message to b@b.com, and that account is
      configured to forward it to c@c.com.  Assume that the MTA at
      b.com adds

          Delivered-To: b@b.com

      Assume the MTA at c.com wishes to perform MARID checks.

      In this case, the PRA algorithm will pick the pre-existing
      Resent-* header in preference to the more recent Delivered-To
      header that was added by b.com.  So c.com will incorrectly
      determine the PRA as a@a.com instead of b@b.com.

   So the fact that some MTAs already add these headers when
   forwarding doesn't really help us, at least with the PRA algorithm
   in its current form.  To make things work correctly for forwarded
   messages that *already* contain Resent-* headers still requires
   modifications to forwarders.

   What you need to do to fix this is to take whichever Resent-* or
   Delivered-To header that was most recently added to the message;
   however I fear determining this reliably will involve making far
   too many assumptions about exactly how MTAs behave when adding
   these non-standard fields.

4) It complicates post-SMTP-time analysis tools.  It's my
   understanding that MTAs that use Delivered-To, etc. will add these
   headers at final delivery as well as when forwarding.  Hence if a
   tool wants to determine the PRA of a message after delivery, and
   that tool is running on a site where the delivering MTA adds
   Delivered-To, then the tool needs to know to ignore the
   locally-added Delivered-To field when computing the PRA.

5) A corollary of 4.  For PRA-based MARID to be useful to the masses,
   we need to encourage MUA authors to display or highlight in some
   way the PRA to the end user, so that they know which ID has been
   authenticated without having to read and understand the 2822
   headers in their entirety.

   So the MUA needs to run the PRA algorithm.  As with 4 above,
   dealing with locally-added Delivered-To fields will complicate
   this process.
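
To make scenario (b) in point 3 and the post-delivery problem in
points 4 and 5 concrete, here is a toy model in Python.  It assumes
only the precedence discussed above (Resent-* consulted before
Delivered-To) and is not the draft's full algorithm.  The names
toy_pra and strip_local_delivered_to, and the address
original@example.org, are made up for illustration.

```python
# Toy model only: assumes just the precedence discussed above
# (Resent-From, then Delivered-To, then From), not the draft's full algorithm.

def toy_pra(headers):
    """headers: list of (name, value) pairs, topmost (most recent) first."""
    for wanted in ("resent-from", "delivered-to", "from"):
        for name, value in headers:
            if name.lower() == wanted:
                return value
    return None


def strip_local_delivered_to(headers, local_rcpt):
    """Drop the topmost Delivered-To if it names the local recipient.

    Assumes the delivering MTA prepends that field at final delivery.
    """
    for i, (name, value) in enumerate(headers):
        if name.lower() == "delivered-to":
            if value == local_rcpt:
                return headers[:i] + headers[i + 1:]
            break
    return list(headers)


# Scenario (b) as seen by c.com at SMTP time.
seen_by_c = [
    ("Delivered-To", "b@b.com"),       # added by b.com when forwarding
    ("Resent-From", "a@a.com"),        # added by a's MUA on resend
    ("From", "original@example.org"),  # hypothetical original author
]
print(toy_pra(seen_by_c))  # a@a.com, although b.com made the last hop

# Points 4 and 5: a plain forward (no Resent-*), examined after delivery.
in_mailbox = [
    ("Delivered-To", "c@c.com"),       # added locally at final delivery
    ("Delivered-To", "b@b.com"),       # added by b.com when forwarding
    ("From", "original@example.org"),
]
print(toy_pra(in_mailbox))                                       # c@c.com
print(toy_pra(strip_local_delivered_to(in_mailbox, "c@c.com")))  # b@b.com
```

Even in this toy form, the post-delivery case only works if the tool
or MUA knows the local recipient address and knows that its own MTA
prepends Delivered-To, which is exactly the kind of MTA-specific
assumption I'm worried about.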

So I'd like to suggest that step 3 simply be dropped from the
marid-core-01 PRA algorithm.  IMHO step 3 turns a clean algorithm into
a messy heuristic, and in its current form has serious problems, not
all of which can easily be fixed.

              -roy