"Mark" == Mark Lentczner <markl(_at_)glyphic(_dot_)com> writes:
Mark> My stats also collected how often each step was used in
Mark> yielding the result. Out of over 7,000 messages, only 8
Mark> ever used step 2 (Resent-From:) and none used step 1
Mark> (Resent-Sender:). Perhaps these aren't important and only
Mark> complicate the issue. Without them, the "Sender if it is
Mark> there, otherwise From" logic is just the 2822 definition of
Mark> who the injector is.
I disagree. The PRA algorithm, in its current form, is a perfect
description of the 2822 semantics.
Surely if the message contains a Resent-From (and no Resent-Sender),
then according to 2822 the Resent-From is the person who sent you this
message?
And if it contains Resent-From and Resent-Sender, then the
Resent-Sender is the person who sent you the message on behalf of the
Resent-From.
Personally I think this analysis of statistics is useful, but possibly
misguided in intent (or at least misconstrued in intent by some
participants).
I don't see the PRA algorithm as some heuristic to try and construct
some new kind of identity from the headers; the PRA is a natural
concept that falls out of the headers defined by 822. If you want to
know who (according to the headers) sent you a message, would you do
anything substantially different from what the PRA algorithm
specifies?
If we're going to use a header identity rather than the envelope
identity, then the PRA is the only obvious identity to use; I've
believed that since before Caller ID was proposed, and I have yet to
see anyone make a convincing argument for any other header identity.
An IP-based authentication scheme obviously needs to look at who
actually sent the message, not on whose behalf it was sent (that
implies using Sender in preference to From). And it needs to look at
the immediate sender, which implies using the Resent headers in the
case of a resent message, and the most recent Resent headers in the
case of a multiply-resent message (which is permitted by 2822).
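To make that ordering concrete, here is a rough sketch in Python of the selection as described in this thread: Resent-Sender first, then Resent-From, then Sender, then From, taking the first non-empty occurrence of each. It is not the text of the draft. The function name and the constant are mine, and it assumes that Resent headers are prepended, so that the first occurrence in the message is the most recent one. It makes no attempt to handle malformed headers or the header-order questions.

```python
from email import message_from_string
from email.utils import parseaddr

# Order of preference as described in this thread (an illustrative
# simplification, not the wording of the draft).
PRA_HEADER_ORDER = ("Resent-Sender", "Resent-From", "Sender", "From")


def purported_responsible_address(raw_message):
    """Return (header_name, address) for the first non-empty header found,
    or (None, "") if none of the four headers is present and non-empty."""
    msg = message_from_string(raw_message)
    for name in PRA_HEADER_ORDER:
        # get_all() returns values in the order they appear in the message,
        # so the topmost (assumed most recently added) header is seen first.
        for value in msg.get_all(name, []):
            if value.strip():
                _, addr = parseaddr(value)
                return name, addr
    return None, ""
```

For a message with only Sender and From this picks Sender; add a Resent-From above them and that is picked instead. Whether "first occurrence" is really the right rule for a multiply-resent message is exactly the kind of detail I think we should be discussing.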
If we're going to spend time on the PRA, I think we should be
analysing it semantically, not statistically. We should be looking at
the details and the corner cases.
How does the current algorithm handle borderline messages (i.e.
technically malformed, but acceptable to a liberal parser)? Are there
any important corner cases that occur in the wild due to non-compliant
systems?
What are the issues with header order and the algorithm for selecting
the Resent header? (822 restricts messages to a single set of Resent
headers, but places no restrictions on placement. 2822 constrains
placement and allows multiple resending, but qualifies this with a
confusing SHOULD in the description of Resent headers which seems to
be at odds with the BNF.)
We know the broad outline of what is going to go to last call. Unless
anyone thinks that they can gain consensus for a significant change of
tack, let's focus on the little details.
I certainly wasn't expecting my proposal for revising the PRA
algorithm to be adopted with absolutely no discussion or even comment.
It was the product of several minutes of consideration on my part, and
was intended to be a starting point for discussion, not a fait
accompli. I think it works, and the -core authors have done a decent
job of restating it more succinctly, but have we got the semantics
exactly right?
I _think_ so (besides my doubts about retaining the "non-empty" check)
but I don't have confidence that anyone other than myself and the ID
authors have spent any time looking at it with a critical eye.
This is really a general complaint about the way this WG has operated;
there has been _far_ too little technical discussion of the details of
the proposals that are supposed to be going to last call in a very
small number of weeks. Lots of discussion of the big picture, at
times verging on out-of-scope, but little discussion of the actual
documents that we are supposed to be advancing...
(This complaint is in no way directed at Mark or any other specific
contributors to this thread, but to the WG as a whole.)
-roy