ietf-smtp

Re: Final draft: draft-crocker-email-arch

2007-05-12 10:51:20

Eric,

Thanks for the detailed comments. After two years of iterating on detailed community review, it's interesting that new readers continue to raise new issues. If anything, this underscores the need for a common architectural reference, to try to get better community alignment. It also suggests that the document could iterate forever...


Eric Allman wrote:
I've got a few comments, with one big meta-comment. I really don't think this is a Standards Track document. The language is too loose to be ST, and it doesn't really prescribe very much (there are only six MUSTs and one MAY in the whole thing). Its primary benefit seems to be to define language and give an intro to how the pieces fit together. If the document is to be Informational then I feel much much better about it.

1.  Most/all IETF architecture documents are standards track (or BCP).

2.  Since the entire purpose of this exercise has been to provide a common
referential base, it doesn't do much good to have it issued without the explicit stamp of formal consensus approval. For Informational, the document could merely have been sent in to the RFC Editor directly, and neither I nor anyone else would have had to deal with 2 years of iterative review to make it work towards a consensus view. The community purpose of the document is served better if the document is assessed as representing a community view. That's what standards track (and BCP) accomplish.

3. An artifact of documenting what already is -- especially when what is is rather messy -- is that it says "is" and "does" and the like, rather than claiming to be directive. The reality of Internet Mail is that it is filled with variable behavior; so it's not clear that musts and must nots are all that appropriate, if the document is to be aligned with reality.

That said, if folks believe that the benefits of having an architecture document for email require that it be laced with the vocabulary of musts and must nots, they should speak up. Massaging the text into that form would be a hassle, and more importantly, it well could be counter-productive, but as always, consensus rules.

    Perhaps you can clarify why you see this linguistic point as being 
important?

4.  As for 'loose', I have previously heard concerns for the document's having
too much detail, not too little.  So I'm not sure how to interpret your
assessment.

5. I think you and I see the potential utility of the document rather differently. Given the document's history, as well as its exploration of concepts such as actors, trust, responsibility and interactions, I'd class it as more than simply an intro to how things fit together. My own view is that it can be quite helpful for future work, not merely for explaining past work. Perhaps you could clarify what functional aspects of the document are needed, to satisfy your own criteria?


In section 3.3.1 you say that Message-Id "is associated with the RFC2822.From field." In fact that is not a requirement. The Message-Id has to be unique but there is no requirement that the RHS be the same as the RHS of From, nor even any requirement that it be an actual host or domain name. In hosted mail in particular it may quite commonly be different.

It appears that you believe that "associated with" carries a requirement for
having the same domain name. That certainly was not intended (or it would have said that.) At the least, I think folks are pretty clear that there is no such requirement for the RFC2821.MailFrom address, even though it is "associated with" the RFC2821.Sender actor. So I don't see why a domain name relationship should apply for other "associated" relationships.

RFC2822 says:
The "Message-ID:" field provides a unique message identifier that
   refers to a particular version of a particular message.  The
   uniqueness of the message identifier is guaranteed by the host that
   generates it (see below)

My reading of this takes "version" to refer to a particular version of the content. The only identities in a message that can be characterized as plausibly responsible for the content are rfc2822.From and rfc2822.Sender.

As a rule, we hold the From field identity as responsible for content, with the Sender field as responsible for MHS interaction. That other agents can take action *on behalf of* the From does not make the content less
"associated with" that origination identity.

Confusion about this distinction between having primary responsibility, versus being a delegated actor, continues to plague email discussions. It was, in fact, one of the motivations for the document, and one of the reasons the document tries to note the links between fields.

All of this, by extension, makes the Message-ID value "associated with" the From value.

As with many Internet functions that facilitate human interactions, things
cannot be nearly as clear and precise as any of us would like.  That is why
section 3.3.1  on Message-ID rambles on at some length about the fuzziness.
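
To make the domain point concrete, here is a minimal Python sketch using the standard library's email package. The author address and the hosting domain are invented for illustration (they are assumptions, not taken from the draft or from RFC2822). It only shows that a Message-ID can be unique while its right-hand side differs from the domain in the From field.

```python
from email.message import EmailMessage
from email.utils import make_msgid, parseaddr

# Hypothetical values: an author at one domain, mail generated by a
# hosting provider at another domain.
author = "Author <author@example.org>"
message_id = make_msgid(domain="mail.hosting.example")

msg = EmailMessage()
msg["From"] = author
msg["Message-ID"] = message_id

# Extract the domain part of each identifier.
from_domain = parseaddr(author)[1].rpartition("@")[2]
msgid_domain = message_id.strip("<>").rpartition("@")[2]

print("From domain:      ", from_domain)
print("Message-ID domain:", msgid_domain)
print("Same domain?      ", from_domain == msgid_domain)
```

The two identifiers remain "associated" in the sense that they sit on the same message, even though no domain name relationship holds between them.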



You also say that more than one Message-Id is sometimes assigned. It may be that some broken messages have more than one, but 2822 allows only zero or one Message-Id field.

So it's probably good that the email-arch text commenting on the
"sometimes" begins with "Internet Mail standards provide for a single 
Message-ID"?

This is another example of the document's trying to be clear about the difference between formal and de facto behaviors. Would the document be better if it ignored the latter?
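
As a rough illustration of the formal versus de facto distinction, the following Python sketch parses a made-up message that carries two Message-ID fields and checks it against the "zero or one" rule. The message text and addresses are hypothetical. The standard library parser is used only because it tolerates the duplicate field instead of rejecting it.

```python
from email import message_from_string

# Hypothetical malformed message carrying two Message-ID fields.
raw = (
    "From: author@example.org\n"
    "Message-ID: <first@example.org>\n"
    "Message-ID: <second@relay.example.net>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

msg = message_from_string(raw)
message_ids = msg.get_all("Message-ID", [])

# The formal rule allows at most one Message-ID field per message;
# anything beyond that is de facto behavior seen in the wild.
conforms = len(message_ids) <= 1

print("Message-ID fields found:", len(message_ids))
print("Conforms to the single-field rule:", conforms)
```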


Some cases in your list of guidelines and examples that aren't covered are forwarding (new message-id), resending (same, but with a Resent-Message-Id), and digests (new).

I suspect there are quite a few cases not covered.  That's why the text says
"some".

If you feel it important to have these added, please explain the need.  Again
note that the pressure has been to find ways to make the document shorter, not
longer.


Figure 5 includes a whole bunch of acronyms that aren't defined until later, such as oMS, hMSA, rMDA, etc. I found I really couldn't understand the figure until several pages later (explanations start in section 4.2.1).

You seem to be saying that a picture of the pieces for a system should not be
presented to the reader until all of the pieces have first been described?

This could be an example of the basic tension between bottom-up versus
top-down pedagogy.

I prefer to introduce the big picture, so that people can see where each piece
fits, as it is discussed.  The danger with bottom-up pedagogy is that nothing
hangs together until the end.  The reader has to juggle all those bits,
without a framework, until the denouement, at the end.


Table 1, SMTP RcptTo, you say it's sent by the Originator. It can also be set by a Mediator.

Well, since a Mediator is a Recipient and an Originator, it can be argued that reference to it should appear anywhere either one of those terms appears. Somehow I do not think that would add clarity to the document.

So I chose to have the Table cite Mediator only for the identity fields that
are inherent to the Mediator role. By contrast, Section 5 is focused on the Mediator role, so its discussion of fields cites "Mediator Originator" explicitly, for RcptTo.


In section 5.1 you say that aliasing SHOULD replace RFC2821.MailFrom. I think a lot of people would disagree with this, at least for some cases. For example, forwarding services such as pobox.com or acm.com or alumni.*.edu simply don't have mailboxes there to be responsible. You're also assuming that aliasing happens in the MDA "just before placing a message into the specified Recipient's mailbox." That's obviously not going to be true for forwarding services.

The two paragraphs that discuss Aliasing attempted to deal with exactly the
kinds of concerns you raise.  It well might be that the language simply does
not do the job intended for it.  Since 'mailbox' is already cited as not
necessarily being storage, the language "placed into" might be ill-advised.

But frankly, I was trying to avoid using the word "deliver" because that is a
formally defined construct, for a transfer of responsibility. Technically the message is already delivered when aliasing is performed. The text notes that it might not seem that way, given how the function is usually implemented, and the text then attempts to defend its view: The message has reached its designated RcptTo address. The new address is provided under the control of that original recipient. So it seemed simpler to avoid using the word "deliver" entirely, for this particular discussion.

Another possibility is that this represents a confusion between implementation and architecture. Your citing forwarding services suggests this rather strongly, since they are a particular operational implementation, rather than a distinctive architecture. That is, forwarding services are set up for the purpose of re-directing a message, to a different mailbox than the one specified by the Originator. This is an implementation choice, but does not -- or at least should not -- alter the architecture-based nature of what tasks are being done and where the responsibilities lie.

A forwarding service sends a message with a brand-new RCPT-TO.  This is
strictly under the control of the Recipient, rather than the Originator, or any of the MHS components. That's a rather basic distinction, in terms of where to assign responsibility. And the forwarding service is provided merely as an agent of the Recipient, since it is the Recipient who specifies what final address to use.

I see it as important -- in fact, inherent in the nature of a legitimate architecture -- that a single Internet Mail architecture cover a variety of significantly different operational implementations, including forwarding. As much as the document tries to keep the distinction between implementation and architecture clear, confusion about it continues to plague discussions. I'm not sure how to deal with that any better; suggestions are more than welcome.

The document's discussion of the RFC2821.MailFrom address, when there is
aliasing, notes the choices between retaining the original one, versus having the Mediator assign a new one, and explains why one is better than the other,
albeit acknowledging that both occur in the wild.
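
The two choices can be sketched in a few lines of Python. This is only an illustrative model, not text from the draft: the Envelope class, the alias function, and all of the addresses are hypothetical names introduced here. It shows that aliasing always produces a new RcptTo chosen by the Recipient, while MailFrom is either retained or replaced by the Mediator.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Envelope:
    """Hypothetical model of the two SMTP envelope addresses."""
    mail_from: str  # RFC2821.MailFrom, the return address
    rcpt_to: str    # RFC2821.RcptTo, the envelope recipient


def alias(envelope: Envelope, new_rcpt: str,
          new_mail_from: Optional[str] = None) -> Envelope:
    """Re-address a message on behalf of the Recipient.

    RcptTo is always replaced. MailFrom is replaced only when the
    Mediator supplies its own return address; otherwise the original
    one is retained.
    """
    return replace(
        envelope,
        rcpt_to=new_rcpt,
        mail_from=new_mail_from or envelope.mail_from,
    )


original = Envelope(mail_from="author@example.org",
                    rcpt_to="alum@forwarder.example")

# Choice 1: retain the original MailFrom.
retained = alias(original, "user@isp.example")

# Choice 2: the Mediator assigns a new MailFrom.
replaced = alias(original, "user@isp.example",
                 new_mail_from="bounces@forwarder.example")

print(retained)
print(replaced)
```

Either way, the new RcptTo comes from the Recipient's own configuration, which is the architectural point; which MailFrom appears is the part that varies in the wild.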

It occurs to me that this is another good example of the potentially
counter-productive effect that would occur for an Internet Mail architecture
document to have too many MUSTs and SHOULDs.  It would be pretending that we
can get reasonably strict conformance, when we have several decades of
experience clearly demonstrating that we can not.


If you're interested I did a graphic version of Figure 5 (I pretty much had to in order to understand it). I'll forward that to you if you're interested.

I had decided to wait on making the pretty graphics until the document was
published, for two reasons.  One is that things kept changing and I'd rather
work only on a single copy. The more important reason is that the text copy of an RFC is definitive, so that the ascii art has to be viable. Having a pretty graphics version will tend to bias people towards consulting it, rather than the ascii art version, and thereby make it likely that we will miss problems with the ascii art.

Hmmm.  Before I began that paragraph, I was going to say "sure, send it over"
but I think I just talked myself into wanting to wait until the document has
passed IETF Last Call...

d/
--

  Dave Crocker
  Brandenburg InternetWorking
  bbiw.net