Re: [long] comments on draft-crocker-email-arch-01


Tony,

Sorry it's taken so long for me to respond.  Many thanks for your thoughtful 
comments.  

FOLKS -- 

        I'm doing another pass over the document.  I'm hoping this is the last 
before seeking to publish it, preferably with some IETF status, not just as an 
Informational RFC.  So, please think about making comments.

        An html copy of the latest (unreleased) draft is at:  

<http://brandenburg.com/specifications/draft-crocker-mail-arch-01.html>

  Roles:
  I'm not sure about the "Provider" concept. This is partly because I
  don't like the term (it's too generic), partly because I'm not sure
  it's useful - it isn't used elsewhere in the document, though the term
  "organization" is.


I do not know what to add, about them, in an architecture document.  Still, 
providers have become a pretty distinct focus of attention for the anti-spam 
discussions, so I think they need to be cited in this sort of document.  

I'll take them out of the diagram, because it makes things confusing, but will 
leave a descriptive paragraph in.

  Originator/Recipient: Why not use the term "Author" instead of
  "Originator"?


legacy term.  orig/recip are from the X.400 days and have a pretty high 
installed base of usage.  I do not see our community having a dominant practise 
of using author -- although it does seem to have a high enough occurrence to 
cite the term also -- and this document is not trying to establish new practise.

  It should be noted that a message may have multiple
  authors and recipients.


yeah.

  Relay: Is it worth breaking down this category further, ...
  These distinctions are
  useful for talking about different kinds of access control



hmm.  maybe the way to resolve this and your concern over the reference to 
providers is to strengthen the distinction between the functional architecture 
(which is what this section is really about) and the 'operational' 
architecture.  So I'm inclined to enhance that latter section, as you cite, 
making the construct of an operational architecture distinct and significant.

  Mailbox addresses:

  Is it worth adding examples?


always.

  Message IDs:

  It might be worth explicitly mentioning that messages with the same
  message ID can be assumed to be the same, e.g. for the purposes of
  reducing the amount of space required by a message store.


and detecting duplicates.
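
A minimal sketch of the duplicate-detection idea, in Python.  It assumes 
messages arrive as raw RFC 2822 text and that an identical Message-ID really 
does mean an identical message.  The function name and the sample messages are 
made up for illustration.

```python
from email import message_from_string

def dedupe_by_message_id(raw_messages):
    """Keep the first copy of each Message-ID; messages with no ID are always kept."""
    seen = set()
    kept = []
    for raw in raw_messages:
        msg_id = message_from_string(raw).get("Message-ID")
        if msg_id is not None:
            msg_id = msg_id.strip()
            if msg_id in seen:
                continue  # assumed duplicate: nothing new to store
            seen.add(msg_id)
        kept.append(raw)
    return kept

# Hypothetical sample data: the first two messages share a Message-ID.
samples = [
    "Message-ID: <1@example.com>\nSubject: a\n\nbody\n",
    "Message-ID: <1@example.com>\nSubject: a\n\nbody\n",
    "Message-ID: <2@example.com>\nSubject: b\n\nbody\n",
]
unique = dedupe_by_message_id(samples)
```

A store that trusts the identifier this way saves space, but it is only as 
reliable as the assumption that the identifier is unique.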

  I'm not sure what the following sentence refers to: "Although Internet
  mail standards provide for a single identifier, more than one is
  sometimes assigned."


As a message travels along its path, different handlers can add further 
message identifiers.

  Identity Reference Convention:

  Why MailFrom but Rcpt-To?


"A foolish consistency is the hobgoblin of little minds"

(which is my way of trying to dodge the reality that I'm lousy at being 
consistent.)

I'm inclined to prefer no dash, since folks might think the dash is required.

  Email System Architecture:

  The overview diagram is very unclear,


yeah.  i need to rework it a bit.

  Should there be a concept of privilege? I.e. MUAs are not privileged,
  but everything else is. This is relevant to MSA fix-ups and alias
  handling.


I'm going to add another diagram at the beginning, distinguishing users from 
the Mail Handling Service (another x.400 term).  I think that's the place to 
introduce the distinction about privilege.

It is also the place to make clear that some users do various forms of posting 
of new messages that derive from mail they have received...

  The comment about the BCC field is incorrect according to my
  experience: I think it's more usual for the message to be submitted
  once, and the MSA deletes the BCC header and transfers its address
  list to the RCPT TO list when constructing the envelope.


actual practise varies, but certainly this scenario needs to be cited.
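
For reference, the scenario you describe can be sketched roughly as follows, 
in Python.  This is only one of the practices in use, not a statement of what 
an MSA must do.  The function name and sample message are hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses

def build_envelope_recipients(raw):
    """Collect To/Cc/Bcc addresses for RCPT TO, then drop Bcc from the message."""
    msg = message_from_string(raw)
    rcpt_to = []
    for field in ("To", "Cc", "Bcc"):
        values = msg.get_all(field, [])
        rcpt_to.extend(addr for _name, addr in getaddresses(values) if addr)
    del msg["Bcc"]  # removes every Bcc field; a no-op if none is present
    return rcpt_to, msg.as_string()

# Hypothetical submission with two blind-copy recipients.
raw = (
    "From: a@example.com\n"
    "To: b@example.com\n"
    "Bcc: c@example.com, d@example.com\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)
rcpts, data = build_envelope_recipients(raw)
```

The blind-copy addresses survive only in the envelope list; the message data 
that goes out no longer carries them.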

  MSA:

  It should be noted that the submission protocol does not lie on
  the boundary between the MSA and MUA, because when it is being used
  the user's software has to do some MSA work like setting up the
  message envelope and handling the BCC: header field. An example of
  an implementation boundary which lies on the architectural boundary is
  the `sendmail -t` API in Unix.


I definitely am used to thinking of protocols as residing at boundaries, pretty 
much by definition, and I hope we are VERY careful about maintaining the 
distinction between the abstract architecture and concrete implementations.

That said, I think there is a philosophical debate one can have about where to 
draw lines on top of any functional series of 3 components (with the middle one 
being the protocol service).

  The MSA section should talk about the header fix-up work it should do,
  including adding missing fields (Date:, Message-ID:) and correcting
  the Sender: field (in the case that the MSA is more trusted than the
  oMUA and the MSA has authenticated the Submitter).


mumble.  i think it makes sense to have the formal architecture document say 
that the msa must enforce certain requirements.  as for any actual changes it 
makes, i think of that as a matter of private arrangement between the mua and 
the msa, since 'enforcement' can range from creating the fields, to fixing the 
fields, to rejecting them if they are not present and correct.
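
To illustrate that range, here is a rough Python sketch with two of the 
possible enforcement policies: creating missing fields, or rejecting the 
submission.  It does not attempt the "fixing" case.  The policy names, the 
function name and the example.com domain are all assumptions for illustration.

```python
from email import message_from_string
from email.utils import formatdate, make_msgid

REQUIRED = ("Date", "Message-ID")

def enforce_required_fields(raw, policy="create"):
    """Apply a (hypothetical) MSA policy to a submitted message.

    policy="create": add any missing required field.
    policy="reject": refuse the message if a required field is missing.
    """
    msg = message_from_string(raw)
    missing = [name for name in REQUIRED if msg.get(name) is None]
    if missing and policy == "reject":
        raise ValueError("missing required fields: " + ", ".join(missing))
    if "Date" in missing:
        msg["Date"] = formatdate(localtime=True)
    if "Message-ID" in missing:
        # example.com is a placeholder domain, not a recommendation
        msg["Message-ID"] = make_msgid(domain="example.com")
    return msg

fixed = enforce_required_fields("From: a@example.com\n\nbody\n")
```

Which policy applies is exactly the private arrangement between the mua and 
the msa; the architecture only needs to say that the requirement is enforced.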

  Is the outgoing interface of the MSA required to be SMTP or can it be
  a local hook into the first MTA? For example, in the case of `sendmail
  -t` should the whole of sendmail be considered to be the MSA or is it
  a Siamese twin of MSA and MTA (I prefer the latter)?


If this is a pure Internet Mail architecture, then I think it must be submit or 
smtp coming into the msa and smtp going out.

that a particular bit of software can operate as either/both an msa or mta is 
probably not relevant to the architecture...

  This relates to
  whether the MSA can be said to "set" a HELO identity (I would say
  not). Additional confusion is caused by the submission protocol, in
  which a HELO identity is "set" by the MUA part of the MSA.


Whatever entity is doing the client side of smtp (or submit) is the entity that 
sets the helo identity.

  One of the difficulties in the tables is that the HELO and Received:
  point in different directions, by which I mean that the Received:
  field refers to the message's previous hop whereas the HELO argument is
  related to its next hop.


The "BY" part of Received refers to the current hop, not the preceding one.  
For simple relaying the HELO argument refers to the same operational entity, 
although of course it might choose to use a different identifier.

  I also don't like referring to Received: as an identity because it is
  really a composite field containing a number of identities - at least
  the previous MTA's HELO, IP address etc. and this MTA's host name etc.
  It's too complicated to fit into the model comfortably.


fair point.  it suggests distinguishing identities from the fields they occur in.

  MDA:

  Nothing in this table should be there!

  POP and IMAP are not delivery protocols,


This is extremely irritating.  I thought I had fixed this section and now I 
have no idea what is going on.  Sorry for the confusion.

For the original version of the document, I had a dogfight with folks about 
this point.  There was strong consensus against my view.  Hence I meant to 
change this to show pop and imap strictly as retrieval, as you suggest.

The only question this leaves is about choices for delivery protocols.  I think 
the answer is reduced to <local> or SMTP.

  Sieve should be mentioned as a means for controlling the MDA.


The diagram shows sieve as feeding into the MDA.

  Is it worth distinguishing online mode and local message store access?


although the operational difference is important, how does that affect the 
architecture?

  Sieve should be mentioned as a means for controlling the movement of
  messages in offline mode under control of the rMUA.


please elaborate.

  Message Data:

  I find it useful to distinguish between the "message envelope" (i.e.
  the sequence of SMTP commands before DATA) and the "message data"
  (which consists of everything after DATA, i.e. the "message header"
  and the "message body"). Before this document I haven't heard of
  anyone including trace fields in the envelope.


heh. heh.  I think I'm not the only one that has been doing it for many years.

But, yes, I know it is not the view of lots of folks.  It's probably worth 
having some discussion about this.

The reality of the trace fields is that they are strictly the result of the 
handling service.  They have nothing to do with the user-to-user message.  
That's the basis for drawing the line, in my view.  The fact that some handling 
information is passed in smtp commands and others are passed in rfc2822 headers 
is secondary, in my view.
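
A rough Python sketch of where that line falls.  It assumes the trace fields 
are just Received and Return-Path (RFC 2822 section 3.6.7); the function name 
and sample message are made up.

```python
from email import message_from_string

TRACE_FIELDS = {"received", "return-path"}  # RFC 2822 section 3.6.7

def split_header(raw):
    """Separate handling-service fields from the user-to-user header fields."""
    msg = message_from_string(raw)
    handling, user = [], []
    for name, value in msg.items():
        target = handling if name.lower() in TRACE_FIELDS else user
        target.append((name, value))
    return handling, user

# Hypothetical message as it might look after one relay hop.
raw = (
    "Return-Path: <a@example.com>\n"
    "Received: from x.example.org by y.example.net; "
    "Thu, 1 Jan 2004 00:00:00 +0000\n"
    "From: a@example.com\n"
    "To: b@example.com\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)
handling, user = split_header(raw)
```

In this view the first list belongs with the envelope, even though it travels 
after DATA, and the second list is the message the author actually wrote.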

comments?

  Two Levels of Store-And-Forward:

  Care is needed with the term "re-submission". It seems to imply that
  RFC 2476 fix-ups occur, which is not the case in alias handling. I
  still think alias handling is privileged, and architecturally an MDA
  function rather than an MUA function:


my own thinking has been undergoing some changes, given the recent discussions 
involving anti-spam work.  i'm inclined to agree with you.

what do other folks think?

  MTA Relaying:

  There should also be a section on MTA firewalling, to distinguish it
  from gatewaying.


I'm inclined to agree, but let's explore this a bit.  What is the architectural 
role of a firewall?  I understand its operational import, of course, but how 
does this relate to an/the architecture?

  MUA Gateways:

  Aren't these really at the MTA level? Putting them at the MUA level
  implies a "final delivery" event before the message reaches the
  gateway.


Gateways are strange beasts, indeed.  My experience forces me to view them as 
hybrid MTA/MDA/MUA devices.  The basic point is that a gateway messes mightily 
with the content.  So it is not merely a variant of an MTA.  On the other hand, 
your point about the possibility of deferring the formal delivery event to a 
point after the gateway is well taken.

I'm not sure how to model this.  The simplest thing I can think of is to define 
a gateway as a collection of modules and have notes that observe the 
peculiarities.

  MUA Mailing Lists:

  The actor for the Sender should be the intermediate submitter not the


For reference, I'm going to do away with the concept of "intermediate".  The 
reason is that it encourages folks to miss that a mailing list is a whole new 
ballgame for the message, with a new submitter.  Mailing lists are one example 
of a class of "user-level multi-hop process" functions that are part of group 
activity, including such fun stuff as organizational flow mechanisms like 
purchase approval.  I think the architectural reference needs to encompass 
this range of functions, rather than being tailored to mailing lists.  We are 
having quite a bit of confusion because of a constrained view of mailing lists 
as merely being extensions to the underlying handling service.

  Whew. I think that's all.


good job.  thanks.  

now we get to debate...


d/
--
Dave Crocker
Brandenburg InternetWorking
+1.408.246.8253
dcrocker  a t ...
www.brandenburg.com