Tony,
Sorry it's taken so long for me to respond. Many thanks for your thoughtful
comments.
FOLKS --
I'm doing another pass over the document. I'm hoping this is the last
before seeking to publish it, preferably with some IETF status, not just as an
Informational RFC. So, please think about making comments.
An html copy of the latest (unreleased) draft is at:
<http://brandenburg.com/specifications/draft-crocker-mail-arch-01.html>
Roles:
I'm not sure about the "Provider" concept. This is partly because I
don't like the term (it's too generic), partly because I'm not sure
it's useful - it isn't used elsewhere in the document, though the term
"organization" is.
I do not know what to add, about them, in an architecture document. Still,
providers have become a pretty distinct focus of attention for the anti-spam
discussions, so I think they need to be cited in this sort of document.
I'll take them out of the diagram, because it makes things confusing, but will
leave a descriptive paragraph in.
Originator/Recipient: Why not use the term "Author" instead of
"Originator"?
legacy term. orig/recip are from the X.400 days and have a pretty high
installed base of usage. I do not see our community having a dominant practice
of using author -- although it does seem to have a high enough occurrence to
cite the term also -- and this document is not trying to establish new practice.
It should be noted that a message may have multiple
authors and recipients.
yeah.
Relay: Is it worth breaking down this category further, ...
These distinctions are
useful for talking about different kinds of access control
hmm. maybe the way to resolve this and your concern over the reference to
providers is to strengthen the distinction between the functional architecture
(which is what this section is really about) and the 'operational'
architecture. So I'm inclined to enhance that latter section, as you cite,
making the construct of an operational architecture distinct and significant.
Mailbox addresses:
Is it worth adding examples?
always.
Message IDs:
It might be worth explicitly mentioning that messages with the same
message ID can be assumed to be the same, e.g. for the purposes of
reducing the amount of space required by a message store.
and detecting duplicates.
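
To make the duplicate-detection idea concrete, here is a minimal Python sketch. It assumes, as the comment above does, that two messages carrying the same Message-ID can be treated as the same message. The function name `dedupe_by_message_id` is hypothetical and is not taken from the draft.

```python
from email import message_from_string


def dedupe_by_message_id(raw_messages):
    """Keep the first copy of each Message-ID.

    Assumption: an identical Message-ID means an identical message.
    Messages with no Message-ID are always kept, since nothing identifies them.
    """
    seen = set()
    kept = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        mid = msg.get("Message-ID")  # header lookup is case-insensitive
        if mid is not None:
            mid = mid.strip()
            if mid in seen:
                continue  # duplicate: skip storing a second copy
            seen.add(mid)
        kept.append(raw)
    return kept
```

A message store could apply the same check at delivery time instead of in a batch, keeping one stored copy per identifier.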
I'm not sure what the following sentence refers to: "Although Internet
mail standards provide for a single identifier, more than one is
sometimes assigned."
As a message travels along its path, different handlers can add additional
message identifiers.
Identity Reference Convention:
Why MailFrom but Rcpt-To?
"A foolish consistency is the hobgoblin of little minds"
(which is my way of trying to dodge the reality that I'm lousy at being
consistent.)
I'm inclined to prefer no dash, since folks might think the dash is required.
Email System Architecture:
The overview diagram is very unclear,
yeah. i need to rework it a bit.
Should there be a concept of privilege? I.e. MUAs are not privileged,
but everything else is. This is relevant to MSA fix-ups and alias
handling.
I'm going to add another diagram at the beginning, distinguishing users from
the Mail Handling Service (another X.400 term). I think that's the place to
introduce the distinction about privilege.
It is also the place to make clear that some users do various forms of posting
of new messages that derive from mail they have received...
The comment about the BCC field is incorrect according to my
experience: I think it's more usual for the message to be submitted
once, and the MSA deletes the BCC header and transfers its address
list to the RCPT TO list when constructing the envelope.
actual practice varies, but certainly this scenario needs to be cited.
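
The single-submission scenario described above can be sketched roughly as follows. This is only an illustration of one practice among several: the function name `build_envelope` is hypothetical, and the sketch assumes the MSA derives the envelope recipients from the To:, Cc: and Bcc: fields before deleting Bcc: from the message data.

```python
from email import message_from_string
from email.utils import getaddresses


def build_envelope(raw_message):
    """Return (rcpt_to, message_data) for one submitted message.

    Assumption: envelope recipients come from To, Cc and Bcc, and the
    Bcc field is removed so it never appears in the transmitted data.
    """
    msg = message_from_string(raw_message)
    recipient_headers = []
    for name in ("To", "Cc", "Bcc"):
        recipient_headers.extend(msg.get_all(name, []))
    # getaddresses() yields (display name, address) pairs
    rcpt_to = [addr for _name, addr in getaddresses(recipient_headers) if addr]
    del msg["Bcc"]  # removes every Bcc field; no error if none is present
    return rcpt_to, msg.as_string()
```

The alternative practice, in which the MUA submits separate copies for the Bcc recipients, would not need this step at the MSA at all.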
MSA:
It should be noted that the submission protocol does not lie on
the boundary between the MSA and MUA, because when it is being used
the user's software has to do some MSA work like setting up the
message envelope and handling the BCC: header field. An example of
an implementation boundary which lies on the architectural boundary is
the `sendmail -t` API in Unix.
I definitely am used to thinking of protocols as residing at boundaries, pretty
much by definition, and I hope we are VERY careful about maintaining the
distinction between the abstract architecture, versus concrete implementations.
That said, I think there is a philosophical debate one can have about where to
draw lines on top of any functional series of 3 components (with the middle one
being the protocol service).
The MSA section should talk about the header fix-up work it should do,
including adding missing fields (Date:, Message-ID:) and correcting
the Sender: field (in the case that the MSA is more trusted than the
oMUA and the MSA has authenticated the Submitter).
mumble. i think it makes sense to have the formal architecture document say
that the msa must enforce certain requirements. as for any actual changes it
makes, i think of that as a matter of private arrangement between the mua and
the msa, since 'enforcement' can range from creating the fields, to fixing the
fields, to rejecting them if they are not present and correct.
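
As an illustration of that range of enforcement, here is a minimal Python sketch covering two of the options, creating missing fields or rejecting the message. Everything named here is an assumption made for the example: the function `enforce_required_fields`, the `policy` values "add" and "reject", and the domain `msa.example.org`. Correcting an existing but wrong field, such as Sender:, is not shown.

```python
from email import message_from_string
from email.utils import formatdate, make_msgid

# Fields the MSA is assumed to require in this sketch.
REQUIRED = ("Date", "Message-ID")


def enforce_required_fields(raw_message, policy="add", domain="msa.example.org"):
    """Enforce the presence of required fields on a submitted message.

    policy="add":    create any missing field
    policy="reject": raise ValueError if any field is missing
    """
    msg = message_from_string(raw_message)
    missing = [name for name in REQUIRED if msg.get(name) is None]
    if missing and policy == "reject":
        raise ValueError("missing required fields: " + ", ".join(missing))
    if "Date" in missing:
        msg["Date"] = formatdate(localtime=True)
    if "Message-ID" in missing:
        msg["Message-ID"] = make_msgid(domain=domain)
    return msg.as_string()
```

Which policy applies would be the kind of private arrangement between MUA and MSA mentioned above, so the sketch takes it as a parameter and does not fix it.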
Is the outgoing interface of the MSA required to be SMTP or can it be
a local hook into the first MTA? For example, in the case of `sendmail
-t` should the whole of sendmail be considered to be the MSA or is it
a Siamese twin of MSA and MTA (I prefer the latter)?
If this is a pure Internet Mail architecture, then I think it must be submit or
smtp coming into the msa and smtp going out.
that a particular bit of software can operate as either/both an msa or mta is
probably not relevant to the architecture...
This relates to
whether the MSA can be said to "set" a HELO identity (I would say
not). Additional confusion is caused by the submission protocol, in
which a HELO identity is "set" by the MUA part of the MSA.
Whatever entity is doing the client side of smtp (or submit) is the entity that
sets the helo identity.
One of the difficulties in the tables is that the HELO and Received:
point in different directions, by which I mean that the Received:
field refers to the message's previous hop whereas the HELO argument is
related to its next hop.
The "BY" part of Received refers to the current hop, not the preceding one.
For simple relaying the HELO argument refers to the same operational entity,
although of course it might choose to use a different identifier.
I also don't like referring to Received: as an identity because it is
really a composite field containing a number of identities - at least
the previous MTA's HELO, IP address etc. and this MTA's host name etc.
It's too complicated to fit into the model comfortably.
fair point. it suggests distinguishing identities from the fields they occur in.
MDA:
Nothing in this table should be there!
POP and IMAP are not delivery protocols,
This is extremely irritating. I thought I had fixed this section and now I
have no idea what is going on. Sorry for the confusion.
For the original version of the document, I had a dogfight with folks about
this point. There was strong consensus against my view. Hence I meant to
change this to show pop and imap strictly as retrieval, as you suggest.
The only question this leaves is about choices for delivery protocols. I think
the answer is reduced to <local> or SMTP.
Sieve should be mentioned as a means for controlling the MDA.
The diagram shows sieve as feeding into the MDA.
Is it worth distinguishing online mode and local message store access?
although the operational difference is important, how does that affect the
architecture?
Sieve should be mentioned as a means for controlling the movement of
messages in offline mode under control of the rMUA.
please elaborate.
Message Data:
I find it useful to distinguish between the "message envelope" (i.e.
the sequence of SMTP commands before DATA) and the "message data"
(which consists of everything after DATA, i.e. the "message header"
and the "message body"). Before this document I haven't heard of
anyone including trace fields in the envelope.
heh. heh. I think I'm not the only one that has been doing it for many years.
But, yes, I know it is not the view of lots of folks. It's probably worth
having some discussion about this.
The reality of the trace fields is that they are strictly the result of the
handling service. They have nothing to do with the user-to-user message.
That's the basis for drawing the line, in my view. The fact that some handling
information is passed in smtp commands and others are passed in rfc2822 headers
is secondary, in my view.
comments?
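
For discussion purposes, the line being drawn here can be sketched in Python: header fields written by the handling service go on one side and the user-to-user fields on the other, regardless of whether they travel in SMTP commands or in the message header. The sketch assumes the trace fields are Received: and Return-Path:. The function name `split_handling_from_content` is hypothetical.

```python
from email import message_from_string

# Assumption: these are the header fields added by the handling service.
TRACE_FIELDS = {"received", "return-path"}


def split_handling_from_content(raw_message):
    """Split header fields into (handling, content) lists of (name, value)."""
    msg = message_from_string(raw_message)
    handling = [(n, v) for n, v in msg.items() if n.lower() in TRACE_FIELDS]
    content = [(n, v) for n, v in msg.items() if n.lower() not in TRACE_FIELDS]
    return handling, content
```

Under the other view, where the envelope is only the SMTP commands before DATA, the `handling` list would simply stay part of the message data.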
Two Levels of Store-And-Forward:
Care is needed with the term "re-submission". It seems to imply that
RFC 2476 fix-ups occur, which is not the case in alias handling. I
still think alias handling is privileged, and architecturally an MDA
function rather than an MUA function:
my own thinking has been undergoing some changes, given the recent discussions
involving anti-spam work. i'm inclined to agree with you.
what do other folks think?
MTA Relaying:
There should also be a section on MTA firewalling, to distinguish it
from gatewaying.
I'm inclined to agree, but let's explore this a bit. What is the architectural
role of a firewall? I understand its operational import, of course, but how
does this relate to an/the architecture?
MUA Gateways:
Aren't these really at the MTA level? Putting them at the MUA level
implies a "final delivery" event before the message reaches the
gateway.
Gateways are strange beasts, indeed. My experience forces me to view them as
hybrid MTA/MDA/MUA devices. The basic point is that a gateway messes mightily
with the content. So it is not merely a variant of an MTA. On the other hand,
your point about the possibility of deferring the formal delivery event to a
point after the gateway is well taken.
I'm not sure how to model this. The simplest thing I can think of is to define
a gateway as a collection of modules and have notes that observe the
peculiarities.
MUA Mailing Lists:
The actor for the Sender should be the intermediate submitter not the
For reference, I'm going to do away with the concept of "intermediate". The
reason is that it encourages folks to miss that a mailing list is a whole new
ballgame for the message, with a new submitter. Mailing lists are one example
of a class of "user-level multi-hop process" functions that are part of group
activity, including such fun stuff as organizational flow mechanisms like
purchase approval. I think the architectural reference needs to encompass
this range of functions, rather than being tailored to mailing lists. We are
having quite a bit of confusion because of a constrained view of mailing lists
as merely being extensions to the underlying handling service.
Whew. I think that's all.
good job. thanks.
now we get to debate...
d/
--
Dave Crocker
Brandenburg InternetWorking
+1.408.246.8253
dcrocker a t ...
www.brandenburg.com