Re: more comments on draft-crocker-email-arch-00


On Thu, 17 Jun 2004, Dave Crocker wrote:


good comments.


Thanks for reading them. I generally agree with what you are saying, and
I'm really just quibbling about details and presentational matters.

About my layering idea. I described this mainly because it informs my
thinking about Internet email architecture and therefore I think it's
useful. In particular, it captures the layer-e2/layer-e3 (hop-by-hop
versus end-to-end) distinction, which the MARID people fail to grasp. I
agree with you that my description of it contained some irrelevant and
incorrect examples, but these were for rationale purposes so can be
excluded from a more focussed document.

There is certainly a legitimate argument in favor of counting the body
and the RFC2822 headers together, since the meat of a message's
semantics well might be in some of those headers.

However the usual view is that the headers have meta-content, whereas
the body has content.  This distinction is exacerbated -- I use the
word since all of this is consistently confusing -- by the presence
of header information that is really part of the handling envelope
rather than the user payload.  sigh.


I don't think Internet email has clear layering at any level :-) I could
(should?) more explicitly divide my addressing layer into the envelope
addresses and the header addresses (2821 vs. 2822), and my content layer
into human-readable header content and body content (2047 vs. 2045).

TF> An odd result of the way I have divided the layers is that part of an
TF> addressing field is in the content layer (the display-name which may be
TF> MIME-encoded) and part is in the address layer (er, the address itself).

Not really.  The transport-level use does not have a display-name
field.  The RFC2822-level is really the end-user content information
(both display and address), rather than being used by transport.


I think I mis-explained this and we actually agree :-) My layering model
assigns the parts of the To/From etc. header fields that are covered by
RFC 2047 to the content layer, and the parts that RFC 2822 calls the
addr-spec to the address layer.
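
To make that split concrete, here is a rough sketch using Python's
standard email package. The function name and the sample mailbox are
made up for illustration; the only claim is that the display-name goes
through RFC 2047 decoding while the addr-spec is left alone.

```python
from email.header import decode_header, make_header
from email.utils import parseaddr


def split_address_field(field_value):
    """Split one mailbox into (display name, addr-spec).

    The display name is content-layer data and may contain RFC 2047
    encoded-words; the addr-spec is address-layer data and is returned
    untouched.
    """
    raw_name, addr_spec = parseaddr(field_value)
    # parseaddr does not decode encoded-words, so do that separately.
    display_name = str(make_header(decode_header(raw_name))) if raw_name else ""
    return display_name, addr_spec


# Hypothetical header value, not taken from any real message.
name, addr = split_address_field(
    "=?utf-8?q?J=C3=BCrgen_Example?= <jurgen@example.org>"
)
```

Only the first element of the result is meant for a human reader; the
second is what an address-layer component would act on.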

TF> A tempting comparison is between email address aliasing (as in the Sieve
TF> redirect action) and NAT.

Not really.  NAT is about content-rewriting.  Aliasing is about simple
message forwarding, without altering the content of the message.


Yes, this is why I said the analogy is bad. All analogies are bad, but
when used carefully they can be helpful.

TF>  Many people in the MARID camp claim that email
TF> aliasing is evil bad and wrong because it breaks the assumption that the
TF> SMTP originator information must correspond to the email address
TF> originator information.

I am not aware of that assumption ever being present in Internet mail.


Exactly. That is the MARID fallacy. (BTW I meant TCP client address and
HELO name when I said "SMTP originator information" -- I should have been
more clear.)

TF> Identities.

Having a discussion of identities, as separate from protocol layers,
was intentional. An address is the same, no matter the level in which
it is used.


Yes, but hostnames are not email addresses are not message IDs. I also
like being able to say to MARID people that assuming that a "from" email
address (whether 2821 or 2822) has some association with the SMTP client's
details is a layering violation.

TF> 3.1.1 / 3.1.2. Message submission.
TF> The draft states that the Sender: is set by the MUA.
TF> It is often overridden
TF> by the MSA to refer to the authenticated address of the sender, as a
TF> protection against spoofing.

This is a point worthy of some debate, I suspect.  Certainly an entity
later in a sequence can always override the work of an entity earlier
in the sequence.  Whether that is its architectural job is another
matter.


I assert that in the case of message submission this kind of fix-up is an
important part of the architecture, because it is one of the main
distinctions between the message submission protocol and SMTP. This is
what sections 4, 5, 6, and 8 of RFC 2476 are all about. Implementations
have not been very good at this distinction in the past but they are
getting better.
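
As an illustration of the kind of fix-up I mean, here is a minimal
sketch of one possible MSA policy, written with Python's standard email
package. The function name and the exact rule (drop the MUA's Sender:,
then add one only when the authenticated address differs from From:)
are my assumptions for the example, not something any specification or
implementation prescribes in this form.

```python
from email.message import EmailMessage
from email.utils import parseaddr


def msa_fix_sender(msg, authenticated_addr):
    """Hypothetical MSA fix-up: make Sender: reflect the authenticated user."""
    _, from_addr = parseaddr(str(msg.get("From", "")))
    # Discard whatever the MUA supplied; deleting a missing field is a no-op.
    del msg["Sender"]
    # Only add Sender: when From: does not already name the submitter.
    if from_addr.lower() != authenticated_addr.lower():
        msg["Sender"] = authenticated_addr
    return msg


# Example submission with a Sender: the MSA has no reason to trust.
submitted = EmailMessage()
submitted["From"] = "Boss <boss@example.org>"
submitted["Sender"] = "forged@example.net"
msa_fix_sender(submitted, "alice@example.org")
```

A relaying MTA, by contrast, would pass the header through unchanged,
which is the distinction I am arguing belongs in the architecture.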

TF> 4.1. Envelope

TF> The description in the draft is rather different to the common meaning of
TF> "envelope". The word is usually used to mean the message transmission
TF> information that comes before DATA in the SMTP protocol.

I was rather pointedly trying to have the construct mean "handling
information" as it does more universally, rather than tying it to
SMTP. As you note, the Received headers are an example. They are
strictly part of the MTA envelope (handling trace) world, even though
they are placed in an RFC2822 header.


I see from RFC 724 that your meaning of "envelope" goes back quite a long
way :-) However common usage when talking about Internet email is more
specific. If you want to avoid being tied to SMTP you can say that the
envelope is information used for handling the message which is not
contained in the message header or body.

TF>  Some of it may
TF> appear in the header (e.g. in the Received: trace if there's only one
TF> recipient, or in Return-Path: after final delivery), but that's an
TF> after-the-fact record of what was going on.

Received headers never appear in the SMTP protocol.


Yes (however they are defined in RFC 2821!) but I meant that they
contain a record of envelope information, e.g. my email address in:

Received: from joy.songbird.com ([208.184.79.7]:48673)
        by ppsw-0.csi.cam.ac.uk (mx.cam.ac.uk [131.111.8.140]:25)
        with esmtp (Exim 4.34) id 1BajjX-00054M-G8
        for dot(_at_)dotat(_dot_)at; Thu, 17 Jun 2004 00:22:19 +0100
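
A small sketch of how that after-the-fact record could be read back
out, assuming the trace field carries a "for" clause of the shape shown
above. The regular expression and function name are illustrative only,
and the pattern is deliberately loose rather than a full parser for the
Received: syntax.

```python
import re

# Loose pattern for the optional "for <addr>;" clause of a Received: field.
FOR_CLAUSE = re.compile(r"\bfor\s+<?([^\s;<>]+)>?\s*;")


def envelope_recipient_from_received(received_value):
    """Return the recipient recorded in the "for" clause, or None if absent."""
    match = FOR_CLAUSE.search(received_value)
    return match.group(1) if match else None


# Same shape as the trace above, with a placeholder recipient address.
trace = (
    "from joy.songbird.com ([208.184.79.7]:48673)\n"
    "        by ppsw-0.csi.cam.ac.uk (mx.cam.ac.uk [131.111.8.140]:25)\n"
    "        with esmtp (Exim 4.34) id 1BajjX-00054M-G8\n"
    "        for user@example.org; Thu, 17 Jun 2004 00:22:19 +0100"
)
recipient = envelope_recipient_from_received(trace)
```

The clause is optional, so a None result says nothing about the
envelope; it only means the MTA chose not to record the recipient.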

TF> 5. Two levels of store-and-forward

The purpose of this section is to highlight the confusion that has
persisted and define a resolution to it. Over the years, I've tended
to be pretty loose about considering mailing lists to be part of the
transfer system, versus being a user-level process. Discussions this
year have eliminated my own sense of ambiguity about this. In
particular, discussions trying to resolve who the "responsible"
posting Sender is have made the architectural issue crystal clear to
me.

As for the number of levels, two is a good place to start. We can
always start a recursive sequence with the upper layer, should it
prove necessary. At the moment, I'm far more concerned about getting
us all to clearly distinguish between basic transfer to a listed
recipient, versus any higher-level process that might choose to
forward it on, on its own authority.


I generally agree. My main complaint is that the document assigns the
functions to either the MUA or the MTA, despite the fact that they are
frequently implemented by software that is acting in a different
architectural role. Yes I have read your note at the start of section 3
:-) but I think that if the architecture is going to be useful it should
be possible to relate it to an implementation without too many
contortions.

TF> This approach also escapes from the false MTA/MUA dichotomy.

and that false dichotomy would be what, exactly?


The division of activities between MTAs in section 5.1 and MUAs in section
5.2.

I think a more useful division is whether or not end-to-end responsibility
for the message has been assumed, which is indicated by changing the
reverse path, i.e. taking over interest in what errors may occur. (I'm
being explicit about end-to-end responsibility versus hop-by-hop
responsibility, because RFC 2821 talks a lot about responsibility of the
latter kind, whereas careful discussion of the former kind of
responsibility is relatively rare.) This distinction corresponds more
closely to my understanding of whether the action is implemented by an
MUA-like entity or an MTA-like entity. E.g. in the case of aliasing,
end-to-end responsibility is not taken, so my distinction puts it on the
MTA side of the line. If I were to assign aliasing to an architectural
component I'd give it to the MDA (cf. section 5.2.5), and in practice the
MDA is often a function of the MTA.
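
The test I am proposing can be sketched in a few lines. Everything here
(the Envelope type, the function names, the list re-sender used as the
contrasting case) is a hypothetical model for illustration, not a
description of any real MTA.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Envelope:
    reverse_path: str   # where delivery errors are sent
    recipients: tuple   # forward paths


def alias_redirect(env, alias_map):
    """MTA/MDA-like action: rewrite recipients, leave the reverse path alone."""
    new_recipients = tuple(alias_map.get(r, r) for r in env.recipients)
    return replace(env, recipients=new_recipients)


def list_resend(env, list_bounce_addr, subscribers):
    """MUA-like action: re-post under a new reverse path owned by the list."""
    return Envelope(reverse_path=list_bounce_addr, recipients=tuple(subscribers))


def took_end_to_end_responsibility(before, after):
    """The proposed test: did the entity take over interest in errors?"""
    return before.reverse_path != after.reverse_path


original = Envelope("author@example.org", ("alias@example.net",))
aliased = alias_redirect(original, {"alias@example.net": "real@example.com"})
relisted = list_resend(original, "list-bounces@example.net", ["sub@example.com"])
```

Under this model the aliased envelope still reports errors to the
original author, while the re-posted one reports them to the list.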

TF> I think there are roughly two kinds of gateway, which should be kept
TF> more distinct in this document.
TF> Security gateways that do content filtering, but otherwise act like MTAs.

Those are called firewalls.  Real gateways are about translation.


OK. They're becoming so common that I think they are worth a mention. They
seemed to fit into the Gateway category because of the sentence "When it
connects environments that have technical similarity, but may have
significant administrative differences, it is easy to think that a gateway
is merely an MTA."

Either way I think both are variations on the theme of MTA not MUA.

TF> 5.2.3 Replying
TF> This section needs to mention the Re: convention for the Subject: and the
TF> propagation of the original Message-ID: into the new References: and
TF> In-Reply-To: fields.

By and large, I was not trying to document the myriad of common
practices outside the standards.  Instead the idea was to use the
stuff that has been formally documented as the basis for developing
architectural constructs.


But the handling of message IDs *is* part of the standards! Using them to
describe the threading of correspondence is an important idea in email.

At first I thought that Replying had no place in your document, and
especially not in a section that purports to talk about forwarding.
However when I realised that you had missed out message IDs from your list
of identities (when I wondered what identities belonged in layer e4), and
when I re-ordered your list of actions on a message, it made more sense.
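
For concreteness, a minimal sketch of the reply conventions I have in
mind, using Python's standard email package. The helper name is made
up, and the logic is a simplification: it copies the parent's
References: and appends the parent's Message-ID:, without the fallback
cases a careful implementation would handle.

```python
from email.message import EmailMessage


def build_reply_headers(original):
    """Build the threading-related header fields of a reply to `original`."""
    reply = EmailMessage()

    # Subject: prefix "Re: " once, not once per reply.
    subject = str(original.get("Subject", ""))
    if not subject.lower().startswith("re:"):
        subject = "Re: " + subject
    reply["Subject"] = subject

    # Threading: the parent's ID goes into In-Reply-To: and is appended
    # to the chain of IDs carried in References:.
    parent_id = original.get("Message-ID")
    if parent_id is not None:
        parent_id = str(parent_id).strip()
        references = str(original.get("References", "")).split()
        reply["In-Reply-To"] = parent_id
        reply["References"] = " ".join(references + [parent_id])
    return reply


# Placeholder message IDs for illustration only.
parent = EmailMessage()
parent["Subject"] = "draft comments"
parent["Message-ID"] = "<b@example.org>"
parent["References"] = "<a@example.org>"
reply = build_reply_headers(parent)
```

The point is that the message ID is an identity in its own right: it is
what lets the reply be related to its parent at all.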

TF> 2.2 Domain Names
TF> "sub-names" should be "labels" perhaps?

I don't understand.


The DNS RFCs refer to the components of domain names as labels. In the
email specifications they are syntactically atoms. I prefer to use
existing terminology where possible.
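
To show what I mean by labels, a tiny sketch that splits a domain name
into them. The function name is hypothetical, the input is assumed to
be plain ASCII, and the only rule checked is the DNS limit of 63 octets
per label.

```python
def domain_labels(domain):
    """Split a domain name into its DNS labels (ASCII input assumed)."""
    labels = domain.rstrip(".").split(".")
    for label in labels:
        # DNS labels are between 1 and 63 octets long.
        if not 1 <= len(label) <= 63:
            raise ValueError("bad label length: %r" % label)
    return labels


labels = domain_labels("ppsw-0.csi.cam.ac.uk")
```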

-- 
Tony Finch  <dot(_at_)dotat(_dot_)at>  http://dotat.at/