Re: The need for two headers


On Sat, Feb 07, 2004 at 09:48:31AM -0800, Paul Hoffman / IMC wrote:


> You proposed splitting the RFC822 headers in two. We are not dealing
> with RFC822 headers. We are dealing with a completely new set of
> protocols.
>
> If you feel that there are requirements for what are headers in
> RFC822, state the requirement.



So I obviously did not express my thoughts very well. 

I did not want to split the RFC822 header. I don't want to
keep 821/822, except for learning from its flaws.
I already wrote a draft with a protocol quite different from 
the old protocols, so I'm certainly not sticking to them.



OK, I'll try to explain my intention in a different way:

As with any transport mechanism, there will be a payload
(some kind of message) and some control information
needed for the transport. Agreed so far?


Since we're talking about mail, we will still need some of the
information found on old snail-mail paper letters, such as Subject,
Date, Sender, Cc, or anything similar.  I guess any future mail
system will have to support a "mail header" with entries like
Date, Sender, Cc and so on. "Header" doesn't mean that
I'm talking specifically about RFC821/22 headers. "Header" means
what people used to write at the head of a sheet of paper in
the old days before e-mail was invented.

Now you could complain that this is a matter of the content, and
you want to focus on the transport mechanism only. But that's
exactly what I'm talking about.

The old 821/22 e-mail system interfered with the mail header
and even modified the body. Received lines were added
to the header, From and To lines were rewritten, Date fields
were translated to different time zones, and more of that.
Some MTAs even change the character encoding of the body and the
subject line. English-speaking people are usually not aware of
this, but as a German I have had lots of trouble with all those
translations of character sets in the header or body.

That's bad. 

I plead for keeping the header that belongs to the message (what is
now 822) strictly separate from the data inserted or changed
by the transport mechanism. There should be an inner header
belonging to the message. The mail transport mechanism should
keep its fingers off it and leave it completely untouched, for
several good reasons.

But a new transport mechanism will certainly still have to do 
some address rewriting and inserting whatever will be the
next generation of Received lines. Anybody want to deny this?

What's the consequence?

There need to be two distinct headers: one belonging to the
message (which we don't really need to focus on), and another
one belonging to the transport (addresses, history, ...).
Maybe it would be wise to drop the word "header" here, but
in contrast to today's mail system, we need two completely
distinct pockets of information. By analogy with paper
mail we could call the transport header the "envelope". Snail
mail prints information on the envelope, but never modifies
the header on the letter inside.

What exactly the inner header and the envelope look like is not
important at the moment. Whether to use the 822 header format,
ASN.1, or XML is not to be discussed yet.

That's why a message _must_ consist of at least three parts:

- a transport header 
- a message header
- a message body (or some container like mime)
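
To make the three parts concrete, here is a minimal sketch in Python.
It is only an illustration: the names (InnerMessage, Envelope, Parcel,
add_hop) and the field layout are my assumptions, not taken from any
draft. The only point is that relays write to the envelope and that
the inner part cannot be modified.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class InnerMessage:
    """Message header plus body; the transport must never modify this."""
    header: Tuple[Tuple[str, str], ...]   # e.g. (("Subject", "..."), ...)
    body: bytes


@dataclass
class Envelope:
    """Transport header: addresses and hop history, owned by the relays."""
    sender: str
    recipients: List[str]
    trace: List[str] = field(default_factory=list)


@dataclass
class Parcel:
    """What actually travels: a mutable envelope around a fixed payload."""
    envelope: Envelope
    message: InnerMessage


def add_hop(parcel: Parcel, relay_name: str) -> None:
    """A relay records itself on the envelope only (hypothetical helper)."""
    parcel.envelope.trace.append(relay_name)


msg = InnerMessage(header=(("Subject", "Grüße"), ("Date", "2004-02-07")),
                   body="Hallo".encode("utf-8"))
parcel = Parcel(Envelope("a@example.org", ["b@example.net"]), msg)
add_hop(parcel, "relay1.example.org")
```

Because InnerMessage is declared frozen, an attempt to assign to
parcel.message.body raises an error, while the envelope stays free to
accumulate history.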


If we do so, we get an advantage for free:

Until now I have had severe trouble precisely identifying messages
in mailboxes, since the MessageID today is not necessarily unique.
Calculating a message digest is difficult, since mails look different
if they have taken different routes. I have actually seen distinct
messages with the same MessageID, and copies of the same
message with different hash values. That's a flaw to be solved in
mail-ng.

But if we agree to have an inner payload (message header, body, ...)
which must not be changed by the transport mechanism, then
we can reliably identify a message by its hash value. If two
MTAs have the same message, they will calculate the same hash.
And with a collision-resistant hash function, two different messages
will for all practical purposes never have the same hash. This is
required to protect against certain kinds of bugs and attacks
and to make transport reliable.
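
A rough sketch of such an identifier, again in Python. The choice of
SHA-256, the length-prefixed encoding, and the function name
message_id are all assumptions made for the example; a real protocol
would have to fix a canonical encoding of the inner header. The hash
covers only the inner header and body, so whatever the relays write
on the envelope cannot affect it.

```python
import hashlib
from typing import Iterable, Tuple


def message_id(header: Iterable[Tuple[str, str]], body: bytes) -> str:
    """Hash the inner header and body; envelope data is never included."""
    fields = list(header)
    h = hashlib.sha256()
    # Length prefixes keep field boundaries unambiguous (assumed encoding).
    h.update(len(fields).to_bytes(4, "big"))
    for name, value in fields:
        for part in (name.encode("utf-8"), value.encode("utf-8")):
            h.update(len(part).to_bytes(4, "big"))
            h.update(part)
    h.update(len(body).to_bytes(8, "big"))
    h.update(body)
    return h.hexdigest()


header = [("Subject", "Test"), ("Date", "2004-02-07")]

# Two copies of one message that travelled by different routes.
copy_a = {"trace": ["relay-a"], "header": header, "body": b"Hello"}
copy_b = {"trace": ["relay-b", "relay-c"], "header": header, "body": b"Hello"}

id_a = message_id(copy_a["header"], copy_a["body"])
id_b = message_id(copy_b["header"], copy_b["body"])
id_other = message_id(header, b"Hello!")
```

Here id_a and id_b are equal although the traces differ, and id_other
differs because the body differs.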

Once you have such a unique MessageID, your transport mechanism
becomes much more powerful. You can have redundant mail relays
which automatically reject messages they have already received from
somewhere else. Synchronizing mailboxes on several computers becomes
very easy. It's like mixing in NNTP's IHAVE mechanism.
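
The relay behaviour could look roughly like this toy model. The class
Relay and its methods wants() and accept() are hypothetical names for
this example, and the MessageID is assumed to be the SHA-256 digest
of the untouched inner payload. It only imitates the idea behind
IHAVE (offer an ID first, transfer only if it is unknown), not the
NNTP wire protocol.

```python
import hashlib
from typing import Dict


class Relay:
    """Toy relay that stores each inner payload at most once."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}

    def wants(self, message_id: str) -> bool:
        """Answer an offer: True if this ID has not been seen yet."""
        return message_id not in self.store

    def accept(self, message_id: str, payload: bytes) -> bool:
        """Store the payload unless it is a duplicate or the ID is wrong."""
        if hashlib.sha256(payload).hexdigest() != message_id:
            return False          # claimed ID does not match the content
        if message_id in self.store:
            return False          # already received via another route
        self.store[message_id] = payload
        return True


payload = b"Subject: Test\n\nHello"
mid = hashlib.sha256(payload).hexdigest()

relay = Relay("mx2.example.org")
wanted_first = relay.wants(mid)           # offered by one upstream relay
stored_first = relay.accept(mid, payload)
wanted_again = relay.wants(mid)           # offered again by another relay
stored_again = relay.accept(mid, payload)
```

The second offer is refused without transferring the payload again.
Synchronizing two mailboxes then reduces to comparing their sets of
IDs and copying what is missing.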

Once you have such a mechanism, you can build up a completely
new mail distribution scheme with redundancy and other such
goodies. Or you could prove that a particular message has or has
not gone through a relay by just looking in the log files.

A message now becomes an object reliably and unambiguously identified
by its MessageID, which becomes a handle. This requires a message
header which will never be changed, no matter whether you forward a
message, post it to a mailing list or to a newsgroup (or to their
analogues in a future protocol).


(Another wish is selective transport, synchronization, and
automatic folder sorting, which require the message tag I proposed
earlier, but I'm not yet sure which header it belongs in.)



regards
Hadmut