The concepts of email address and mailbox


Hi!

Reading Dave Crocker's draft-crocker-email-arch-01, I started to think again
about the identification of, and the differences between, "email addresses"
and "mailboxes". In the draft, as in many other documents, both concepts are
essentially one: the email address is a "mailbox address".

I recognize the historical truth in this, and current email systems
have many vestiges of that school of thought, but I nevertheless think it
is conceptually misleading. I am not sure where this discussion will lead
and whether this is important for the draft on email architecture, but
I'll send it to the list anyway and we'll see where it goes.

Historically, email was sent to a mailbox, either on the current machine
(so you only needed the local_part) or on another host (then you had the
local_part@host address). There was a one-to-one relationship between
addresses and mailboxes. Mailboxes were usually associated with
operating system users, so the local_part was really the username. So
far so good.

But already with the introduction of forwarding this changed a little bit.
If you have a .forward, mail for a user can be redirected somewhere else.
The distinct identity of the user is still there. His mailbox will always
be empty, but he has the .forward file. But once you move the forwarding
to the /etc/aliases file, you can create email addresses that belong to
no real mailbox and to no real entity at all.

So instead of thinking of email addresses and mailboxes as the same thing,
we should think of them as different entities. An "email address" is
something I can send an email to, no more, no less. A "mailbox" is
something used to store email; it can be associated with one or more
addresses, but it doesn't have to be.

Let's look at a few examples of email use with the two different concepts
in mind:

1. The outbox

Many MUAs store all outgoing email in an "outbox". It's a mailbox without
an address.

2. Sub-mailboxes

With IMAP you can have "sub-mailboxes". They all have the same address,
but they also have a distinct name.

3. Addresses without mailboxes

There are many email addresses to which mail can be sent that will never
turn up in a mailbox. Fax forwards come to mind, or other systems where
the contents of the arriving email are used to start some kind of process.
Mailing lists always send the mail on; there is no mailbox involved (there
might be some kind of archive though).

4. Splitting email

Email to one address can be forwarded/aliased to several addresses, say
your home system and your laptop. Essentially email to one address gets
delivered to two mailboxes. With most current systems we have to give
those mailboxes distinct email addresses to allow this. But those email
addresses are never used outside your system, just internal ways of
naming. Wouldn't it be conceptually simpler to have a distinct namespace
for email addresses (which is global) and one for mailboxes (which is
internal to your organisation)?

If you do the splitting before the mailbox, this would also get rid of
the ugly recursion problem in .forward files, where you put the name
of the mailbox itself in the .forward file to mean that you want an
email also delivered to the mailbox itself.

Concept 1:

  foo@example.com ---> foo -+-> \foo
                            |
                            +-> bar@forward.com

Concept 2:

  ADDR/foo@example.com -+-> MAILBOX/foo
                        |
                        +-> ADDR/bar@example.com

(With the ADDR/ and MAILBOX/ I want to denote that there are two different
namespaces involved.)
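
The two namespaces of Concept 2 can be sketched in a few lines of Python.
This is only an illustration: the names Addr, Mailbox, ROUTES and targets
are made up for this example and don't come from any existing mail software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Addr:
    """A name in the global email address namespace (hypothetical type)."""
    value: str


@dataclass(frozen=True)
class Mailbox:
    """A name in the organisation-internal mailbox namespace (hypothetical type)."""
    name: str


# One address fans out to a mailbox and to another address,
# mirroring the Concept 2 diagram.
ROUTES = {
    Addr("foo@example.com"): [Mailbox("foo"), Addr("bar@example.com")],
}


def targets(addr):
    """Return the delivery targets configured for an address (empty if none)."""
    return ROUTES.get(addr, [])
```

Because Mailbox("foo") is not an address, delivering to it never re-enters
address resolution, so the self-reference trick from .forward files is not
needed in this model.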

5. Two addresses into one mailbox

If you have two email addresses and want email delivered into the same
mailbox you essentially have to have one mailbox and one forward:

  foo@example.com -> mb
  bar@example.com -> foo@example.com

When you delete the foo address, you get a dangling link. But if you
have distinct email addresses and mailboxes this keeps working:

  ADDR/foo@example.com -+
                        +-> MAILBOX/foo
  ADDR/bar@example.com -+

Of course this only works inside one organisation.
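
The dangling link can be shown with a small Python sketch. The function
names and the dictionaries are invented for this illustration; real MTAs
store this information quite differently.

```python
def resolve_chained(address, forwards, local):
    """Old model: follow forwards until an address with a mailbox is found."""
    seen = set()
    while address in forwards and address not in seen:
        seen.add(address)
        address = forwards[address]
    return local.get(address)  # None means a dangling link


def resolve_direct(address, table):
    """Two-namespace model: every address names its mailbox directly."""
    return table.get(address)


# Old model: bar only works as long as the foo address exists.
forwards = {"bar@example.com": "foo@example.com"}
local = {"foo@example.com": "mb"}
del local["foo@example.com"]  # delete the foo address
dangling = resolve_chained("bar@example.com", forwards, local)

# Two-namespace model: both addresses point at MAILBOX/foo.
table = {"foo@example.com": "foo", "bar@example.com": "foo"}
del table["foo@example.com"]  # delete the foo address
still_ok = resolve_direct("bar@example.com", table)
```

After deleting the foo address, the chained lookup for bar finds no
mailbox, while the direct lookup still returns the mailbox name.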

6. Automatic classification of email

The spam problem has brought the automatic classification of email to
the forefront. Many people have not one but two mailboxes behind every
email address: the normal inbox and the spam box. Others have even more
sophisticated classification for different parts of their email: work email
here, private email there, mailing list traffic over there. So the
email address is not the only thing deciding into which mailbox
email gets delivered; there are many other attributes.

7. Virtual domains

The modern usage of bazillions of domains has led to mail systems which
are responsible for many domains at once. And of course the local parts
aren't distinct between all those domains. The old way of mapping the
local part to the name of a user or a mailbox doesn't work any more.
(And the way this is often done, by mapping all virtual domains to one
real domain, is a hack.)
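
A tiny Python sketch of the collision, assuming two made-up addresses and
made-up mailbox names (mb-17, mb-42):

```python
entries = [
    ("info@example.com", "mb-17"),
    ("info@example.org", "mb-42"),
]

# Keyed on the local part only: the two "info" addresses collide,
# and the second entry silently overwrites the first.
by_local_part = {}
for address, mailbox in entries:
    local_part = address.split("@", 1)[0]
    by_local_part[local_part] = mailbox

# Keyed on the full address: each address keeps its own mailbox.
by_address = dict(entries)
```

With the full address as the key, the mailbox name no longer has to look
anything like the local part.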


In many simple cases an email address and a mailbox will still be the
same and can be accessed under the same name. I am only concerned here
with the conceptual model not with the actual names used and the way the
software works.

The whole discussion might be a bit academic, although I have built large
email systems around this concept, which use mailbox names totally distinct
from email addresses etc., and it works in real life, too.

But, more important, I think it might shed some light on the distribution
of work between the components of a mail system, i.e. the MTA, MDA and MUA.
In my view the MUA knows about email addresses only to send email. To read
email it knows about mailboxes (and, usually, the hosts where they reside).
It does not have to know about the email address used to send email to
those mailboxes. (And it doesn't actually know, because it doesn't see
the RFC2821.RcptTo address.) Similarly the POP and IMAP servers don't know
about email addresses, only about mailboxes.

It is the job of the MDA to do the actual transition between the "email
address space" and the "mailbox space". It translates email addresses
into mailbox names and delivers the mail.
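
As a rough sketch of that translation step, here is what such an MDA
decision could look like in Python. Everything here is an assumption made
for illustration: the function name, the mapping table, the mailbox name
mb0017 and the choice of the X-Spam-Flag and List-Id headers as
classification attributes.

```python
def mda_select_mailbox(recipient, headers, address_to_mailbox):
    """Translate an envelope recipient into a mailbox name, or None."""
    mailbox = address_to_mailbox.get(recipient)
    if mailbox is None:
        return None  # not an address this MDA delivers for
    # The address is not the only input: other attributes of the
    # message can select a different mailbox behind the same address.
    if headers.get("X-Spam-Flag", "").upper() == "YES":
        return mailbox + "/spam"
    if "List-Id" in headers:
        return mailbox + "/lists"
    return mailbox


# Hypothetical mapping from the address namespace to the mailbox namespace.
TABLE = {"foo@example.com": "mb0017"}
```

The returned name lives only in the mailbox namespace; nothing outside the
organisation ever needs to see it.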

To make this point clearer: there are two namespaces involved in email
today. One is the "email address" namespace. It consists of a domain,
resolved through DNS and the use of MX records, and a locally unique
local part, which should be transparent for the rest of the world.
Then there is the totally distinct "mailbox" namespace, which consists
of a protocol (POP or IMAP), a hostname and a mailbox name. (And of course,
there are local mailboxes, which are just a special case with a "local
protocol".)
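
Written down as data types, the two namespaces might look like the
following Python sketch. The type and field names are my own invention,
and the parser is deliberately simplified (it ignores quoted local parts).

```python
from typing import NamedTuple


class EmailAddress(NamedTuple):
    local_part: str  # only meaningful to the receiving domain
    domain: str      # resolved by the sender through DNS (MX records)


class MailboxRef(NamedTuple):
    protocol: str    # "pop", "imap" or "local"
    host: str
    name: str


def parse_address(text):
    """Split an address at the last "@" (simplified, no quoting rules)."""
    local_part, _, domain = text.rpartition("@")
    if not local_part or not domain:
        raise ValueError("not an email address: %r" % text)
    # Domain names are case-insensitive; the local part is left untouched.
    return EmailAddress(local_part, domain.lower())
```

Nothing in EmailAddress tells you where the mail is stored, and nothing in
MailboxRef tells you which addresses deliver into it.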

Delivery doesn't have to go into a mailbox. A fax forward for instance means
that the MDA will decide that a specific email address should be delivered
"to a telephone number" and will do this. But, again, there is this transition
between email address and another namespace. (By the way: using addresses
like +15551234567@phone.example.com for fax delivery is horrible. Addresses
like this mix up an already perfect and globally unique address, i.e. the
fax number, with the delivery mechanism through the gateway at
phone.example.com. Users don't want to and shouldn't have to know about the
@phone.example.com part; after all, we got rid of bang paths a long time
ago.)

Note that in this concept a forward/alias from one email address to another
happens in the MTA, not in the MDA.
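
The MTA side of this split can be sketched as follows. The function name
and the alias table are hypothetical; the point is only that the expansion
takes addresses in and gives addresses out, never mailbox names.

```python
def mta_expand(address, aliases):
    """Expand forwards/aliases, staying inside the address namespace."""
    result, todo, seen = [], [address], set()
    while todo:
        current = todo.pop()
        if current in seen:
            continue  # guard against alias loops
        seen.add(current)
        if current in aliases:
            todo.extend(aliases[current])
        else:
            result.append(current)  # a final address, handed on for delivery
    return sorted(result)


# Hypothetical alias table: one address forwards to two others.
ALIASES = {"team@example.com": ["foo@example.com", "bar@example.com"]}
```

Only the final addresses are handed to an MDA, which then makes the
transition into the mailbox namespace.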

Jochen
-- 
Jochen Topf  jochen(_at_)remote(_dot_)org  http://www.remote.org/jochen/  
+49-721-388298