[Keith Moore <moore(_at_)cs(_dot_)utk(_dot_)edu> wrote:]
- Managing complexity
I believe this should be the #1 priority of any new design that we
approach from a "how would we do this if we didn't have an installed
base to consider" viewpoint.
What is the essence of "email", once all aspects that actually pertain
to other components of the overall system (the Internet as a whole,
plus the humans that use it) are accounted for and dispensed with?
Complexity that *cannot* be dispensed with includes how humans behave,
how their governments, corporations, organizations, behave, and so on.
[From a couple of other contributors:]
-- internationalization, esp. of addresses
Unless this is something that an email system must solve on its own,
punt it upward to the Internet. That might mean an "email address"
becomes redefined as something very simple, like a barely-structured
sequence of octets.
If the classic embedded "@" poses a problem, consider eliminating it
in favor of a more general solution that possibly accommodates it when
used for "old" email addresses.
Personally, I think "we" all went off track when we came up with
different ways to represent locators. There should be one general
means on the Internet to describe locations of entities, and it should
*always* denote membership in one direction: left-to-right, if we want
to stick with the predominant Western approach a la URLs, pathnames,
and such; or right-to-left, if we remain persuaded by the appeal of
domain names.
(URLs are screwed up in how they handle user names and passwords;
instead of "http://somebody:password@hostname/", which spammers,
vermin, and criminals exploit a la
"http://www.cnn.com:whatever@webhijackers.mob", they should have stuck
with the left-to-right progression, as in
"http://hostname~somebody:password/", or whatever would work
syntactically, since "somebody" is a member of "hostname", not the
other way 'round.)
If we use the pathname metaphor, and leave host names to the system in
which the new email system is instantiated, then email addresses come
to look something like:
example.com/postmaster
Note how easily that extends into the current concept of URLs:
email://example.com/postmaster
So, we leverage whatever "the Internet" does to handle
internationalization of host names, we don't introduce yet another
character (beyond the "/" already used in URLs) with which to cope,
and that's just for starters.
Consider mailers like qmail that offer multiple addresses under the
control of a user using a "-" as a separator, e.g. "joe-sales@..."
being handled by a file named ~joe/.qmail-sales, "joe-family@..." by
~joe/.qmail-family.
Well, that could just become:
example.com/joe/sales
example.com/joe/family
Nice, clean, simple, and it leverages URLs and related mechanisms.
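A minimal sketch of how such path-style addresses might be taken apart,
in Python. The function names and the mapping back to qmail-style files
are illustrative assumptions, not part of any existing mailer:

```python
def split_path_address(address):
    """Split 'example.com/joe/sales' into ('example.com', ['joe', 'sales']).

    An optional 'email://' prefix is accepted. The host part is returned
    untouched, so host-name handling stays with the rest of the Internet.
    """
    scheme = "email://"
    if address.startswith(scheme):
        address = address[len(scheme):]
    host, _, rest = address.partition("/")
    members = [part for part in rest.split("/") if part]
    if not host or not members:
        raise ValueError("expected host/member[/member...]: %r" % address)
    return host, members


def qmail_style_file(members):
    """Hypothetical mapping back to the qmail file convention."""
    user, extensions = members[0], members[1:]
    suffix = "".join("-" + ext for ext in extensions)
    return "~%s/.qmail%s" % (user, suffix)
```

For example, split_path_address("example.com/joe/sales") yields the host
plus the members "joe" and "sales", and qmail_style_file() turns those
members into ~joe/.qmail-sales.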
-- enhanced tracing mechanisms
One *absolute requirement* of any new design, in my opinion, is for
every *automated* response to the submission of an email into the
system to represent an *equal or lesser* reaction than the initial
action. I call this "lightweight responses to arbitrary actions", and
it should permit aggregation of related responses.
This prevents the system as a whole (as well as individual parts) from
thrashing about wildly as it "ups the ante" on an email that, for
whatever reason, can't be delivered, triggers bugs, and so on.
Therefore, no equivalent to DSN/bounce in the new system can be
*allowed* to return the contents (body) of the original email, nor
even the headers; basically, the design must *rule out* any form of
notification that includes arbitrary amounts of data from the original
email in that notification.
Otherwise the notifications would inevitably become bigger and more
resource-intensive as they bounced back and forth on a misconfigured
system. This is already a huge problem, due to joe jobs and related
scourges.
(Consider a post office that can't deliver an envelope, due to a wrong
address or something. But Betty The Postal Worker, extra-helpful and
friendly, gives such envelopes special treatment, so she puts it into
a new, larger envelope, copies the recipient address to the sender
address and the sender to the recipient address on the new envelope,
puts new postage on it, and sends it along. That's wonderful for the
recipient -- the original sender -- when it arrives as planned. But
what happens if the original sender's address was wrong? Well, if
Betty is the only one who handles things this way, then the larger
envelope will get returned to the non-existent original recipient, and
the post office will presumably "drop" it a la a double bounce.
Worse, if that second post office has its own version of Betty, who
handles things the same way, then someday a huge truck will be needed
to deliver the ever-growing bundle bouncing back and forth between
post offices. Of course, since real mail contains "tangible"
materials that are presumed to not be easily reproducible by the
sender, those contents are usually returned; but that metaphor says
little about the requirements for email, especially since, in today's
real world, sometimes the sender would prefer an automated phone call
saying "your mail to <address> sent on <date> with <identification>
could not be delivered due to <reason>; it has been discarded",
because that'd arrive sooner than the returned packet with materials
they can just reprint using their own computer or copier and
retransmit correctly.)
Think instead in terms of an email's envelope having "room" for, say,
64 bytes of data and an IP/port combination to which a relay or
recipient sends, along with a tight encoding of the reason, a report
via UDP datagram, called a "DSN UDP".
In response to such a datagram, if it pertains to a legit message and
is considered "believable", an ACK is delivered back via UDP. If it's
reasonably believed to be "innocently" sent but is known to not
pertain to a legit message, a NAK is delivered back via UDP.
Until the ACK or NAK is received, the entity generating the DSN UDP
might choose to retry any number of times, or might just forget about
the whole thing.
But since the responsibility for delivering the *email* lies with the
*originating user* -- and the new design must stand firm on that -- no
requirement should be placed on receivers of messages to ensure that
their DSN UDPs are received.
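To make the "DSN UDP" idea concrete, here is a rough Python sketch of one
possible datagram layout. The layout itself (one type byte, three reason
bytes, one length byte, then up to 64 bytes of opaque data) and the type
codes are assumptions made up for illustration; nothing here is a
defined wire format:

```python
import struct

MAX_TOKEN = 64               # the 64 bytes of "room" on the envelope
REPORT, ACK, NAK = 0, 1, 2   # hypothetical datagram types

# kind, three reason bytes (e.g. 4, 7, 1), token length
_HEADER = struct.Struct("!BBBBB")


def pack_dsn(kind, reason, token):
    """Build a datagram; 'token' is the opaque envelope data (bytes)."""
    if len(token) > MAX_TOKEN:
        raise ValueError("token exceeds %d bytes" % MAX_TOKEN)
    a, b, c = reason
    return _HEADER.pack(kind, a, b, c, len(token)) + token


def unpack_dsn(datagram):
    """Inverse of pack_dsn; returns (kind, reason, token)."""
    kind, a, b, c, length = _HEADER.unpack_from(datagram)
    token = datagram[_HEADER.size:_HEADER.size + length]
    if len(token) != length:
        raise ValueError("truncated datagram")
    return kind, (a, b, c), token
```

Under this layout the largest possible report is 69 bytes, no matter how
big the original email was, which is the whole point of a lightweight
response. The sender would hand the result to an ordinary UDP socket
aimed at the IP/port from the envelope, and retry or not as it pleases.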
(And let's dispense with the idea of providing for "guaranteed
delivery" of email to a destination; it's the responsibility of the
*sender* to verify receipt. If you want to make sure Joe M. Smith
gets your message, try email, and if you don't get some assurance he
received it, try certified mail or a process, as in subpoena, server.
Computers are very poor at being sure they're dealing with real, live
humans, never mind telling them apart; once they're good at doing
that, they won't need us to design low-level protocols for them.)
And, generally, the protocols should put the "burden of proof" and the
"burden of correct behavior" on entities that *inject* email into the
system, rather than the present approach of, e.g., claiming an SMTP
server is not allowed to drop a TCP connection on a client.
-- generalized challenge/response mechanisms
There are parts of this I think could belong in an email system,
because of the unique characteristics it offers. E.g. the ability for
a roaming user claiming a more permanent "home" to inject a message
into the system depends on the ability of the recipient of that
injection to query that "home" in a standard way in order to ask "do
you know this user, do you know about this message", without
necessarily having to depend on that third party in order to continue
message delivery.
What doesn't belong in email is validating the existence or presence
of human beings on either end of the communications medium, because
that is an issue for various other components of the Internet.
And there are other forms of "injection" that imply the utility of CR
mechanisms, e.g. ftp uploads, suggesting maybe email shouldn't try to
provide its own solution to that problem, beyond what is unique to
email as a communications medium.
-- transport-level authentication
Not sure how this is unique to email. Authentication is a general
issue on the Internet.
-- binary transport (phasing out C-T-E's)
Yup. An email should be simply a sequence of octets. (I'd vote for a
sequence of bits, actually. ;-)
-- Cleaner separation of header, envelope, and body
YES! And here's a great opportunity to reduce complexity.
Let's think of header and body as little more than arbitrary, stylistic
distinctions a la a proper letter you receive in the mail, as in:
To: Dr. Dolittle
From: Nurse Ratchet
Subject: Misbehaving Sheep
Date: today
It has come to my attention...
Why does present email even care about the distinction? Partly
because all sorts of stuff is put into the header, such as "Received:"
and "Delivered-to:", that isn't really part of the *message*.
Instead, that sort of information belongs on the *envelope* that
contains the letter. It pertains to the *transmission* of the
message.
If we gain a clear idea of what should be on the envelope, i.e. what
is necessary and desirable for transport, and leave the header and
body as "content" generally and typically controlled entirely by the
originating sender and not otherwise modified, we'll probably leave
the world a better place when we die.
-- structured local-part syntax
See above.
-- economic mechanisms (postage, attention bonds)
These would go into the envelope, I would think. I don't know enough
about them to be sure; and, to the extent these might be better
addressed by the Internet as a whole, let's think more in terms of
simply including a general means to accommodate such mechanisms.
- improving reliability
Reducing complexity of the design helps a lot right off the bat.
- improving transparency of the MTS (which is a broader topic than
binary transport)
Not sure what this means.
- improving error reporting (see below)
I'm of two minds here. Generally, it seems as though *allowing*
better error reporting is happy, but *requiring* it (in a design) is
unwise, because it tends to force systems to expose more about their
inner workings to arbitrary outsiders; further, improved error
reporting *usually* goes in the opposite direction of "lightweight
response".
And, I think this can be viewed as a general issue: why exactly is my
DNS lookup not resolving? What's the holdup on this web page?
So, maybe we consider leaving this to general Internet mechanisms to
notify with, request, and provide information regarding behavior and
performance of protocols, mechanisms, and so on.
The biggest improvement, in my opinion, would be to restrict error and
status reporting to specific codes not in any human language; think of
the 4.7.1-type codes being the sole information provided on the
status. Leave it up to software agents displaying the information to
translate it for the user.
And, to continue the pathname metaphor, maybe switch over to "4/7/1",
allowing arbitrary numbers and number of members, with the only
requirement pertaining to the highest-level numbers?
That might naturally extend to a scenario like this: an MTA wants to
report a fairly unusual, specific status. It's a particular form of a
more-general status, I'll make one up sorta, 4/12/2, that's already
widely known.
So, that MTA reports 4/12/2/9/6 as its status.
A User Agent (UA) seeing this status could say to itself, hey, I know
what 4/12/2 means, so I could just report that and its meaning, which
might be as easy as pulling up the following URL for the user:
http://emailsystem.net/status/en/4/12/2/
That's the English (en) page describing that status.
Or, it could try asking the reporting MTA via a canonical means:
http://example.com/email/status/en/4/12/2/9/6
And, maybe there's a protocol allowing for specifying "global"
extension of the status-code space. In fact, it might make sense to
allow insertion of a *name* into the path to connote a vendor-specific
"branch", a la:
4/12/2/qmail/9/6
Since such status codes are *general* Internet issues, these
mechanisms should be punted "upstairs".
But we could get the ball rolling, since email seems to represent, if
not the tip of the spear, a portion of its head, when it comes to the
whole point of the Internet?
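The fallback a UA might perform on such path-style status codes can be
sketched in Python as a longest-known-prefix lookup. The table contents
are invented for illustration (4/12/2 is the made-up status from above),
and the function names are hypothetical:

```python
# Hypothetical table of statuses a UA already understands.
KNOWN = {
    ("4",): "transient failure (as with today's 4.x.x codes)",
    ("4", "12"): "made-up subject 12",
    ("4", "12", "2"): "made-up status 4/12/2",
}


def longest_known_prefix(status, known=KNOWN):
    """Return (prefix, meaning) for the longest known prefix, or None."""
    parts = tuple(status.strip("/").split("/"))
    for end in range(len(parts), 0, -1):
        if parts[:end] in known:
            return "/".join(parts[:end]), known[parts[:end]]
    return None


def status_url(base, lang, prefix):
    """Build a lookup URL such as the emailsystem.net example above."""
    return "%s/%s/%s/" % (base.rstrip("/"), lang, prefix)
```

A vendor branch such as 4/12/2/qmail/9/6 falls back to 4/12/2 in exactly
the same way, since the lookup only cares about leading members.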
- clean separation between submission and relaying
Could someone elaborate on this? It sounds very email-specific, so
I'd like to understand just what is being thought of here.
- reducing reliance on store-and-forward
I'm not quite sure what is being referred to here, so would appreciate
some expansion.
Still, one of the things email *does* uniquely provide among the
Internet's general technologies *is* store-and-forward.
Widespread constant uptime, connectivity, and high bandwidth don't
just make email "better" -- one could argue these have made it worse,
thanks to spam and vermin -- they almost make it irrelevant, given the
availability of instant messaging, TXT messaging via mobile phone, and
the like.
If email didn't provide store-and-forward, I think its "uniqueness"
would be rather diminished, though not entirely. (Longer-term and
more-affordable addresses versus domain names is a unique feature; not
everyone can afford to buy and hold onto a domain name, while email
addresses with domain names are relatively cheap to maintain, since
they don't connote or correspond to a unique IP address, a unique set
of ports, and so on.)
- better support for "integration" between various messaging services
I don't know enough about those services to say, but if we design
email "right" -- towards simplicity, transparency, and so on -- we
might take care of this without even trying to.
- improve configuration (for both users and admins) so it is less error-prone
Again, that underscores the importance of (and is best dealt with via)
leveraging what is being provided elsewhere and has been learned.
One item that I would like to add is designing in security features from the
beginning. One of the current issues is that security can be applied to
bodies, but not to things like the envelope.
I'm not all that excited about security features in email. I know,
everyone will yell at me for saying that, but, seriously, don't most
of us get along just fine sending email in the clear *today*?
And, isn't security more of a general issue anyway?
Instead of things like SMTP AUTH, why not punt that to the overall
system, a la tunneling through ssh, or some future form of that
approach that takes care of the problem that way?
After all, if we want the new email system to work well, it has to be
adopted widely, which implies widespread use *within* organizations
that don't require security (or cryptographic encoding) for internal
use.
Why should the new email system impose on itself the burden of being
uniquely secure in an environment that doesn't need that level of
security?
That being said, if we're going to throw UDP packets around, we should
take whatever steps are necessary to make them less than totally
trivial to forge, intercept, misinterpret (due to typical transmission
errors), and the like. Whatever is done, or would ideally be done (if
it, in turn, was designed from the ground up) for DNS within a LAN
full of mutually trusting systems, should be minimally sufficient,
with provisions for cooperating entities to request one or two levels
more of encryption, double-checking, etc.
Here are some things I think we should also consider:
- Make transport protocol filterable by middleman
I.e. make it easier for an intermediary, such as a Broadband ISP,
to monitor an outgoing submission, and possibly take actions it
deems necessary to enforce its AUP, with otherwise minimal
intrusion on the connection.
Right now, such an intermediary is faced with a problem if it
notices spam or a vermin in the DATA portion of an SMTP connection:
how to terminate that *portion* of the connection without
terminating the connection?
It can't, since the only way to terminate the DATA portion is to
send "\r\n.\r\n", which means "end of message, queue it". So it
has to drop the connection instead, which screws up the client,
possibly revealing the existence of such a middleman, certainly
leading to erratic behavior in the mail system overall, as more
such middlemen exist for various reasons.
So, make sure the protocols allow distinctions between concepts
like "here's the end of the data stream" and "please accept the
data stream for queueing", so the latter can be replaced on the fly
with "please discard this data stream" while still ending it.
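A toy Python model of that separation, assuming a made-up framing in
which the stream is a series of ("data", bytes) frames closed by a single
("end", disposition) frame. The frame names and dispositions are
hypothetical; the point is only that a middleman can change the
disposition without cutting the connection:

```python
QUEUE, DISCARD = "queue", "discard"   # hypothetical dispositions


def frames_for(message, chunk=16):
    """Client side: data frames, then one end frame asking to queue."""
    for i in range(0, len(message), chunk):
        yield ("data", message[i:i + chunk])
    yield ("end", QUEUE)


def middleman(frames, looks_bad):
    """Pass frames through; rewrite the end frame if any chunk looked bad."""
    flagged = False
    for kind, value in frames:
        if kind == "data" and looks_bad(value):
            flagged = True
        if kind == "end" and flagged:
            value = DISCARD
        yield (kind, value)


def receive(frames):
    """Server side: collect data, then honor the disposition."""
    body = b""
    for kind, value in frames:
        if kind == "data":
            body += value
        else:
            return (value, body if value == QUEUE else b"")
    raise ValueError("stream ended without an end frame")
```

The per-chunk check is deliberately naive (it would miss a pattern that
straddles two chunks); a real filter would need to buffer.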
- Always restate the original goal in protocols
A la virtual hosting in HTTP, after looking up example.com and
connecting to its IP address, start off the conversation with "I
connected to you because I wanted to talk to example.com".
This might make setups like virtual hosting easier to deal with.
(Though, since it's a nice optimization to notice that three
different hosts to which you're sending an email resolve to the
same IP address, it'd be nice to be able to tell the host about
that serendipity.)
- Allow for small talk
One way to discourage spammers, vermin-injectors, and other forms
of annoyance -- in real life as well as in cyberspace (does anyone
still use that term? ;-) -- is to engage those who start up a
conversation with you, towards a supposedly legitimate goal, in
small talk that requires them to listen, think, and respond,
throughout the "purposeful" portions of the conversation.
As designers and engineers, we tend to think purely in terms of
getting the job done right, fast, and cheap, so when we design
protocols, we rarely include the ability for one party or the other
to "chat" or comment on what's going on.
Right now, SMTP clients can require servers to engage in small talk
(RSET, VRFY, EXPN, HELP), but servers can't really do that to
clients, except by, e.g., sending a long, multi-line response to
one of their commands. That's actually kinda backwards; the client
is doing the "injecting", so the server should be given the
advantage, in the same (yes, this is sexist ;-) sense that when a
man asks a woman out on a date, she "gets" to engage him in all
sorts of conversation and otherwise set her own comfort level of
expectations before being required by *protocol* to immediately
accept or reject whatever he's really asking of her.
This small talk could take the form of requesting evaluation of
simple arithmetic expressions and sending the results back, and
should be allowed pretty much anyplace in a conversation.
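As a sketch of what arithmetic small talk could look like, in Python:
the server makes up a challenge, the client evaluates it, and the server
checks the reply. The "a op b" text format and the function names are
assumptions for illustration only:

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def make_challenge(rng=random):
    """Server side: produce a challenge such as '17 * 4'."""
    a, b = rng.randint(1, 99), rng.randint(1, 99)
    op = rng.choice(sorted(OPS))
    return "%d %s %d" % (a, op, b)


def evaluate(challenge):
    """Client side: work out the answer to 'a op b'."""
    left, op, right = challenge.split()
    return OPS[op](int(left), int(right))


def check(challenge, reply):
    """Server side: True only if the reply is the right number."""
    try:
        return int(reply) == evaluate(challenge)
    except ValueError:
        return False   # e.g. the client said "no, thank you"
```

A reply that is not a number at all simply fails the check; what the
server does next is up to the server.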
So, a server that believes the client injecting (submitting) the
email is just a spambot or verminbot can resort to various sorts of
small talk to ensure there's a real client willing to spend some
extra time and energy getting the message across, and risk being
"found out" by being online longer if the message is not in fact
legit.
This is akin to how some small towns have redesigned the roads
going through their downtown section to naturally discourage
speeding, making for a safer environment downtown.
It goes against the idea of keeping servers and clients simple, so
in that sense it's anti-complexity-management.
But, from a bigger picture, since it should be a useful tool in
managing the "real world" use of email, it might serve to reduce
complexity overall: other, more complicated and error-prone means
to defeat abuse of services might not be needed as often.
And, a client would always be free to say "no, thank you" in
response to any of this small talk. A server could then decide how
to respond to that. If it didn't want to accept the email, or if
it wanted to accept the envelope and ask for the body later, or ask
for the body but not accept responsibility for it until later (all
of which *should* be options the server can choose), then the
client could be given another opportunity to respond, or just
(essentially) decide the message wasn't sufficiently important to
send.
- Uniqueness: distinguish *messages* from *transmissions*
A message is the contents within the envelope, so to speak: the
header (without the stuff that I think should be in the envelope in
email) plus the body.
*That* message should have a unique ID (UMID). It might have
multiple UMIDs, of course, since in real life we can't immediately
know whether two people holding two letters are looking at the
exact same letter, nor can we use such a stricture to avoid the
case where two people coincidentally formulate the exact same
letter and each give it their own UMID; so let's not try to impose
such a requirement on email submitters. (Content equivalence is
presumably better addressed by things like MD5 and SHA1?)
But, a UMID cannot pertain to two different messages; if it is
shown to do so, the conclusion must be that the submitter is broken
or lying (or maybe the UMID was itself mistransmitted, always a
possibility).
I don't think it's necessarily helpful to require UMIDs to be
provided by message submitters. I *do* think that, if one is not,
the *email system* should not allow for automatically inserting one
into a message or into its envelope, since that would imply the
originator provided the UMID. (The "originator" includes the human
plus whatever software he used to compose and submit the email, so
naturally that software gets to insert the unique UMID.)
Without a UMID, however, certain features of the email system and
related add-ons wouldn't work, or not as well. The idea here is
that if the user's own hands-on software can't feasibly keep its
own data base of UMIDs and related messages, no other entity should
pretend to do so on that user's behalf.
(Put another way, an email sender has a computing system acting as
his own "agent". So does an email recipient. In between these two
agents are other agents, but they should be thought of as agents
for the proper and desired transmission of the email; they're not
acting fully as agents for the recipient or the sender, since no
agent can serve two masters. The sender's agent either supplies a
UMID, or there is none that ever attaches to it. An intermediate
agent, since it cannot assuredly contact the sender as the sender's
own agent presumably can, must not create its own UMID, even just
to be "helpful", since the existence of that UMID would imply, to
downstream handlers of the message, things about the sender's
contactability and his agent's capabilities that are not
necessarily true or desired by that sender.)
A *transmission*, on the other hand, is an envelope plus its
message, and that should have its own unique transmission ID
(UTID).
The UTID is what mechanisms such as DSN (via UDP, above) would
pertain to.
It *could* be required for any MTA that takes responsibility for a
transmission to gin up its own unique UTID when delivering a
message downstream, though, again, maybe "thin" MTAs should be
allowed the option of omitting UTIDs if they don't want any DSNs
coming back their way.
So, in today's terms, if an SMTP server gets as far as seeing the
"DATA/message/\r\n.\r\n", sending the 2xx response accepting the
message, but then sees the connection drop, it can remember the
UTID as it delivers the message further.
That way, if the SMTP client attempts another delivery, when it
provides the envelope (which contains both the UMID and UTID), the
SMTP server can recognize the UTID from before and say "never mind,
I got that message earlier".
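A minimal Python sketch of that bookkeeping, assuming the server keeps
accepted UTIDs in memory (a real server would need persistent storage
and some expiry policy; the class and method names are hypothetical):

```python
class Receiver:
    """Toy server that remembers UTIDs it has already accepted."""

    def __init__(self):
        self.accepted = {}   # UTID -> message bytes

    def offer_envelope(self, utid):
        """Return False ("never mind") if this UTID was accepted before."""
        return utid not in self.accepted

    def accept(self, utid, message):
        """Record the transmission; a repeat of the same UTID is ignored."""
        self.accepted.setdefault(utid, message)
```

The client that retries after a dropped connection gets False from
offer_envelope() and can stop there, without resending the message.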
- Allow various combinations and levels of envelope and message
acceptance
Levels: 0) Reject, 1) Defer, 2) Will call, 3) Accept for attempted
delivery, 4) Accept responsibility for delivery, 5) Delivered.
Combinations: A) Envelope; B) Envelope and Message.
Since B) includes A), one cannot accept an envelope and message
while rejecting the envelope, so the protocol need not allow that.
But it should allow an MTA to say that it has accepted
responsibility for delivery of the envelope, but has designated the
message itself as "will call", meaning it'll get in touch with the
appropriate agent when it wants to retrieve the message itself.
And these combinations and levels should be expressible pretty much
anytime during the protocol that they make any sense.
E.g. after an envelope is transmitted, E2 makes sense, because the
server can record just the UTID for the envelope and ask for it
later. (This allows the server to avoid storing very long lists of
recipients and/or very long envelope sender and recipient
addresses.)
And even after everything is transmitted, E2M1 would still make
sense, because the server could decide, based on message content
for example, to keep no more than the UTID and the minimal
information needed to ask for the envelope later, and to wait for
the message itself to be delivered later.
(E[01345]M2 after envelope transmission is basically djb's im2000
proposal, as I understand it.)
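The levels and the "E2M1"-style shorthand used above can be sketched in
Python as follows. The enum names and the parser are assumptions; the
one rule enforced is the one stated above, that a message cannot be
accepted while its envelope is rejected:

```python
import re
from enum import IntEnum


class Level(IntEnum):
    REJECT = 0
    DEFER = 1
    WILL_CALL = 2
    ACCEPT_ATTEMPT = 3
    ACCEPT_RESPONSIBILITY = 4
    DELIVERED = 5


_CODE = re.compile(r"E([0-5])(?:M([0-5]))?")


def parse_disposition(code):
    """Parse 'E2' or 'E2M1' into (envelope_level, message_level_or_None)."""
    m = _CODE.fullmatch(code)
    if not m:
        raise ValueError("bad disposition: %r" % code)
    envelope = Level(int(m.group(1)))
    message = Level(int(m.group(2))) if m.group(2) is not None else None
    if (envelope is Level.REJECT and message is not None
            and message >= Level.ACCEPT_ATTEMPT):
        raise ValueError("cannot accept a message while rejecting its envelope")
    return envelope, message
```

Note that E0M2 still parses, since "will call" on the message is not an
acceptance of it.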
- Allow originators and clients to request, of downstream MTAs,
levels of handling of envelope/body a la the levels/combinations
above
A client on a laptop about to be disconnected needs to be able to
request E4 handling. If it gets anything less, it might want to
try another MTA right away, even if that means multiple
transmissions (here's where a UMID helps avoid multiple messages in
a smart recipient/agent).
Or, a client that knows it's sending junky/bulky email might
request E[23]B2 as a means to say "I don't want you to have to keep
the message itself on your system; I'm always ready to deliver it
when you need it".
This is a *request*, however; since the purpose of the protocol is
to inject email, a server is always free to respond however it
wishes.
- Don't tie UTIDs and UMIDs to hostnames or IP addresses
Since there's the general issue of locating (possibly traveling)
resources on the Internet, think of UTIDs and UMIDs as pertaining
to *agents* acting on behalf of those transmissions and messages,
not necessarily to specific users of email, specific MTAs on
specific hosts, and so on, when designing protocols.
- Expiration dates and times, possibly even events, on envelopes
Lots of email expires as of a certain time and/or the occurrence of
an event. Let the originator/agent express that in a canonical
way. No requirement for MTAs or recipients to honor such things,
of course.
These are basically all of the email-specific ideas I've been jotting
down over the past few months. Hope it helps.
--
James Craig Burley
Software Craftsperson
<http://www.jcb-sc.com>
--Fix qmail's qmail-smtpd so it doesn't crash on a big header line:--
<http://www.qmail.org/netqmail/>