Re: Mail Routing and LDAP

  1. UMich's approach (implemented in the "mail500" mailer)
  2. Netscape's approach (implemented in the Netscape MTA)
  3. Stanford's approach (implemented in Sendmail 8.8.X)


There is another approach which I think is worth consideration.
Bell Labs has been using it for about 10 years now, and it's been
extremely popular.  We're doing it with a proprietary protocol called
POST, with a relational database, but the same concepts and functionality
would seem well suited to LDAP.  Indeed, we'd like to migrate to LDAP
but keep this functionality, and it won't work unless this is somewhat
standardized.

(1) Deliver atom(_at_)lucent(_dot_)com by looking it up in the directory,
and delivering to the physical address found in the directory.

(2) Pass directory(_dot_)query(_at_)lucent(_dot_)com (anything whose syntax
contains a dot, equals, slash, colon, or underscore, followed by @lucent.com)
to the directory lookup to match one or more possible people, and deliver to
their physical addresses.  Examples of our current syntax (my directory
entry is "Mark R Horton"):

Mark(_dot_)R(_dot_)Horton(_at_)lucent(_dot_)com  (full name match)
Mark(_dot_)Horton(_at_)lucent(_dot_)com          (partial name match)
M(_dot_)R(_dot_)Horton(_at_)lucent(_dot_)com     (partial name match)
m(_dot_)horton(_at_)lucent(_dot_)com             (ambiguous - return to sender)
mark(_at_)lucent(_dot_)com                       (looks up based on "id" field, see (1) above)
mark.horton/state=oh(_at_)lucent(_dot_)com
ou=bl0311100/all=y(_at_)lucent(_dot_)com         (broadcast to all in department)
tl=tmgr/loc=oh0012/all=y(_at_)lucent(_dot_)com   (broadcast to all tech mgrs in bldg oh0012)
state=nj/all=y(_at_)lucent(_dot_)com             (too many people - send to human moderator)

(all=y says the user intends to match more than one person and that this
is OK.  Upper/lower case is ignored.  . and _ mean the same thing, as do
: and /.)
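
To make the syntax rules above concrete, here is a minimal sketch in Python
of how the local part of an address might be split into directory terms.
It is not the POST implementation; the function name parse_local_part and
the "name" label for bare name patterns are made up for illustration.

```python
# Characters that mark a local part as a directory query rather than a handle.
QUERY_CHARS = set("._=/:")


def parse_local_part(local):
    """Split the local part of an address into (attribute, value) terms."""
    local = local.lower()                    # upper/lower case is ignored
    if not QUERY_CHARS & set(local):
        return [("id", local)]               # case (1): a plain handle
    # "." and "_" mean the same thing, and so do ":" and "/".
    local = local.replace("_", ".").replace(":", "/")
    terms = []
    for piece in local.split("/"):
        if not piece:
            continue                         # tolerate doubled separators
        if "=" in piece:
            attr, value = piece.split("=", 1)
            terms.append((attr, value))
        else:
            # A bare piece is treated as a name pattern; "name" is a
            # hypothetical label, not an attribute defined in the post.
            terms.append(("name", piece))
    return terms
```

With this sketch, "tl=tmgr:loc=oh0012/all=y" yields the three terms
tl=tmgr, loc=oh0012 and all=y, while "mark" yields a single id lookup.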

This is a very powerful system.  Most users depend on it to broadcast to
everybody in a building, or in a geographic area, or in an organizational
unit.  But if we decided to implement an attribute called shoesize, a user
could send mail to everyone with shoesize=8 without any special coding
or mods to the mail system.  We can't anticipate all the combinations
that users want, but with this system, we don't have to - they all work.

Note that this uses standard SMTP and is interpreted locally, so it does
not depend on special URLs or changes to somebody else's mail system.

Note also that this does *not* force mail to be routed through one machine
called "lucent.com".  "lucent.com" is a domain, not a machine.  Any machine
inside the domain can do the LDAP lookup and deliver directly.  Any machine
outside the domain will follow MX records to get into the domain at some
point, not necessarily always the same point.  Machines inside the domain
that don't understand the protocol will also follow (possibly different)
MX records.

And note that, because of the client/server nature of LDAP, it is not
necessary to have a separate copy of the directory on every mail server,
so you don't have to spend all your resources doing directory synch.
Have just enough LDAP servers to handle the load and be redundant - we
have 3 POST servers to handle hundreds of mail servers in Lucent.

What it does require, if you're going to use standards-based components,
is that the LDAP client (the MTA) and the LDAP server agree on a few things.
Most notably, they must agree on two standard attributes: the high-level
user-visible mail address (such as "mark(_at_)lucent(_dot_)com") and the
physical address specifying the mail server (such as
"mark(_at_)clipper(_dot_)cb(_dot_)lucent(_dot_)com").
The job of the MTA, even in the simplest case ((1) above), is to take a
high-level name, look it up in the directory, and get back the physical
address so it can route the mail there.  For example, the first could be
called "mail" and the second "MailForwardingAddress".  Or it could look
for attribute "id" value "mark", and still return a MailForwardingAddress.
At a bare minimum, we need to agree on these two attributes.
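
As an illustration of the lookup the MTA has to perform, here is a small
Python sketch that uses an in-memory list in place of a real LDAP server.
The attribute names "mail", "id" and "MailForwardingAddress" are the ones
suggested above; the function name and the second sample entry are invented
for the example.

```python
# Stand-in for the directory.  A real MTA would query an LDAP server instead.
DIRECTORY = [
    {"id": "mark", "mail": "mark@lucent.com",
     "MailForwardingAddress": "mark@clipper.cb.lucent.com"},
    # Invented sample entry, for illustration only.
    {"id": "mary", "mail": "mary@lucent.com",
     "MailForwardingAddress": "mary@mailhost.example.com"},
]


def forwarding_addresses(attr, value, directory=DIRECTORY):
    """Return the physical addresses of all entries whose attr equals value."""
    value = value.lower()                    # matching ignores case
    return [entry["MailForwardingAddress"]
            for entry in directory
            if entry.get(attr, "").lower() == value]
```

Looking up either mail=mark@lucent.com or id=mark returns the same physical
address, which is the point: the MTA only needs to know which attribute to
search on and which attribute to read back.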

We could go further and declare an RFC-xxxx which specifies the above syntax
but lets each implementation define its own attributes.  The RFC would have
to say something about broadcasts, and define the semantics of "and" and
"or".  (Our implementation is simple: multiple values of the same attribute
are "ored" together, different attributes are "anded".  This works well in
practice.)
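
The and/or rule maps directly onto an LDAP search filter.  The following
Python sketch shows one way to build the filter string from a list of
(attribute, value) terms.  It assumes the terms have already been parsed
out of the address, and the function names are made up; the escaping
follows the LDAP string filter rules for the characters \ * ( ) and NUL.

```python
def escape(value):
    """Escape characters that are special in an LDAP search filter."""
    out = []
    for ch in value:
        if ch in "\\*()\0":
            out.append("\\%02x" % ord(ch))   # e.g. "*" becomes "\2a"
        else:
            out.append(ch)
    return "".join(out)


def build_filter(terms):
    """OR together values of the same attribute, AND different attributes."""
    if not terms:
        raise ValueError("no terms to search for")
    grouped = {}
    for attr, value in terms:
        grouped.setdefault(attr, []).append(value)
    clauses = []
    for attr, values in grouped.items():
        parts = ["(%s=%s)" % (attr, escape(v)) for v in values]
        if len(parts) == 1:
            clauses.append(parts[0])
        else:
            clauses.append("(|" + "".join(parts) + ")")
    if len(clauses) == 1:
        return clauses[0]
    return "(&" + "".join(clauses) + ")"
```

For example, two loc values plus one tl value give
(&(|(loc=oh0012)(loc=nj0001))(tl=tmgr)): either building, but only tech
managers.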


Let me say a few words about broadcasts.  The normal behavior if you type
something ambiguous, like "m(_dot_)horton(_at_)lucent(_dot_)com", which matches
Mark, Mary, Matt, ..., is to reject it - we assume the user made an error and
should have been more specific.  So if the query matches more than 1 person,
we return it to sender (or, ideally, refuse it during the SMTP session so the
client can offer an opportunity to fix it).

But sometimes you want to broadcast to everybody whose title is "peon".
tl=peon(_at_)lucent(_dot_)com would be ambiguous and would bounce.  So we put in
an "all=yes" or "all=y" attribute, which forces the user to say "yes,
dammit, I really wanted to do that" and let the broadcast go through.
Thus, tl=peon/all=y(_at_)lucent(_dot_)com goes to all the peons.
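
The ambiguity rule can be sketched in a few lines of Python.  This is only
an illustration under the assumption that the query has already been parsed
into (attribute, value) terms and matched against the directory; the
function name and the return values are invented.

```python
def decide(terms, matches):
    """Decide what to do with the directory entries a query matched."""
    # The sender must say all=y (or all=yes) to allow more than one match.
    broadcast = any(attr == "all" and value in ("y", "yes")
                    for attr, value in terms)
    if not matches:
        return ("reject", "no match")
    if len(matches) > 1 and not broadcast:
        return ("reject", "ambiguous")       # return to sender
    return ("deliver", matches)
```

So tl=peon on its own bounces as ambiguous when it matches several people,
while the same query with all=y is delivered to every match.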

We've also found we have to guard against broadcast storms.  When you send
mail to 20,000 people by accident (and people do this routinely), if you
don't do anything to prevent it, you get the same old pattern:
A:      Hi, all!  I'm back!
B:      Welcome back, A!
C:      Who's A?
D:      Stop sending all this mail to 20,000 people!
E:      Take me off this mailing list.
F:      Take me off, too.
G-Z:    Me too.

Our solution to this problem is to implement a limit.  The limit has
a default (2500 in our case), and can be changed by authorized senders
or in the MTA configuration.  When a broadcast is expanded by the MTA,
the number of matches is counted.  If it exceeds the limit, we silently
forward the message to a configurable Internet e-mail address, the moderator.
(Another implementation is to just reject the mail, but there are often
legitimate needs to exceed the limits.)  The moderator address goes to a
person or group whose job is to read such messages and make a judgement
whether it's a reasonable message and audience.  For the 95% that are
reasonable, they are passed back to the MTA with a higher limit.  For
the mistakes, local policy is followed (we phone the sender and ask if
they really wanted to do that, usually they thank us profusely for
catching their mistake.)
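
Here is a rough Python sketch of the limit check, assuming the broadcast has
already been expanded into a list of recipient addresses.  The default of
2500 is the one we use; the function name and the moderator address in the
usage note are made up.

```python
DEFAULT_LIMIT = 2500


def route_broadcast(recipients, moderator, limit=DEFAULT_LIMIT):
    """Return the addresses the message should actually be sent to."""
    if len(recipients) > limit:
        # Too many matches: hand the message to the moderator instead
        # of delivering it, without notifying the sender.
        return [moderator]
    return list(recipients)
```

A message that expands to 3000 people goes only to the moderator, say
moderator@example.com.  When the moderator approves it and resubmits it
with limit=5000, the same call returns all 3000 recipients.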


Our system works very well in practice.  Lucent has about 160,000 people
in our directory.  We've been supporting this type of e-mail delivery for
over 10 years, in full production.  We had enough faith in the system to
decree two years ago that all employees use the handle(_at_)lucent(_dot_)com
address form, which always causes a directory lookup.  We haven't regretted it,
except that existing vendors of e-mail products don't support it as well
as our internally developed system.  Perhaps with an RFC we'll all be
able to reap the benefits of LDAP-based mail delivery.

        Mark