
Re: making mail traceable

2004-01-16 15:22:53

If we really want to make mail traceable, we need to do a bit more than
fix Received.  As I see it, we need:

- A message hash function that is invariant across the various kinds of
  munging that happen in mail transport, but still good enough for
  non-repudiation (though it probably won't be good enough to serve as a
  general-purpose signature)

- A new header field which associates the message hash, originator-id,
  timestamp, and originating ISP or organization, which is signed by that
  originating ISP or organization, and which is easily verifiable by
  recipients or MTAs
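To make the two items above concrete, here is a rough sketch, not a proposal for actual syntax. Everything named in it is hypothetical: the functions `canonical_hash`, `make_trace_field` and `verify_trace_field`, the set of covered header fields, and the `hash=...; oid=...` layout. It also assumes only a few kinds of munging (line endings, trailing whitespace, header-name case, whitespace folding). HMAC is used purely as a stand-in because it is in the Python standard library; a field that arbitrary recipients can verify would need a public-key signature instead.

```python
import hashlib
import hmac

def canonical_hash(headers, body):
    """Hash a message after normalizing details that transport often changes.

    headers is a list of (name, value) pairs. Assumption: only line endings,
    trailing whitespace, header-name case and whitespace runs are normalized.
    """
    lines = [line.rstrip() for line in body.replace("\r\n", "\n").split("\n")]
    while lines and lines[-1] == "":
        lines.pop()  # ignore trailing blank lines
    # Hypothetical choice: cover only fields that relays rarely rewrite.
    covered = ("from", "to", "subject", "date", "message-id")
    fields = {name.lower(): " ".join(value.split()) for name, value in headers}
    parts = [f"{name}:{fields.get(name, '')}" for name in covered]
    parts.append("")
    parts.extend(lines)
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()

def make_trace_field(msg_hash, originator_id, timestamp, org, key):
    """Build the value of a hypothetical signed trace field."""
    payload = f"hash={msg_hash}; oid={originator_id}; t={timestamp}; org={org}"
    # Stand-in only: HMAC needs a shared key, a real design would not.
    sig = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{payload}; sig={sig}"

def verify_trace_field(value, key):
    """Check that the signature matches the rest of the field."""
    payload, sep, sig = value.rpartition("; sig=")
    if not sep:
        return False
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)
```

With this sketch, a message whose line endings or header-name case were changed in transit still hashes to the same value, while any change to the body text or to the fields inside the trace value makes verification fail.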

Doesn't RFC1847 (security multiparts) already provide both of these, or
at least the framework sans the actual algorithm?

Only if everyone's user agents treated a security multipart containing a message semantically the same as they would the original message. With current user agents, having submission servers push messages down into security multiparts would change the way that messages were interpreted, processed, and presented by a significant number of intermediaries and MUAs.

- An originator-id separate from From, MAIL FROM/Return-Path, Reply-To
  or Sender that uniquely distinguishes the originator of the message
  from other message originators. It doesn't have to actually expose the
  originator's name, email address, account name, etc. - it could be a
  nonce as long as the originating ISP or organization could trace it to
  the actual originator within a reasonable time.
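As an illustration of the nonce idea, the sketch below has the ISP hand out random identifiers and keep a private table mapping them back to accounts. The class `OriginatorRegistry` and its methods are hypothetical names, and issuing a fresh id per call is an assumption of this sketch; an ISP could equally issue one id per account.

```python
import secrets
import time

class OriginatorRegistry:
    """Hypothetical per-ISP table mapping opaque originator-ids to accounts."""

    def __init__(self):
        self._by_id = {}

    def issue(self, account, now=None):
        """Return a fresh random id that reveals nothing about the account."""
        oid = secrets.token_urlsafe(16)
        issued_at = time.time() if now is None else now
        self._by_id[oid] = (account, issued_at)
        return oid

    def trace(self, oid):
        """ISP-internal lookup: (account, issue time), or None if unknown."""
        return self._by_id.get(oid)
```

Only the holder of the table can go from an id back to an account, which is the property the item above asks for.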

So you want something functionally like the ident protocol built into
the Received header?

There's at least a vague similarity between the ident protocol and one piece of the package I have in mind, but it might be confusing to make too many associations between the two.

- A way to ensure that messages get tagged with originator-id when they
  are injected into the mail system (e.g. ISPs blocking port 25 and/or
  MTAs refusing to accept incoming mail without originator-ids or with
  unverifiable originator-ids)

I agree in principle, I think.  But couldn't a Received already provide
this information if everyone would just do it, i.e., it's optional now?

The two are fairly different. You want to capture the originator-id at submission time, not at every hop. And you need protocol elements that Received (not being extensible) cannot convey. To me it makes more sense to start from whole cloth.

- If you really wanted to, you could augment Received or add a new
  trace field that recomputed the hash at each hop (to show if and where
  the message was corrupted in transport)

Why would you forward a message that was discovered to be corrupted?

Probably because the message might have been corrupted in a way that makes the hash invalid without actually changing the content of the message. Something tells me it will be difficult to write a canonicalization function that accommodates all of the various kinds of message and header munging that are out there.
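The "if and where" part of the per-hop idea can be sketched as follows, assuming each hop records the hash it computed in some trace field. The function name `first_divergence` and the `(hop name, hash)` pair layout are hypothetical. Note that a mismatch only says the hash changed at that hop, not whether the content meaningfully changed.

```python
def first_divergence(origin_hash, hop_hashes):
    """Return the first hop whose recorded hash differs from the origin's.

    hop_hashes is a list of (hop name, hash) pairs in transit order.
    Returns None when every hop saw the same hash as the originator.
    """
    for hop, seen in hop_hashes:
        if seen != origin_hash:
            return hop
    return None
```

A recipient could still choose to accept a message with a diverging hash, and use the hop name to decide whom to ask about it.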

This would give you a way to associate each message with an identifier
for the originator, issued by the originator's ISP or organization.
Then you'd need some way to ask that ISP or organization "is the guy
who sent this message trustworthy?" And they could say "as far as we
know, he doesn't have many abuse reports and he's been with us for
years" or "he just signed up yesterday" or "this is a trial account, we
have no billing information for him" or "we've had several hundred
abuse reports in the last 3 hours".
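The kind of answer described above might be modeled like this. It is only a sketch: the record `OriginatorReport`, its fields, the function `summarize`, and the thresholds (100 reports, 2 days) are all invented for illustration, and no query protocol is implied.

```python
from dataclasses import dataclass

@dataclass
class OriginatorReport:
    """Hypothetical facts an ISP might disclose about an originator-id."""
    account_age_days: int
    abuse_reports_last_3h: int
    has_billing_info: bool

def summarize(report):
    """Turn a report into a short statement; thresholds are arbitrary."""
    if report.abuse_reports_last_3h >= 100:
        return "many recent abuse reports"
    if not report.has_billing_info:
        return "trial account, no billing information"
    if report.account_age_days < 2:
        return "new account"
    return "established account, no unusual volume of abuse reports"
```

The point of returning coarse statements rather than account details is that the recipient learns how much to trust the sender without learning who the sender is.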

So you're looking for an extension to the ident protocol based on the
presence of a "string" inserted in a message by its original point of

Again, I think an analogy to ident would be confusing. You would not be asking this ISP to say "who is on the other end of this connection" - you'd be asking the ISP to tell you some information about the guy who sent the message with originator-id field "XXXXX". (Though it is conceivable that a first-hop MTA might want to use something like ident as a means to obtain originator-ids for messages that don't have them, that's not what I'm proposing now.)

But do we really want traceability? Or to put it another way, do we
really want to put hooks in the mail system that make mass
surveillance (by governments, or perhaps even by large companies or
unscrupulous ISPs) that much easier?

I'm sorry but I just don't see what you have in mind that is worse than
it is today,

Given the apparent aspirations of the current occupant of the US White House, things could get far, far worse than they are now. I'd far rather have spammers than Big Brother George any day.

Can you please elaborate on the "mass surveillance" you fear?

Have you read the Patriot Act lately?

Okay, I'll be more specific. If every message has an originator-id tag that can be traced to the origin, it becomes fairly simple for the US Government to insist that all US ISPs (and perhaps all foreign ISPs peering with US ISPs) give them a list of mappings from tags to more recognizable identifiers (such as credit card numbers), or that the ISPs generate tags in such a way that the USG can simply decrypt them to obtain those identifiers. Given the kind of stuff that is already authorized by existing laws (and, if not authorized, obtained by coercion of various kinds), this isn't much of a stretch. And everybody knows that terrorists use email...

Anyway, the reason my proposal allows originator-ids to be ephemeral is to make it hard for ordinary people to track messages - I believe anonymous speech is important - and also out of recognition that spammers can probably get ephemeral accounts anyway. The solution to ephemeral accounts is to provide a way for recipients to distinguish these from ones where the ISP really does know who sent the message. But I really don't know how to make messages more traceable on one hand and not enable more surveillance on the other.