Re: a header authentication scheme


Ok, thanks for the clarification. Can I summarize your position in the
following terms? Mail transport is in essence an end-to-end service which
cannot be trusted by anyone due to the potential for unethical
intermediaries to change the content of messages against users'
wishes.

If this summarizes your points, then I'm not sure how productive such
an extreme position can be on this issue. It seems to me that the
underlying assumption of mail is that people do at least trust the
transport infrastructure to faithfully replicate the sender's message
at the destination. Otherwise there can be no communication at all.

On Nov 04 2004, Bruce Lilly wrote:

So we always have a model of message transport as follows:

A -> M1 -> M2 -> M3 -> ... -> T1 -> T2 -> ... -> B


Not necessarily.  There may well be multiple paths to B, N-1
of which do not pass through "T2".  "T1", "T2", etc. might
pass some messages w/o change, but may change some
fraction of messages (i.e. those messages not sympathetic
to the interests of the owner/operators of those systems).


I didn't expect this picture to specify a single path from A to B
which must always be used by all messages from A to B, rather it
represents the actual path taken by a given message. "T1" and "T2" are
whatever makes sense for that actual path. My assumption, which can be
challenged, is that all paths to B are covered by a small number of transport
agents which can reasonably be labeled by B as being of type T.
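
As a rough illustration of the picture above, the following Python sketch
walks the Received fields of one message from the top (the most recent hop)
downwards and stops at the first hop whose "by" host B has not labeled as
type T. The host names, the sample message and the function names are
invented for the example, and the regular expression is a simplification
which ignores comments and other legal Received syntax.

```python
import re
from email import message_from_string

# Hypothetical set of hosts which B has labeled as type T.
TRUSTED = {"mx.b.example", "inbox.b.example"}

# Simplified: take the token following "by" as the receiving host.
BY_HOST = re.compile(r"\bby\s+([A-Za-z0-9.\-\[\]:]+)", re.IGNORECASE)


def by_host(received_value):
    """Return the lowercased host named in the 'by' clause, or None."""
    match = BY_HOST.search(received_value)
    return match.group(1).lower() if match else None


def trusted_hops(raw_message, trusted=TRUSTED):
    """List the receiving hosts of the unbroken run of type T hops.

    Received fields are prepended, so the first one in the header is the
    hop closest to B. The walk stops at the first host not in `trusted`;
    anything below that point was written by, or passed through, type M.
    """
    msg = message_from_string(raw_message)
    hops = []
    for value in msg.get_all("Received", []):
        host = by_host(value)
        if host not in trusted:
            break
        hops.append(host)
    return hops


SAMPLE = """\
Received: from mx.b.example by inbox.b.example; Thu, 4 Nov 2004 10:00:03 +0000
Received: from relay.m.example by mx.b.example; Thu, 4 Nov 2004 10:00:02 +0000
Received: from a.example by relay.m.example; Thu, 4 Nov 2004 10:00:01 +0000
From: a@a.example
To: b@b.example
Subject: test

body
"""

print(trusted_hops(SAMPLE))
```

Here only the top two fields count as written by type T; the third names a
receiving host B has not labeled, so it and everything below it are ignored.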

A is the sender of the message, B is the destination, M represents
computers which are untrusted by B, and T represents computers which
are trusted by B and add a Received field.  None of the computers of
type T will ever perform either a) or b) above, unless they have been
compromised.


No, repeating "unless they have been compromised" doesn't
make it true.


Instead, your examples so far invoke institutionalized corruption by those
computer systems. I don't find that very convincing as a counterexample.

Computers of type M may well perform a) or b). It is up 
to B to only trust Processed fields associated with computers of type
T.


You are assuming an extraordinarily simplistic model which
simply doesn't conform to reality; a model where there are
only "M" and "T" and where each system is either one or the
other, invariant with message content and over time, with no
message transport paths through other mechanisms.


No, I'm expressing the essential framework needed for the scheme in the
simplest terms I can think of. I accept the criticism that a message
being sent from A to B without using SMTP at all would be outside the scope
of consideration.

But are you really saying that a given user B
cannot, by examining a sample of past mail messages he personally received,
deduce the form of the Received fields added by the computers I've labeled
T above?


It certainly depends somewhat on B's knowledge and experience.
I would estimate that for > 99.9% of email recipients, a *valid*
deduction would be extremely unlikely; probably 99.5% have no
clue what a Received field is in the first place.  Judging by illegal
syntax in many generated Received fields, it's clear that a large
fraction of those who think that they know what a Received
field is are simply wrong.  On top of that, there are hostname
aliases, domain literals, etc. which require an additional
understanding of the DNS.


I'm sorry, this is my fault for badly expressing myself. Asking
humans to perform header analysis individually is not of any great value.

Instead, a user can feed sets of received messages to a datamining
tool which extracts the salient features which characterize his SMTP
neighbourhood.  For example, important hostname aliases and domain
literals will be automatically highlighted by their frequency of
appearance in received mail.  This requires a representative sample of
messages arriving at a single destination address, but no detailed
understanding of the mail server configurations and local network geometry.
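
To make this concrete, here is a minimal Python sketch of such a tool. It
counts how often each receiving host appears in the Received fields of a
sample of messages and reports the hosts seen in nearly all of them. The
function names, the 0.9 threshold and the sample data are assumptions made
for the example, and the "by" regular expression is a simplification. The
counts say nothing about whether a given field is genuine, only how often
it appears.

```python
import re
from collections import Counter
from email import message_from_string

# Simplified: take the token following "by" as the receiving host.
BY_HOST = re.compile(r"\bby\s+([A-Za-z0-9.\-\[\]:]+)", re.IGNORECASE)


def receiving_host_counts(raw_messages):
    """Count, per host, the number of messages whose Received fields name it."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        hosts = set()  # count each host at most once per message
        for value in msg.get_all("Received", []):
            match = BY_HOST.search(value)
            if match:
                hosts.add(match.group(1).lower())
        counts.update(hosts)
    return counts


def frequent_hosts(raw_messages, min_fraction=0.9):
    """Return the hosts which appear in at least min_fraction of the sample."""
    raw_messages = list(raw_messages)
    total = len(raw_messages)
    if total == 0:
        return []
    counts = receiving_host_counts(raw_messages)
    return sorted(h for h, n in counts.items() if n / total >= min_fraction)


def make_sample(relay):
    """Build a toy message which passed through `relay`, then mx.b.example."""
    return (
        f"Received: from {relay} by mx.b.example; Thu, 4 Nov 2004 10:00:02 +0000\n"
        f"Received: from a.example by {relay}; Thu, 4 Nov 2004 10:00:01 +0000\n"
        "From: a@a.example\n"
        "To: b@b.example\n"
        "\n"
        "body\n"
    )


SAMPLES = [make_sample(r) for r in
           ("relay1.m.example", "relay2.m.example", "[192.0.2.1]")]

print(frequent_hosts(SAMPLES))
```

In this toy sample only mx.b.example is common to every message, so it is
the one host highlighted; the relays and the domain literal each show up
once and fall below the threshold.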

But if the user has to run a local filter in order to determine whether
or not to trust a remote filter, what is the point of having the
remote filter in the first place?


Quite simply, diversity. Combining two or more filters with different
strengths is better (allows more accuracy) than choosing to use a single
filter with known limitations. It also allows specialization and
lower resource usage.
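
One way to picture the combination, purely as a sketch: suppose the remote
and the local filter each report a spam probability for the same message.
Under the assumptions (mine, for this example) that the two filters err
independently and that spam and non-spam are equally likely beforehand, the
two numbers can be merged with the usual naive Bayes rule. The function
name and the figures below are made up.

```python
def combine(p_remote, p_local):
    """Merge two spam probabilities, assuming independence and equal priors."""
    spam = p_remote * p_local
    ham = (1.0 - p_remote) * (1.0 - p_local)
    if spam + ham == 0.0:
        # One filter says "certainly spam", the other "certainly not":
        # the evidence cancels, so report no opinion.
        return 0.5
    return spam / (spam + ham)


# Two filters which each lean towards spam reinforce one another.
print(combine(0.9, 0.9))
# A filter with no opinion (0.5) leaves the other filter's verdict alone.
print(combine(0.5, 0.8))
```

The point is only that the remote verdict becomes one input among several,
which is also why it matters to know which machine actually produced it.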

Filtering that takes place at the receiving endpoint is consistent
with the Internet Architecture.


I agree. It's just not reflective of reality, unless you pick various
intermediate points and call them endpoints, as you did with the
example of mailing lists.

* years ago, things like X-RBL etc. were sometimes inserted by ISPs;
   some users complained about them (apparently believing that it
   involved eavesdropping, which was not the case), but they were
   largely useless because of DHCP and other types of shared IP
   addresses, shared IP address blocks, etc.


Which doesn't mean they're no longer used. I don't care what individual schemes
ISPs come up with next; I want to see whether their schemes can be traced
back to them through a lightweight method.

* About 6 months ago, my ISP tried to implement message mangling
   w/o user input or consent.  That quickly went away after I
   complained about a legitimate mailing list message which was
   modified by the ISP (among other things, "*Possible Junk Mail* "
   was prepended to the message Subject field).  That *was*
   eavesdropping; it was forgery as well.


Good for you. Judging by the popularity of Gmail, eavesdropping isn't 
much of an issue to many people, nor is forgery. The world would be 
better if more people believed in privacy as you do.

* A large number of MUAs have spam filtering capability


MUAs have had spam filtering capabilities for years, starting with
keyword searches in the subject line which had to be entered by hand.
Mozilla has filtering capabilities, but they're not very good compared
with the alternatives. Alternatives insert X-headers. Corporate
filters add X-headers. Who did what? Back to square one.

-- 
Laird Breyer.