Re: a header authentication scheme


On Sun November 7 2004 22:47, Laird Breyer wrote:


Ok, thanks for the clarification. Can I summarize your position in the
following terms? Mail transport is in essence an end-to-end service which
cannot be trusted by anyone due to the potential for unethical
intermediaries to change the content of messages against users'
wishes.


No, that would be inaccurate. Message content can be signed
(at the originating endpoint, and the signature verified at the
receiving endpoint) to provide protection against undetected
modification in transit.
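
As a minimal sketch of that endpoint-to-endpoint check (not a real
signature scheme): the example below substitutes an HMAC over the message
body for a true public-key signature such as OpenPGP or S/MIME, purely to
keep it self-contained. The key, the function names, and the sample
bodies are all hypothetical.

```python
import hashlib
import hmac

# Assumption: a secret known only to the two endpoints. A real deployment
# would use a public-key signature (e.g. OpenPGP or S/MIME) instead.
SHARED_KEY = b"hypothetical-endpoint-secret"


def sign(body: bytes, key: bytes = SHARED_KEY) -> str:
    """Compute the tag at the originating endpoint."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify(body: bytes, tag: str, key: bytes = SHARED_KEY) -> bool:
    """Recompute the tag at the receiving endpoint and compare."""
    return hmac.compare_digest(sign(body, key), tag)


original = b"Meet at noon.\r\n"
tag = sign(original)

# Content altered by some intermediary while in transit.
tampered = b"Meet at nine.\r\n"
```

Here `verify(original, tag)` succeeds and `verify(tampered, tag)` fails,
so the modification is detected at the receiving endpoint without any
need to trust the systems in between.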

If this summarizes your points,


It doesn't.

It seems to me that the 
underlying assumption of mail is that people do at least trust the
transport infrastructure to faithfully replicate the sender's message
at the destination.


Transport is designed to be transparent, but it sometimes
fails to be for various reasons; there is at this time another
discussion on the ietf-822 list related to such a possible
failure.  Moreover, your proposal relates to content added
by an intermediary, which is a completely different
matter from "the sender's message".

My assumption, which can be challenged, is that all paths to B are
covered by a small number of transport agents which can be reasonably
labeled by B as being of type T.


And as I pointed out:

You are assuming an extraordinarily simplistic model which
simply doesn't conform to reality; a model where there are
only "M" and "T" and where each system is either one or the
other, invariant with message content and over time, with no
message transport paths through other mechanisms.

Instead, your examples so far invoke institutionalized corruption by those
computer systems. I don't find that very convincing as a counterexample.


"Corruption" as you are using it is not a technical term.  My
examples were intended to illustrate the fact that there may
be cases where some party labels what the recipient might
consider to be spam as non-spam or vice versa. I have
given examples where that may be intentional (specifically
to refute your implication that there is no reason for an
intermediary to do so); it is also possible due to incompetence,
difference in values, etc. If your proposition is that "all birds
are either black or white", I can show that to be a false
proposition by pointing to a single bird of a different color,
brown for example.  I do not need to produce every brown
bird, nor do I need to produce red, yellow, green, blue, or
gray birds as well, or any birds that have different colors at
different points in their life cycle; one counterexample serves
to expose the flaw in the proposition.  That's fundamental
logic, whether or not you "find that very convincing".

Instead, a user can feed sets of received messages to a datamining
tool which extracts the salient features which characterize his SMTP
neighbourhood.  For example, important hostname aliases and domain
literals will be automatically highlighted by their frequency of
appearance in received mail.  This requires a representative sample of
messages arriving at a single destination address, but no detailed
understanding of the mail server configurations and local network geometry.


If such a hypothetical tool existed, it would be of little help;
it may determine that there is some mix of messages that go
via system "foo", and that some of those are labeled "spam"
and others are labeled "non-spam".  That is of absolutely no
use in determining whether or not a particular message which
has passed through "foo" is labeled in accordance with the
recipient's values.
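
To make that concrete, here is a rough sketch of what such a tally might
look like, assuming the tool simply counts the "by" host in each Received
field against whatever label the message carries. The field name
X-Spam-Label, the host names, and the sample messages are hypothetical,
and the Received parsing is deliberately naive.

```python
import re
from collections import Counter
from email import message_from_string

# Hypothetical sample: two messages relayed by the same system "foo.example",
# carrying different values in an illustrative X-Spam-Label field.
SAMPLES = [
    "Received: from a.example by foo.example; Mon, 8 Nov 2004 10:00:00 -0500\n"
    "X-Spam-Label: spam\nSubject: one\n\nbody\n",
    "Received: from b.example by foo.example; Mon, 8 Nov 2004 10:05:00 -0500\n"
    "X-Spam-Label: non-spam\nSubject: two\n\nbody\n",
]


def relay_label_counts(raw_messages):
    """Count (relay host, label) pairs across a set of messages."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        label = msg.get("X-Spam-Label", "unlabeled")
        for received in msg.get_all("Received", []):
            # Naive extraction of the host named after "by".
            match = re.search(r"\bby\s+([A-Za-z0-9.-]+)", received)
            if match:
                counts[(match.group(1).lower(), label)] += 1
    return counts


counts = relay_label_counts(SAMPLES)
```

The result records one "spam" and one "non-spam" message for
foo.example. That is a frequency, and nothing in it says whether either
label agrees with what the recipient would have decided.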

But if the user has to run a local filter in order to determine whether
or not to trust a remote filter, what is the point of having the
remote filter in the first place?


Quite simply, diversity. Combining two or more filters with different
strengths is better (allows more accuracy) than choosing to use a single
filter with known limitations. It also allows specialization and
smaller resource usage.


Use of multiple filters does not require that any of them be
"remote".  Having a local filter and a remote filter does not
guarantee different "strengths" (for any definition of that
rather nebulous term).  I believe that you would have a hard
time showing that in general having both a local filter and a
remote filter uses fewer total resources than a local filter
alone; resource usage may well be distributed differently,
but is likely to be greater. Aside from that, smaller resource
usage -- even if that were the case -- would not be an
advantage if it led to incorrect results.

Filtering that takes place at the receiving endpoint is consistent
with the Internet Architecture.


I agree. It's just not reflective of reality, unless you pick various
intermediate points and call them endpoints, such as you did with the
example of mailing lists.


In a message sent from blilly(_at_)erols(_dot_)com to
ietf-822(_at_)imc(_dot_)org, the latter is unquestionably one endpoint
of that communication.  With regard to the Internet protocols, it is in
no way "intermediate".  Likewise a message sent from
owner-ietf-822(_at_)mail(_dot_)imc(_dot_)org to
laird(_at_)breyer(_dot_)com has those mailboxes as endpoints.
That some message content may be
the same due to re-mailing doesn't change those facts.  If
ietf-822(_at_)imc(_dot_)org were to vanish, that would break the first
path because one of the endpoints disappeared. Disappearance
of some intermediate system (in a properly-configured system,
e.g. with at least two MX hosts) would not break that
communication path.

MUAs have had spam filtering capabilities for years, starting with 
keyword searches in the subject line which had to be entered by hand.


The Subject field is not designed to be a spam indicator; it is
intended to be set by the author as an indication of the message
topic. However, there is no guarantee that authors will not abuse
the field -- if you expect spammers (and/or vested corporate
interests, and/or individuals or groups with some axe to grind:
politicians, political parties, governments, preachers, religious
groups, etc.) to always tell the truth, you are in for a great deal
of disappointment. Searching Subject fields alone will therefore
not be reliable for spam detection.  And mere detection is not
the same as filtering.
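
A small sketch of both points, assuming a hand-entered keyword list of
the kind described; the keywords, function names, and sample messages are
hypothetical. The first function only detects; the second acts on the
detection by withholding matching messages, which is what filtering
means.

```python
from email import message_from_string

# Hypothetical hand-entered keywords.
KEYWORDS = ("viagra", "lottery")


def subject_matches(raw: str) -> bool:
    """Detection only: does the Subject field contain a keyword?"""
    subject = message_from_string(raw).get("Subject", "")
    return any(keyword in subject.lower() for keyword in KEYWORDS)


def filter_inbox(raw_messages):
    """Filtering: keep only the messages that were not detected."""
    return [raw for raw in raw_messages if not subject_matches(raw)]


# The author of the second message chose a Subject unrelated to the content.
HONEST = "Subject: You won the lottery\n\nClaim your prize.\n"
DISGUISED = "Subject: Re: meeting notes\n\nClaim your lottery prize.\n"

kept = filter_inbox([HONEST, DISGUISED])
```

Only the honestly labeled message is caught; the disguised one passes
straight through, since nothing obliges its author to describe it
truthfully in the Subject field.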

Mozilla has filtering capabilities, but they are not very good compared
with the alternatives.


Mozilla has a variety of individual capabilities which can be
combined to perform true filtering at the MUA level.

Alternatives insert X-headers. Corporate filters add X-headers.


Adding cruft isn't "filtering". It makes the problems (transmission
bandwidth and storage capacity requirements) worse.  True
corporate filters (as opposed to "filters" that do no filtering) can
and do reduce the problems; but neither they nor any other true
filtering method have any need to add the field that you propose.