Re: a header authentication scheme


<snip nontechnical discussion>
(I'm willing to respond to those points if you insist, but they
seem to me to be matters of opinion which bring nothing of value)

On Oct 30 2004, Bruce Lilly wrote:

That's the whole point of the discussion we're having. If you know how
to successfully forge the Processed field as I originally described it
and its variations as already discussed with other list members, please
give a concrete scenario.


Given a message with a valid Received field, it's trivial: simply
construct a Processed field accordingly and insert it where
desired, eliding any preexisting "processed" field that corresponds
to the Received field.  W/o any Received field, forge one and
insert it and a corresponding "processed" field where desired.


I'm afraid none of this is yet convincing as a successful attack. 

You must be able to forge the Processed field before the message
contains the pertinent Received field, otherwise you're not actually
forging anything of value. In fact, quoting existing Received fields
is legal, and marks your location correctly. 

If you forge an appropriate Received field as well, you still have no
control over the Received fields added by the subsequent receiving
systems, which will always have precedence. Nor can you assume that
the ultimate reader will trust your forged Received field, and hence your
associated Processed field.  So your forgery will be trivial to detect
at the destination, unless its trust policy is very permissive.
That's not much of an attack as it stands, if you rely on cooperation
from the victim.

If you simply rewrite an existing, presumably syntactically valid,
Processed field, then you are at the same point as in the previous
paragraph. Alternatively, it means that you have found a way to
compromise a notionally trusted mail system close to the
destination. In those cases, there are much more interesting things to
do than forge a Processed field.

I don't yet see any actual effective attack in any of your suggestions,
but see below.

Bear in mind the fact that not all message transfer involves SMTP,
and that only SMTP requires inserting a Received field (and
moreover, there is no enforcement of that requirement).


That's a good point. I believe the Processed field degrades gracefully
in this respect, since if there is no existing Received field to be
quoted, a Processed field cannot quote it. A later reader of this
Processed field will be immediately alerted to the fact by the absence
of an "auth-received" key value.
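
As an illustration of that degradation, a reader's check might look
like the sketch below. It assumes, purely for illustration, that
"auth-received" carries the uppercase hexadecimal SHA-1 digest of the
body of a quoted Received field; the digest scheme is not fixed by
anything said so far. The function name check_auth_received and its
three return labels are hypothetical.

```python
import hashlib

def check_auth_received(processed_params, received_fields):
    """Classify one Processed field against the message's Received fields.

    processed_params: dict of key/value pairs from one Processed field.
    received_fields:  list of raw Received field bodies in the message.
    Assumption: auth-received is the hex SHA-1 of a Received field body.
    """
    digest = processed_params.get("auth-received")
    if digest is None:
        # Nothing was quoted, e.g. no Received field existed at the time.
        return "no-quote"
    known = {
        hashlib.sha1(body.encode("utf-8")).hexdigest().upper()
        for body in received_fields
    }
    return "match" if digest.upper() in known else "mismatch"
```

A reader that gets "no-quote" or "mismatch" learns only that the field
does not vouch for its position in the trace; what to do with that is
left to the reader's own trust policy.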

Neither does the Processed field. Its keyword/value pairs will have
standard meanings which can be interpreted without the need for
configuration information.


Not so.  Given your example
   Processed: name="SpamAssassin"; location-ip="1.2.3.4";
       version="2.63"; function="spamcheck";
       auth-received="B17C07174B5E4546A2B04EB096E83FD7081936B8"; 
       result-tag="spam";
I cannot interpret (in any meaningful sense) the "result-tag='spam'"
part without knowing specific configuration information and run-time
environment information for the instance of "SpamAssassin" version
"2.63" from which the purported "result-tag" was allegedly obtained.
I know nothing about its spam threshold, the myriad of fudge factors
that are used to weigh various characteristics, whether any address
patterns are used as whitelists or blacklists and if so what those
patterns are, etc.


This makes no sense as an objection. The Processed field is a record
of what some other process computed about the message; it has no
other meaning.

All you need for interpretation is to know what this other process
intended the fields to mean, and that's to be detailed in a common
standard.

As long as processes follow the standard, the meaning will be clear,
viz.  "the process writing this field considers the message to be
spam". There is no need to second-guess why, how, etc. It's a record
of some other process' computation, for informational purposes. If that
process wants to express other characteristics, it will do so with 
appropriate key/values.
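
To show what "no configuration information needed" could mean in
practice, here is a minimal parsing sketch. It assumes the field body
is a list of key="value" pairs separated by semicolons, as in the
example quoted earlier in this message; the exact grammar awaits a
proper draft, and the names PAIR and parse_processed are hypothetical.

```python
import re

# One key="value" pair. Assumption: keys are letters, digits and hyphens,
# and values are double-quoted strings without embedded quotes.
PAIR = re.compile(r'([A-Za-z][A-Za-z0-9-]*)\s*=\s*"([^"]*)"')

def parse_processed(field_body):
    """Return the key/value pairs of a Processed field body as a dict."""
    return {key.lower(): value for key, value in PAIR.findall(field_body)}

example = '''name="SpamAssassin"; location-ip="1.2.3.4";
    version="2.63"; function="spamcheck";
    auth-received="B17C07174B5E4546A2B04EB096E83FD7081936B8";
    result-tag="spam";'''

params = parse_processed(example)
# The reader learns what the writer concluded, not how it concluded it.
print(params["name"], "says:", params["result-tag"])
```

Nothing in the parser depends on the filter's thresholds or whitelists;
it only recovers the statement the writing process chose to record.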

You're sidestepping the issue. I've given an example (Reply-To) where
you (and I and we all) act based on zero hard facts about the writer
of the Reply-To field. If we can act on unreliable fields such as
Reply-To, why can't we act on fields such as Processed? What's the
fundamental difference you perceive and are pointing out?


The fundamental difference is that the Reply-To field is specifically
provided as a means for the message author(s) to suggest a list
of mailboxes for responses. "Acting on" a Reply-To field presupposes
that one has already decided to make a response (prior to consulting
the field content) and therefore one presumably already
knows *why* he is responding and therefore has some idea of where
his response should be directed. He can choose to respond directly
to the author(s), completely ignoring Reply-To, he can choose to
send his response to some specific place without consulting Reply-To,
he can choose any number of places to direct his response, one of
which is the author's suggestion.  I.e. one makes a decision (to
respond), another decision (where to respond), and in a subset of
cases, might subsequently consult a Reply-To field which conveyed
the original message author's suggestion; if one does not choose
to respond, the presence, absence, or content of any Reply-To field
is simply irrelevant, and in any event is not involved in the decision
*to* respond or not.  You are proposing that some decision be
based on incomplete and inadequate information contained in a
"processed" field.   I.e. the field must exist and must have some
content before the relevant decision (a guess, really) can be made.


If I may borrow your language: the Processed field is specifically
provided as a means for filter processes to suggest facets of the 
message as observed during transit. 

A subsequent message reader can choose to file the message away,
completely ignoring any Processed fields, he can choose to file
the message away, taking as a suggested destination some expression
derived from the contents of a Processed field, or offering several
possible destinations if there are several Processed fields, etc.

There is nothing inherently in a Processed field which mandates 
particular decisions to be taken by any subsequent readers. It is 
strictly informational.
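
A reader along those lines could be sketched as follows. The mapping
from result tags to folders belongs entirely to the reader; the function
name suggest_folders, the "ham" tag and the folder names are made up for
the illustration, and each Processed field is assumed to be already
parsed into a dict of key/value pairs.

```python
def suggest_folders(processed_fields, folder_for_tag):
    """Return candidate folders, one per recognised result-tag, in order.

    processed_fields: list of dicts, one per Processed field in the message.
    folder_for_tag:   the reader's own mapping from result-tag to folder.
    Fields with no result-tag, or with an unknown tag, suggest nothing.
    """
    suggestions = []
    for params in processed_fields:
        folder = folder_for_tag.get(params.get("result-tag"))
        if folder and folder not in suggestions:
            suggestions.append(folder)
    return suggestions

# Hypothetical reader policy and two Processed fields that disagree.
policy = {"spam": "Junk", "ham": "Inbox"}
fields = [{"name": "SpamAssassin", "result-tag": "spam"},
          {"name": "other-filter", "result-tag": "ham"}]
print(suggest_folders(fields, policy))
```

An empty policy yields no suggestions at all, which is the case of a
reader who ignores Processed fields completely.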

You are proposing that some decision be based on incomplete and
inadequate information contained in a "processed" field.  I.e. the
field must exist and must have some content before the relevant
decision (a guess, really) can be made.


I'm sorry if I gave that impression. I merely intended to point out that
decisions are made all the time by humans and machines. If an
information-bearing field such as Processed exists, somewhere, sometime
it will be used for decision making. That does not mean that Processed
would have any authority whatsoever, beyond that which readers
individually place in it of their own will.

There are a multitude of X- headers used by filters, each with its own
syntax and interpretations, which are currently unparseable by all the
other filters.  For the subset of spam filters, a commonly adopted
format will allow better filter cooperation, and reduce the need for
future filters to reinvent variations on the same format.

Your proposed "processed" field differs from all of the above;
unlike Return-Path and Received trace fields, its content is
not well-defined.


Of course not, you'll have to wait for a proper draft.


You have missed the point.  A Received line "from" component
documents the domain name used by a client in an SMTP
transaction. That documents a fact about what was used. It
is not a value judgment and is in no way dependent upon
the configuration of the MTA that documents that fact. That
is the sense in which the existing trace field content is
"well-defined", and is not the case for your "result-tag" etc.
component.


I still don't see it. To paraphrase your request for a complete
record of the filter configuration: the Received line "from" component
is useless unless I have complete information about the state of the
DNS system in the world at the moment of the transaction, as well
as a complete record of which IP addresses are used in the world
at that point in time. So in this sense, the "from" component is
simply not well defined due to lack of information. 

-- 
Laird Breyer.