Re: a header authentication scheme


On Oct 28 2004, Bruce Lilly wrote:

On Wed October 27 2004 21:19, Laird Breyer wrote:

I doubt that. The current trend is well on its way to the following
model: ordinary, trustworthy people send spam to strangers without
knowing it.


No.  If they send spam, they are not trustworthy. If they run an
"operating" system with multiple known security vulnerabilities
without a properly-configured external firewall, they are not
trustworthy.


Now you're trolling ;-)

How does authentication help in that model? You'll know
*who* sent the spam, but you can't stop them all (for if you could, you
would have solved a good part of the virus problem).


Knowledge of precisely who sent it is a necessary part of a
remedy. [and applies regardless of the specific remedy; whether
e.g. that involves removal of a "virus" or disconnection of the
machine from the Internet or both]


That's a generality which is as likely true as false depending
on the problem. In fact, your mention of virus removal gives a simple
counterexample: there is no need to know who sent a virus to be able
to remove it from an infected system. There is also no need to know 
who sent it to be able to block it from entering other systems.

If you want to restrict yourself to non-clueless users, then how do
you verify that a person sending you mail isn't clueless?


For a person known to me?  That's a judgment call.

And will you 
blacklist the others?


If the offense is repeated, yes.


It rarely is. The supply of clueless users is big enough that each one
need only offend once, and you'll be blacklisting day in, day out.

So in particular it can sign its messages with the user's private key
if the user has one. Then you could trace the spam back to the user
and that's it.


That user *is* the spammer -- his machine(s), running software that
he has installed and which is under his administrative control, is the
source of the spam, and the machine(s) involved is/are the ones
that need to be disconnected from the internet to curtail the
problem.


Ok, serious point: disconnecting the spamming computer from the internet
is not always an option. Who can disconnect the machine? The relevant ISP.
Why should the ISP do so? 

Take a provider like AOL, and let's say 30% of its users at one time
are infected with a spam-sending trojan. Are they going to cut off 30%
of their customers for the dubious benefit of making life more
pleasant for people who aren't their customers? It's not exactly
increasing shareholder value.

The method proposed is not an oracle. It doesn't give a truth value
to the proposition "is the Processed field trustworthy?", instead
it gives a truth value to the proposition "given that the Received
line is trustworthy, is the Processed field trustworthy?"


Bad premises rarely lead to reliable conclusions. Not all Received
fields are "trustworthy".  Flawed logic also rarely leads to
reliable conclusions;


The bad premises are entirely yours to make or not. It's up to you whether 
you feel some Received lines are trustworthy (to you) or not.

It's perfectly ok to configure your software to not trust any 
Received lines at all, and therefore none of the Processed lines either.

Some other user, with different configuration, may elect to trust
some Received lines and not others, and therefore some Processed lines
and not others.

Both users can look at the exact same message and come to different
conclusions, as appropriate for each.
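
As a rough illustration of that last point, here is a minimal Python
sketch in which the only configuration is a set of trusted host names.
The function names, the host names, and the idea of keying trust on the
"by" host of the Received line are assumptions made for this example;
a real reader could use any criterion it likes.

```python
import re

def by_host(received):
    """Return the host named after 'by' in a Received line, or None."""
    match = re.search(r"\bby\s+([^\s;]+)", received)
    return match.group(1).lower() if match else None

def trusts_received(received, trusted_hosts):
    """A Received line is trusted only if its 'by' host is in the user's set."""
    return by_host(received) in trusted_hosts

def trusts_processed(associated_received, trusted_hosts):
    """Trust in a Processed line is inherited from its associated Received line."""
    return trusts_received(associated_received, trusted_hosts)

# Same message, two hypothetical users with different configurations.
received = ("Received: from mail.example.org by mx.example.net; "
            "Thu, 28 Oct 2004 10:00:00 +0000")
sceptic = set()                  # trusts no Received lines at all
pragmatist = {"mx.example.net"}  # trusts lines written by this one host
```

With these settings trusts_processed(received, sceptic) is False and
trusts_processed(received, pragmatist) is True.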

there is in fact no guarantee that a "processed" field or the
information that it purports to convey cannot be forged in the
presence of a valid Received field.


That's the whole point of the discussion we're having. If you know how
to successfully forge the Processed field as I originally described it
and its variations as already discussed with other list members, please
give a concrete scenario.

For example, take the variation we discussed where the Received line
is unfolded and hashed (with MD5, say), and the hash is quoted
in the Processed field. Describe a method which succeeds with high
probability in fooling a receiver into thinking a (forged) Processed
line was added after the topmost Received line seen by this receiver.
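
To make that variation concrete, here is a minimal Python sketch. The
keyword name received-md5, the space-separated layout of the elements,
and the function names are assumptions for illustration only; nothing
here is fixed by a draft. Whether this check can be fooled is exactly
the question being asked.

```python
import hashlib
import re

def unfold(header):
    """Unfold a header: remove each line break that is followed by whitespace."""
    return re.sub(r"\r?\n(?=[ \t])", "", header)

def received_digest(received):
    """MD5 hex digest of the unfolded Received line."""
    return hashlib.md5(unfold(received).encode("utf-8")).hexdigest()

def make_processed(received, **pairs):
    """Build a Processed field quoting the hash of the latest Received line.

    'received-md5' is a hypothetical keyword chosen for this sketch.
    """
    elements = ['received-md5="%s"' % received_digest(received)]
    elements += ['%s="%s"' % (key, value) for key, value in pairs.items()]
    return "Processed: " + " ".join(elements)

def verify(processed, received):
    """True if the Processed field quotes the hash of this Received line."""
    match = re.search(r'received-md5="([0-9a-f]{32})"', processed)
    return match is not None and match.group(1) == received_digest(received)
```

A receiver would call verify() with the Received line it believes was
topmost when the Processed line was written; a mismatch, or a missing
hash, means the Processed line cannot be tied to that Received line.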

That is an excellent answer. The MUA told you it found a 
piece of information you might wish to use, and you told it
that you don't trust that information. Mission accomplished!


No, I did not say that I "don't trust that information"; I said
that there is insufficient information to make any determination.
Citing insufficient information to make a determination is *NOT*
the same as making a particular determination!


The MUA is pretty dumb. It only wants to know if it should act in 
predefined ways on the extra information or not. I take it you would
choose the "No" option.

That's a straw man. None of the headers in any mail message, ever,
carry full configuration information.


No currently standardized field purports to convey any information
which requires detailed configuration information to be able to
interpret the conveyed information.


Neither does the Processed field. Its keyword/value pairs will have
standard meanings which can be interpreted without the need for
configuration information.

Do you or any recipient know how 
exim or Postfix were configured when they added their trace fields to
transported messages? How about the sender's MUA configuration when the
message was composed?


No need to know.  Trace field definitions (RFCs 821, 2821) are
generally clear about what is supposed to appear in those trace
fields; Received fields document the client host name as specified
in the SMTP session HELO or EHLO command, the host name of
the host adding the field, a time stamp in standard format, and
optionally some additional information, most of which is well-defined
(and none of which is essential).  Return-Path documents the sender
envelope address.  No MTA-specific configuration alters or in any
way affects the client's argument to the HELO/EHLO command,
for example.


(*)
The same will be true for a Processed field. A Processed field shall
consist of keyword="value" elements, with a number of predefined keywords
whose appropriate values are explicitly defined. Nonstandard keyword="value"
elements will be allowed, but discouraged. A reader of the Processed field
shall only act on predefined keywords, or all keywords at its discretion
and its own risk. A well-defined authentication based on quoting the latest
Received line at time of writing will be optional, but strongly encouraged.
Any reader of the Processed field will be strongly encouraged to only
read those Processed fields whose authentication can be verified, and
whose associated Received line it trusts by other means. Whether it truly 
does so in the end is at its own discretion of course.
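
A reader following these rules might look roughly like the Python
sketch below. The set of predefined keywords and the exact element
syntax are placeholders I have made up for illustration, since the
real list awaits a proper draft.

```python
import re

# Hypothetical predefined keywords; the real set is not yet specified.
PREDEFINED = {"received-md5", "spam-score"}

# One keyword="value" element; values may not contain a double quote here.
ELEMENT = re.compile(r'([A-Za-z][A-Za-z0-9-]*)="([^"]*)"')

def parse_processed(value):
    """Return all keyword="value" elements of a Processed field as a dict."""
    return dict(ELEMENT.findall(value))

def actionable(value, allow_nonstandard=False):
    """Return the elements a reader is willing to act on.

    By default only predefined keywords are kept; a reader may opt in to
    nonstandard keywords at its own discretion and risk.
    """
    pairs = parse_processed(value)
    if allow_nonstandard:
        return pairs
    return {key: val for key, val in pairs.items() if key in PREDEFINED}
```

A cautious reader would call actionable() only after the field's
authentication has been checked against a Received line it trusts.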

But you (or anybody) still routinely make decisions based on the
unreliable information written by the above programs.


No, trace fields are examined in rare cases, not routinely.
The specific information in those fields documents facts
about a particular SMTP session (client-supplied identification,
etc.); it does not purport to make a value judgment about
the message content.


Trace fields are routinely used by spam filters to help decide whether
to block the message or not. So unless a user's mail is not filtered
at any intermediate MTA location, a decision has already been made
based on this information, even before the message hits the user's
inbox.

For example, you 
decide to read messages addressed to you, you decide to reply to the
mailbox listed in a Reply-To field, etc., even though you have no idea
what configuration the software which created those fields was in.


Ultimately a human is responsible for setting recipient addresses
and Reply-To fields; they are set by a human user, either directly
or by configuration of an MUA acting on his behalf and for which
he bears responsibility (and for which no recipient needs any
configuration information).  In cases where Reply-To has been
(inappropriately) set by autonomous agents (e.g. misconfigured
list expanders), problems have resulted.


You're sidestepping the issue. I've given an example (Reply-To) where
you (and I and we all) act based on zero hard facts about the writer
of the Reply-To field. If we can act on unreliable fields such as
Reply-To, why can't we act on fields such as Processed? What's the
fundamental difference you perceive and are pointing out?


Your proposed "processed" field differs from all of the above;
unlike Return-Path and Received trace fields, its content is
not well-defined.


Of course not; you'll have to wait for a proper draft. I'm bouncing
ideas off the list for how best to deal with a sub-issue (proving writer
location). I thought that was clear. And incidentally I thank the list
for the valuable feedback.

Unlike Reply-To, it is not set by a human user.
Unlike all of the above, it purports to convey information,
including a value judgment, which is unusable without
additional detailed information which it does not convey.


I've given an outline of the structure of the Processed field in
the paragraph marked (*) above, which should address these points
except for the human element. I don't see why being a human rather
than a machine is significant.

-- 
Laird Breyer.