spf-discuss

possibilities for 2822 (was SPF implementations)

2005-08-17 11:19:12
From: Frank Ellermann [mailto:nobody(_at_)xyzzy(_dot_)claranet(_dot_)de]
Sent: Monday, August 15, 2005 7:51 PM


Seth Goodman wrote:

RFC2821 and RFC2822 do define and use those keywords

Yes, 2821 > 2119.
STD 10 is 821, 821 < 2119, and STD 3 is 1123, 1123 < 2119.

STD 3 contains another RfC, but we're only interested in
1123 as far as SMTP is concerned.

Together, they replace 821, 822, 974 and 1869 plus update
and clarify 1035 and 1123.

Propose to replace, both are at PS, not at full standard.

All good points.  I keep forgetting that 2821/2822 are not yet full
standards, but you are still better off following those than 821/822.  As an
example, source routes really are deprecated, even if 2821 is only a
proposed standard.  That's not all, as you probably know better than me.


<...>

[RFC2821, section 4.4, Trace Information]
   It is possible for the mailbox in the return path to be different
   from the actual sender's mailbox, for example, if error responses
   are to be delivered to a special error handling mailbox rather
   than to the message sender.

Yes, we discussed this.  Note "other mailbox", as long as it's
in the same "MON" (mail originating network, a term created by
Keith) as the "originator" I've no problem with it.  Only if
it's elsewhere we're in the known 2821-deep-shit.

the standard clearly allows that.

The _proposed_ standard.  Otherwise we're back to 1123.  They
screwed up royally, everybody sees it in his mailbox or his log
files.

They _really_ need to straighten this out and use a consistent meaning for
MAIL FROM throughout the documents.  The new proposed standards don't seem
to make progress in that regard, but maybe with your involvement it will
get better.  This has been a source of endless and useless bickering for
years.



I think we can safely say that there is never a valid reason
for the domain to be different in the return-path and From:
headers.

No, when I sent MAIL FROM:<####(_at_)hamburg(_dot_)de> I still used my
normal 2822-From: nobody(_at_)xyzzy  What I do in the mail (DATA)
is nobody's business but mine (and of the receiver).

And of course error reports went to ####(_at_)hamburg(_dot_)de because
that was the route I used.

Well, I partially agree, but I think there ought to be more of an enforced
relationship between MAIL FROM and _something_ in the 2822 headers than
presently exists in common practice.  Exactly what is debatable, but see my
comments below.



if mail from A does not make it to B, then the very last thing
you need to analyze this problem between A and B is a third
system MAIL FROM C.

Agreed.  This is totally screwed up, but permitted.

Okay, then set A = ####(_at_)hamburg, B = dont.care, I can then
still use 2822-From: nobody(_at_)xyzzy  The point is that I really
submitted at the MSA of A.  Then it's okay.

But if I'd use MAIL FROM xyzzy submitted at the MSA of A it
would match the 2822-From, but it won't work for several
reasons:

- the MSA of A won't let me use an unknown MAIL FROM
- my own xyzzy sender policy would FAIL for all IPs of A
- it's stupid if I'm interested in errors between A and B
- it's not what STD 10 intended, it's "bounces to" but no
  "return path" / "reverse path" / MAIL FROM / originator

<...>

Please don't say "you" and "Sender-ID drafts" in the same
sentence, thank you very much.

Okay, I thought you started to "reinvent" SenderID with some
minor twists.  IMHO they did the best they could do with the
2822 header fields.  It's not good enough, but I also see
no way to make it better.  Not below DKIM or William's ideas.

I'm glad you understand that I am _not_, in any way, sense, stretch of the
imagination, manner of thinking, remotely considering, possibly toying with
incorporating any of the ideas or any other language only a lawyer could
love, trying to breathe any life into that dead dog known as SID.  I won't
bother preaching to the choir on this.

What I am saying is what we all have known since SPF stepped in to fill the
void created by the non-adoption of RMX.  SPF stops domain forgery of MAIL
FROM (at least when combined with some mechanism to permit forwarding, and I
like SES best for that).  2822 forgery is still possible and facilitates
fraud and phishing.  That was out of scope for SPF, but it doesn't mean we
can't think about it.  This discussion is about how we might extend SPF most
efficiently to address this.

The primary solutions proposed to solve this problem are (in alphabetical
order to avoid bias):

1) DKIM

2) GPG

3) Metasignatures

4) SES

5) S/MIME

The predominant view seems to be something like this:

   "This is out of scope for SPF.  SID tried and failed to
   solve this problem in a manner analogous to SPF.  GPG is
   not likely to be adopted by ordinary users.  The S/MIME
   model is based on  purchasing a certificate from a trusted
   commercial CA.  Who ever heard of metasignatures or SES,
   anyway?  It looks to me like DKIM will solve this problem,
   at least that's what the smart people working on it say,
   so I don't need to worry about this problem."

Well, we can take that view, but it comes with a high price tag.  For those
among us who have not yet looked it over, DKIM is an RSA signature scheme
that uses DNS to distribute the signers' public keys, while the signature
itself travels in a header of the message.  Despite what the vocal
(to put it mildly) proponents of this technology say, this puts an
_enormous_ computational load on the recipient.  It is inherently prone to
replay, which, combined with the high computational load, makes it an
irresistible DDoS vehicle as the amplification factor is enormous.  The
latter fact was only recently realized by these "smart people", even though
they were apprised of this over a year ago.  Make no mistake:  DKIM is a
juggernaut similar in trajectory to SID.  It is there because a number of
people decided that this was the solution before seriously considering the
problem.  Exactly like SID, they wave off any problems that others discover
and bully them until they shut up.  As if that weren't enough, there is now
an IPR issue that Yahoo has raised.  Deja vu?

The amount of effort that a recipient has to deploy to validate all these
RSA signatures and put up with replay attacks should cause a rational person
to ask the question, "isn't there a simpler way to do this?"  Fortunately,
there are several, but there is a lot of politics between the better
technical solutions and adoption.  I suggest that unless we start looking at
this soon, DKIM will be deployed and we will be stuck with an undesirable
solution.

First of all, let's look at what problem we are trying to solve.  We can
validate the MAIL FROM domain using SPF.  Despite the issues of softfail,
broken SPF records, etc., SPF (when combined with a suitable forwarding
mechanism) can unequivocally authenticate the MAIL FROM domain.  This is
limited only by the extent to which providers permit cross-customer forgery,
whether they let you submit messages remotely to your domain's MSA, and
recipients' willingness to reject on SPF failures.  Overall, we tend to
believe this will come to pass.

The nature of SPF authentication is ephemeral.  That is, the SPF result is
valid only at the time of message transmission.  The longer you wait from
that instant to the time you check the SPF result, the less sure you are of
its veracity.  Sending arrangements change and so do SPF records.  What was
valid last week may not be this week.  In this respect, it is completely
different from persistent signature schemes like GPG, where signatures can
be revoked for future use but past ones cannot be repudiated.  This is not a
problem.  We don't need a persistent signature scheme to validate ordinary
email.  Validating it once upon reception is sufficient.

Similarly, does the 2822 header validation have to survive remailing, as
with end user inline forwards or mailing lists?  I propose that the answer
is no.  Remailing means the message has a new originator.  If you know the
new originator, you can trust that they authenticated the message when they
received it (or know that they didn't bother; either way, you know).  You
can't go any farther back than that, but that is good enough for ordinary
email.  If you can accept this limitation, the authentication mechanism can
be a lot simpler than DKIM.

If you really need authentication that survives multiple remailing events,
you are getting into the problem space that GPG and S/MIME solve extremely
well.  This is also a tiny percentage of actual email traffic, so a simpler
solution is preferable for the majority of the traffic.  We also need a
method that will allow rejection of many forgeries before DATA, and with a
minimum of overhead.  DKIM can only reject a message as forged after you
have received the entire message body and validated the signature.  I think
we can do much better.

Now let's look at where we are once the MAIL FROM domain generates an SPF
pass.  The next step in the authentication process is to allow the SMTP
client to enter the DATA phase and examine the 2822 headers.  That is, in
general, much harder, since there are potentially many headers.  Is it
really necessary to authenticate all the originator headers?

To answer this, we have to go back and examine what MAIL FROM really means.
It apparently has more than one meaning.  One thing we can hopefully all
agree on is that it is the address that error notices concerning delivery go
to.  In this sense, it means "bounces-to".  However, it is also, and I would
argue foremost, the originator address, and it MUST be a single address.  To
take common practice of mailing lists and a few other uses into account,
let's relax this a bit and say that it at least contains the domain of
the originator.

Some people may still adhere to the strict "bounces-to" definition, but that
clashes with the assumptions behind SPF and lots of sections of the RFC's.
SPF clearly assumes that MAIL FROM contains the "originator" domain.  The
newer proposed standards go back and forth between the two meanings for this
address, making it pretty clear it has both functions.

By "originator", I don't mean original author, but the single party
responsible for injecting the message into the message stream.  That could
be a remailer, as in a mailing list or an end user that forwards a message
that was delivered to their MUA.  Hopefully, this view doesn't cause too
many arguments.  I will assume from here on that MAIL FROM means both
"bounces-to" and "originator".

Next, we have to decide what the 2822 From: really means.  It is often the
originator, but not always.  People can put various identities in that
header and it doesn't have to correspond with the originator address.
However, if it is anything other than the originator address, it would be
forgery were it not for the existence of the Sender: header, which is
designed explicitly for this case.  Sender: is used when the originator (in
the same sense as the originator meaning of MAIL FROM) is not listed in
From: (also when there is more than one From: address, but that rarely
happens).  We expect mailing lists to use this convention, we ask greeting
card companies to use this convention, so why don't we expect individuals to
abide by the same rules?  Probably because the most common MUA's don't
support it, but there are ways around this.  For the moment, I will ignore
the Resent-*: headers, as they are infrequently used (even though they are
still in all the standards and proposed standards).

If an originator always put their address in From: or, when they want to
use a foreign From: domain, in Sender: (and please hold the bellowing until
you have thought about this for a few moments), authenticating the MAIL FROM
would automatically authenticate the 2822 headers.  This would have a
tremendous number of advantages.

1) MS MUA's, the most ubiquitous in use, _do_ show Sender: if you view the
message.  This would prevent someone from creating a usable phish with one
MAIL FROM address and a completely different From: address.

2) The SPF check is the only authentication needed.

3) You could reject, or accept and prominently flag, any message where the
MAIL FROM passes SPF but that domain does not appear properly in From: or
Sender: as a likely forgery/phish.
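
The check in item 3 can be sketched roughly as follows.  This is only an
illustration under stated assumptions: the function names, the result strings
("aligned", "suspect", "not-checked") and the example addresses are
hypothetical, the SPF result is assumed to have been computed elsewhere, and
"appears properly" is interpreted simply as "the MAIL FROM domain matches the
domain of some address in From: or Sender:".

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr


def domain_of(address):
    """Return the lower-cased domain of one address, or '' if there is none."""
    _, addr = parseaddr(address)
    return addr.rpartition("@")[2].lower() if "@" in addr else ""


def header_domains(msg, name):
    """Domains of every address found in all headers called `name`."""
    pairs = getaddresses(msg.get_all(name, []))
    return {addr.rpartition("@")[2].lower() for _, addr in pairs if "@" in addr}


def check_originator(mail_from, spf_result, raw_headers):
    """Hypothetical helper: compare an SPF-validated MAIL FROM with From:/Sender:."""
    # Only an SPF pass tells us anything about the MAIL FROM domain.
    originator = domain_of(mail_from)
    if spf_result != "pass" or not originator:
        return "not-checked"
    msg = message_from_string(raw_headers)
    seen = header_domains(msg, "From") | header_domains(msg, "Sender")
    # Reject, or accept and flag, when the validated domain is missing.
    return "aligned" if originator in seen else "suspect"
```

A receiver would call this after the headers arrive in DATA, for example
check_originator("abcd@hamburg.de", "pass", headers), and treat "suspect" as
a likely forgery/phish.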

Now, what do we give up by requiring the originator address to appear in
either From: or Sender:?  Not much, IMHO.  What are the actual use cases
where this would cause problems?  I think there are precious few, if any.
If we can think of any, then _only_ those messages need further 2822
authentication.  Even these few exceptions, if there are any, can be handled
in a more lightweight manner than DKIM.  Let me repeat Frank's example from
above.

No, when I sent MAIL FROM:<####(_at_)hamburg(_dot_)de> I still used my
normal 2822-From: nobody(_at_)xyzzy  What I do in the mail (DATA)
is nobody's business but mine (and of the receiver).

And of course error reports went to ####(_at_)hamburg(_dot_)de because
that was the route I used.

Frank submits outgoing mail to the hamburg.de MSA and wants bounces to come
back to the hamburg.de MX.  He would like to use various From: addresses at
domain xyzzy.  Presumably, he sets Reply-to: so that people can reply to any
address that he designates.  There is no restriction on Reply-to:.  Since
Frank, or more correctly, hamburg.de, is the originator of the message, the
standards say that he should put this address in a Sender: header because
xyzzy is not the originator.  In effect, hamburg.de is sending a message on
behalf of xyzzy and it should say so.  Does Frank lose anything by putting
<####(_at_)hamburg(_dot_)de> (or <abcd(_at_)hamburg(_dot_)de>) as Sender:?
The only thing I can think of is the partial anonymity he created with his
From: address.  The
recipient still can see the Return-path: header right at the top of the
list, so the secret is already out.

What if he submits this message to a mailing list?  Let's say the mailing
list does an SPF check on incoming mail.  The MAIL FROM passes SPF, so the
mailing list MX permits hamburg.de to go in to the DATA phase and prepends a
Received-SPF: header.  Upon seeing hamburg.de in the Sender: header, the
mailing list MX has just authenticated that the 2822 information is not
forged.  The mailing list MX accepts the message at the end of DATA and puts
this 2822 SPF pass in another Received-SPF: header (or adds to the existing
one).

When the mailing list distributes Frank's post to the list, it actually
should use the Resent-From: header, since there is already a Sender: header.
That would be compliant with the standards, but not how mailing lists
generally work.  If mailing list software stays exactly how it is today, the
list would keep the From: and put the list owner address in Sender:,
overwriting Frank's originator address (it might add another Sender: header,
I'm not sure).

Assuming the mailing list software overwrites Frank's Sender: header with
their own address, all list recipients will receive a message with MAIL
FROM:<VERP-stuff(_at_)list(_dot_)com>, which passes SPF.  Their MX then allows
the list MTA to go on to DATA and prepends a Received-SPF: header.  When the
MX sees Sender:<list-owner(_at_)list(_dot_)com>, it considers the 2822 headers
authenticated and adds another Received-SPF: header (or adds to the existing
one).  The end user sees Frank's anonymized From: address and hopefully the
Sender: address from the list.  If they are using MS MUA's, they will see
"From:<list-owner(_at_)list(_dot_)com> on behalf of <nobody(_at_)xyzzy>".
Regardless of the MUA, the end user can always look at the full headers and
see that the message was submitted to the list by <####(_at_)hamburg(_dot_)de>.

Even if Frank put in a non-functional address in From:, the recipient knows
that the mailing list did an SPF check when Frank sent the post and both the
MAIL FROM and 2822 headers passed.  They know the mail was a valid
submission from a domain and not a forgery.  However, they don't know what
that original domain was unless they examine the full headers.

This is unfortunate, because if Frank were of the mind to do so, he could
have put <CEO(_at_)BankOfAmerica(_dot_)com> as the From: address.  One solution
to that is to have mailing lists leave Sender: headers alone and add
Resent-From: when there is a Sender:.  MUA's would then have to display
Sender: plus all Resent-*: headers.  In this case, an MUA might display this
in MS style as, "From:<list-owner(_at_)list(_dot_)com> on behalf of
<####(_at_)hamburg(_dot_)de> on behalf of <CEO(_at_)BankOfAmerica(_dot_)com>".
This is a bit tedious to read, but it shows you exactly what happened and
defeats Frank's attempt to achieve financial security.  Though this would
effectively prevent phishing, I'm afraid this
has between slim and no chance as it requires both mailing lists to change
their use of headers and MUA's to change their UI's.  It's too bad because
it is all standards compliant, but there are too many adoption hurdles.
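
To make the display idea concrete, here is a rough sketch of how an MUA might
build that "on behalf of" line.  It assumes the list added Resent-From: and
left Sender: alone, as described above; the function name and the sample
addresses are hypothetical, and no existing MUA is claimed to work this way.

```python
from email import message_from_string
from email.utils import getaddresses


def on_behalf_of_chain(raw_headers):
    """Hypothetical helper: most recent resender first, original From: last."""
    msg = message_from_string(raw_headers)
    chain = []
    # Resent-From: headers are prepended, so the newest comes first.
    for name in ("Resent-From", "Sender", "From"):
        for _, addr in getaddresses(msg.get_all(name, [])):
            if addr and addr not in chain:
                chain.append(addr)
    return "From:" + " on behalf of ".join(f"<{a}>" for a in chain)
```

With Resent-From: from the list, Sender: from the submitting domain and a
foreign From:, this yields the three-step chain described above.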

Of course, if nobody used the Resent-*: headers that most MUA's don't
support and fewer display by default, and simply rewrote the Sender: header,
we would still always know the party responsible for the present message.
That may well be good enough.  Frank has already pointed out that he
considers the whole set of Resent-*: headers essentially FUBAR, though I
think they make sense in a formal way.  Another advantage of requiring the
MAIL FROM domain to appear in either From: or Sender: is that it doesn't
trample on any MS IPR.  We are not using the complex chain of originator
headers they specify, nor are we determining a PRA from them, nor are we
fetching a record from DNS to validate a PRA.  We have already validated the
MAIL FROM domain using SPF.  All we are adding is the requirement that the
validated MAIL FROM domain appear in either From: or Sender:.  I think this
might fly.

The only things we've lost by insisting that the From: or Sender: domain
match the MAIL FROM domain are one anonymizing mechanism and a clear chain
of remailing events in the rare case that intervening MUA's or MTA's
actually support this.  We still have the Received: headers, so the path
information is not lost.  In a time when we have to authenticate the origin
of messages in order to assign responsibility for abusive behavior,
trivially achieved anonymity may be a necessary casualty.

--

Seth Goodman