RE: RE: SPF: Not just a clever idea

From: Aredridel
Sent: Wednesday, June 09, 2004 10:00 AM


On Tue, 2004-06-08 at 19:54 -0500, Seth Goodman wrote:

From: Mark Shewmaker
Sent: Tuesday, June 08, 2004 7:12 PM


On Tue, 2004-06-08 at 18:28, administrator(_at_)yellowhead(_dot_)com 
wrote:


The SUBMITTER document was even more informative. It doesn't say you have
to process the RFC 2822 From: (I am dead set against processing anything
                                                     ^^^^^^^^^^^^^^^^^^^
after DATA), although in early implementations you would probably have to
^^^^^^^^^^
just to get the information at times.


I do not understand *universal* objections to after-DATA processing.  I
do not think it is entirely rational.


I want to second Mark's position on this.  A more reasonable requirement is
to avoid doing _any_ expensive tests during the SMTP session, whether before
or after DATA.  While we want to reject everything possible before DATA,
adding an inexpensive test after DATA that permits rejection at the end of
DATA is worthwhile.  Expensive is a subjective term, so here's what it means
to me:

- Fetching and parsing XML records appears to be incompatible with the
real-time requirements of the SMTP session.  IMHO, this is just as
undesirable during the DATA phase as before it.


With the appending and prepending of XML namespace attributes, the most
expensive part of the XML never has to be transmitted: it is merely
implied. That's a big boon, a big plus over the Caller-ID draft. The
remaining XML is a bit more verbose than SPF's syntax, though not by
much. It will only affect the cases near the edge anyway, which is
avoidable through a multiple lookup, just as in SPF. It just lowers the
bar by a very few tens of characters.

XML really isn't that much of a burden. It's really surprising, and the
MARID improvements to Caller-ID have helped a lot. If you want to avoid
a full XML parser, a minimal subset for the spec will do a lot, so
that's not really an issue, either. It's about as bad as SPF in the
first place.


I guess that would surprise me.  Implementing a minimal subset of an XML
parser would deprive XML proponents of the extensibility that was the reason
for XML in the first place.  If you're going to limit the extensibility of
XML, is there really any advantage over SPF syntax for the purpose at hand?

- Requiring all 2822 checks to be expressed as an XML record is tantamount
to saying they must be done after the end of DATA.


Well, 2822 checks have to be done after DATA by definition, though with a
spec like SUBMITTER, a very little can leak out earlier.


While 2822 checks must obviously be done after the start of DATA, you can do
these checks either during the SMTP session or after it.  The demarcation
point is the end of DATA, when you still have the ability to reject.  I was
arguing that the expense of fetching the XML, running it through a _full_
XML parser, and executing a possibly complicated policy (due to the
extensibility of XML), would encourage the large system operators to delay
the 2822 checks until after the end of DATA.  If you detect a forgery after
DATA, you should probably null route it, since the return-path is untrusted.
If you send a DSN, you will be abusing innocent domains most of the time.
Both outcomes are undesirable.  This is a bad direction to push email in,
IMHO.
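
To make the demarcation point concrete, here is a minimal sketch of the two
situations described above: a check that finishes before the end of DATA can
still reject in-session, while a check that finishes after acceptance can only
deliver or null route. The names Verdict, reply_at_end_of_data and
action_after_acceptance are hypothetical and not taken from any MTA or draft;
the reply text is illustrative only.

```python
from enum import Enum


class Verdict(Enum):
    """Outcome of a (hypothetical) sender-policy check on the 2822 headers."""
    PASS = "pass"
    FAIL = "fail"
    NOT_CHECKED = "not_checked"  # check was too expensive to run in-session


def reply_at_end_of_data(verdict: Verdict) -> str:
    """Reply sent while the SMTP session is still open, so rejection is possible."""
    if verdict is Verdict.FAIL:
        # The sending MTA sees the failure directly; the receiver sends no DSN.
        return "550 5.7.1 Message rejected: sender policy check failed"
    # PASS or NOT_CHECKED: accept and take responsibility for the message.
    return "250 2.0.0 OK"


def action_after_acceptance(verdict: Verdict) -> str:
    """Options left once 250 has been sent and the check only runs afterwards."""
    if verdict is Verdict.FAIL:
        # The return-path is untrusted, so a DSN would likely hit an innocent
        # domain; the remaining choice is to drop the message silently.
        return "null-route"
    return "deliver"
```

In this sketch, an inexpensive check yields FAIL before the final reply and the
message is refused with a 5xx. An expensive check yields NOT_CHECKED at the end
of DATA, the message is accepted, and a later FAIL leaves only the null route.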

- Any policy engine that operates after the end of DATA is largely a waste
of effort.  Existing post-acceptance message filtering tools are extremely
effective.  If we discover a forgery after accepting a message, we can't
send a DSN since the return-path is dubious so we are forced to null route
the message.  Though this is clearly the lesser of two evils, it is highly
undesirable.


Not really. Existing post-acceptance filtering is historically very bad
at detecting forgery, since there's no policy information to go on.


The Bayesian filters I use seem to do an excellent job of this, despite the
lack of published domain policy.  The reason is likely that the forgeries are all
there to sell me something or to phish me.  Whatever the mechanism, a naive
Bayesian classifier can apparently tell the difference with surprising
accuracy.  The problem, of course, is that once you accept a message, you
are stuck with it.


There are spam filters, but SPF and SPF-ID were never intended to be
anti-spam measures in any direct way.


That's absolutely correct.  However, if there were no spam and no phishing,
I doubt anyone would be here working on SPF.  SPF is a necessary, but not
sufficient, step to eradicate forgeries.  The motivation for stopping
forgeries is largely to make it harder to spam and phish.  By forcing
spammers and phishers to divulge their identities, we can fight them more
effectively.  We already have several excellent tools for strong sender
authentication, for cases where we _really_ need to know who sent the
message and/or if it was modified during transmission.

--

Seth Goodman