Response to the Bellovin Critique of SPF

On Sun, Jan 04, 2004 at 10:06:37PM -0500, Steve Bellovin wrote:
| I sent this note to Dave Farber's list.  I don't know if he'll publish 
| it, so I'm sending my comments directly to you.  This note is fully 
| public, and may be redistributed freely in its entirety.

Thank you for the detailed comments.  Where they describe problems with
the draft I have tried to correct those problems.  Where they
misunderstand the spec I have tried to improve the language.

| There are several major problems with the document as written,
| including its semantic model, the uptake model, and the "specsmanship".

Criticisms of the semantic model are not problems with the document as
written.  They are problems with the philosophy to which the document
subscribes.  Whether the Internet community chooses to adopt that
philosophy is a question best answered by showing them the document and
asking for experimental deployment.  All the proposals in the LMAP
family share the same semantic model.

Criticisms of the uptake model are not problems with the document as
written.  They are a marketing problem.  But I felt that a sales pitch
was outside the scope of a technical specification.

You're right about the specsmanship.  My next draft will be better :)

| The latter is easiest to fix, but it will render current implementations
| useless if the eventual spec is different.  Running code is a great
| way to test a concept; too much deployment of bad running code is
| a tremendous obstacle to a decent standard.  I'm not even talking
| of the perfect being the enemy of the good.

The latest round of editing has not changed the eventual spec.

If the specsmanship is poor, I can only offer the excuse that it was
written under time pressure.  I wasn't worried that the perfect would be
the enemy of the good.  I was worried that the proprietary would be the
enemy of the open.

| To make it easier to read, I've indented the major sections of this
| note:
| 
| Specsmanship:
|       The version number definition is problematic -- it only
|       has major version numbers.  I suspect that we need minor
|       version numbers as well, for operational debugging.

Major and minor version numbers are a convention.  As long as version
numbers are sufficiently distinguished, I think integers will do just as
well as reals.

|       The most glaring problem with SPF is the use of TXT records.
|       TXT records are supposed to be free-form text, with no
|       semantics attached.  The use of TXT for test purposes is
|       understandable (though regrettable -- an experimental record
|       type code would be better); the use of TXT records for
|       textual error messages is not.  The document itself notes
|       the problem of ordering of multi-record messages.  Beyond
|       that, there are problems with internationalization:  what
|       language should the error message be in, and in what
|       character set is it encoded?  A simple URI would be a better
|       solution; at the least, it should point to an SPFERR record.
|       (Record subtyping in the DNS causes problems; see RFC 3445
|       for some details on why.)

The goal of the SPF project is to achieve fast widespread adoption.  The
process of getting a new experimental record type was considered too
slow and too fraught with politics.  We chose not to fight that battle.

The exp= TXT record is intended to hold a short message, often one with
a URL.  That URL is set by the sender domain, which can perform the
appropriate internationalization.

The problems of internationalization in SMTP are not specific to SPF.
If an ESMTP extension defines a language, we will be happy to cooperate
with it.

|       The use of TXT-like records is problematic because it
|       requires parsing an ASCII string in a DNS resolver.  (Yes,
|       I know that NAPTR records require the same sort of parse.
|       I don't much like that, either.)  The more complex the
|       parse, the harder it is to get right, both for the author
|       and the receiver of such records.  A TLV-based structure
|       permits parsing by the author's DNS server, and is easier
|       to interpret on the receiving end.

This objection seems to be founded on an incorrect assumption.

The DNS resolver is not required to parse the ASCII string.  Neither is
the DNS server.  Only the SPF client library is required to parse the
ASCII string on behalf of the receiver system.  At least it isn't XML.
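
To make the division of labor concrete, here is a minimal sketch of the kind of parsing that falls to the SPF client library once the TXT string arrives from DNS. It is not the reference library's code. The function name is hypothetical, and the qualifier characters and the "-all" term are assumptions taken from the later published SPF syntax, not from anything quoted in this thread.

```python
def parse_spf_record(txt):
    """Split one TXT string into SPF terms.

    Returns None if the string is not an SPF v1 record, otherwise a list of
    (kind, qualifier, name, value) tuples. Hypothetical helper, sketch only.
    """
    tokens = txt.split()
    if not tokens or tokens[0].lower() != "v=spf1":
        return None
    terms = []
    for token in tokens[1:]:
        # Modifiers look like name=value, e.g. redirect=domain2 or exp=...
        name, sep, value = token.partition("=")
        if sep and name.isalnum():
            terms.append(("modifier", "", name.lower(), value))
            continue
        # Everything else is treated as a mechanism with an optional
        # one-character qualifier (assumed set: + - ? ~).
        qualifier = "+"
        if token[0] in "+-?~":
            qualifier, token = token[0], token[1:]
        name, value = token, ""
        for i, ch in enumerate(token):
            if ch in ":/":
                # Keep the delimiter so "/24" and ":192.0.2.0/24" stay distinct.
                name, value = token[:i], token[i:]
                break
        terms.append(("mechanism", qualifier, name.lower(), value))
    return terms
```

For example, parse_spf_record("v=spf1 redirect=domain2") yields a single ("modifier", "", "redirect", "domain2") term, and a TXT string that does not begin with v=spf1 yields None.  The DNS layer never looks inside the string.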

|       The Received-SPF header line is badly specified.  It doesn't
|       follow the standards for other RFC 822/2822 headers
|       (i.e., it requires exactly one space in certain places
|       where an arbitrary amount of white space (including none)
|       is permitted in other headers); it has some things as
|       comments (receiving host) that should be parseable; and it
|       doesn't mandate that Received-SPF lines from outside of
|       the domain MUST be deleted.  (The actual requirements here
|       are more complex; I won't go into details in this note.)

The whitespace error has been fixed.  "SP" is now "*SP".  Thank you for
noticing this.

We discussed the need to delete preexisting "Received-SPF" lines and
concluded that only a very stupid spammer would forge a "Received-SPF:
fail" line.  Yes, they might forge a "Received-SPF: pass" line, but that
should be ignored by any self-respecting spam filter.  In any case, due
to the prepend rule, the nearest, trusted line should always be the
first one found.

If you don't have to manipulate the headers, you simplify processing.
An MTA can just prepend the results without having to analyze the
existing headers and rewrite them.
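
The prepend rule can be sketched as follows, using Python's standard email package.  The function names and the header values in the usage note are made up for illustration, and the sketch assumes the result keyword is the first word of the header value.

```python
from email import message_from_string


def prepend_received_spf(raw_message, result_line):
    """Add a Received-SPF line at the very top; existing headers are untouched."""
    newline = "\r\n" if "\r\n" in raw_message else "\n"
    return "Received-SPF: " + result_line + newline + raw_message


def nearest_spf_result(raw_message):
    """Return the result keyword of the topmost Received-SPF line, or None.

    Because every MTA prepends, the topmost line is the one added by the
    nearest hop. Lines further down may have been supplied by the sender.
    """
    msg = message_from_string(raw_message)
    values = msg.get_all("Received-SPF")
    if not values or not values[0].split():
        return None
    return values[0].split()[0].lower()
```

A message that arrives carrying a forged "Received-SPF: pass" line and is then stamped with "fail" by the local MTA reads as "fail", because the local line sits above the forged one.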

|       Yes, the line as specified is a bit easier to parse, but
|       any spam filter is going to have to deal with many other
|       headers, and hence will have to have a full-fledged 822/2822
|       parser.

We have discussed ways to improve the syntax of the line to make it
easier to parse.  Thank you for bringing this issue into the open.

|       Too many cases can result in an "unknown" return value.
|       That makes debugging hard.  There needs to be a "none"
|       value, for cases where there is no SPF record; there needs
|       to be a type code for "unknown", to distinguish among the
|       many error cases.  Beyond that, the set of type codes needs
|       to be enumerated -- as is, we'll see an operational nightmare.

The reference SPF library ships with a command line utility.  You can
enable a trace mode in that utility that shows exactly what reasoning
led to the "unknown" result.  I haven't heard anyone complain about
difficulty with debugging yet.

In the case of
  domain1   "v=spf1 redirect=domain2"
  domain2   NXDOMAIN
Should the result for domain1 be "none" or "unknown"?

"Unknown" is strictly defined: a client must proceed as though the
domain did not publish SPF records.  Usually, this means giving the
message the benefit of the doubt, even if that means additional content
filtering.

Distinguishing "none" from "unknown" encourages clients to treat
"unknown" as "fail".  If too many clients make this judgement, the
integrity of the project would be threatened.  That is why I felt it
was important to omit the "none" return code.  If it turns out I was
wrong, we can change this in spf v2.
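
A rough sketch of the redirect case above and of what a receiver does with the outcome.  The dict stands in for DNS TXT lookups, with a missing key playing the role of NXDOMAIN.  The function names, the redirect limit, and the action labels are all assumptions made for illustration.  The sketch also follows redirect= unconditionally, which is a simplification.

```python
def effective_record(domain, records, max_redirects=10):
    """Follow redirect= modifiers and return the final record.

    Returns None when the chain ends in a missing or non-SPF record,
    which corresponds to the "unknown" result discussed above.
    """
    for _ in range(max_redirects):
        record = records.get(domain)
        if record is None:
            return None  # e.g. domain2 is NXDOMAIN
        tokens = record.split()
        if not tokens or tokens[0].lower() != "v=spf1":
            return None
        targets = [t.split("=", 1)[1] for t in tokens[1:]
                   if t.startswith("redirect=")]
        if not targets:
            return record
        domain = targets[0]
    return None  # assumed guard against redirect loops


def receiver_action(result):
    """Map an SPF result to a receiver decision (labels are hypothetical)."""
    if result == "fail":
        return "reject"
    if result == "pass":
        return "sender-authorized"
    # "unknown": proceed as though the domain published no SPF records.
    return "no-spf-information"
```

With records = {"domain1": "v=spf1 redirect=domain2"}, effective_record("domain1", records) returns None, and receiver_action("unknown") lands in the same branch a domain with no record at all would.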

|       Section 5 speaks of using Received: lines.  Such lines have
|       been forged by spammers for many years.  While they can be
|       used, great care must be taken.  This document needs to
|       define the necessary steps appropriately.

I felt that a tutorial on detecting header forgery was outside the scope
of a technical specification, and was likely to do more harm than good,
on the grounds that a little knowledge is a dangerous thing.  Those who
need to analyze Received headers know what the problems are.

|       5.1 speaks of cidr-lengths, but 5.2 et seq. speak of
|       dual-cidr-length.  That looks like something where the
|       editing hasn't caught up yet.  But having a CIDR length on
|       an MX record is a bad idea, since there may be multiple MX
|       records with different appropriate lengths.

This has been fixed.

The same objection applies to putting CIDR lengths on A records.  If
multiple MX records have different lengths, publishers should manually
specify the appropriate ranges.  Maybe this is too much rope.
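
To illustrate the rope: a sketch, using Python's ipaddress module, of what one CIDR length applied to every resolved address does, next to manually specified ranges.  The addresses come from the documentation blocks and the function names are hypothetical.

```python
import ipaddress


def matches_with_cidr(client_ip, record_ips, cidr_length):
    """True if client_ip is inside the /cidr_length network around any address."""
    client = ipaddress.ip_address(client_ip)
    for ip in record_ips:
        # strict=False lets a host address stand in for its enclosing network.
        network = ipaddress.ip_network(f"{ip}/{cidr_length}", strict=False)
        if client in network:
            return True
    return False


def matches_explicit(client_ip, ranges):
    """True if client_ip is inside any manually listed range."""
    client = ipaddress.ip_address(client_ip)
    return any(client in ipaddress.ip_network(r) for r in ranges)
```

Suppose the MX hosts resolve to 192.0.2.10 (the publisher owns the whole /24) and 198.51.100.5 (a single host on someone else's network).  A length of 24 authorizes 198.51.100.200 as well, which the publisher probably did not intend.  Listing "192.0.2.0/24" and "198.51.100.5/32" explicitly does not.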

|       The macro language scares me -- it's very complex.

The macro language tends to scare most people who encounter the spec for
the first time.  It turns out to be very useful and solves all sorts of
problems.  The people who implement it tend to become its strongest
defenders.

|       8.4 ruins much of the effectiveness of the scheme -- it
|       provides ways to avoid processing.  For example, a spam
|       engine could send email with a local-seeming HELO, MAIL
|       FROM, and From: entries, in which case (per Example 3) SPF
|       isn't to be used.  Spam from abuse@ or postmaster@ can also
|       bypass checks.

This objection seems to be founded on a misreading of the spec.

   Receiver systems SHOULD exclude special recipients such as
   postmaster@ and abuse@ from SPF processing.  See RFC2142 [13].

That's "recipients", not "senders".  If a spammer wants to spam
postmaster, that's his decision.  A spammer sending as postmaster
doesn't get any special treatment.

Here's what the spec says:

   SPF is only one component in a policy engine.  An SPF-conformant SMTP
   receiver is NOT REQUIRED to perform SPF tests on messages whose
   dispositions have already been decided on the basis of other policy.

     Example 1: if an SMTP receiver requires that sender domains must
     possess MX or A records, and rejects transactions where they do
     not, then SPF tests are moot.

     Example 2: if an SMTP receiver expects messages from a trusted
     client, such as a secondary MX for its own domain, then SPF tests
     are not needed.

     Example 3: if an SMTP receiver is considering a transaction which
     does not yield a fully-qualified domain name in either the MAIL
     FROM sender or the HELO command, SPF tests are not appropriate, and
     the disposition of the message should be decided on the basis of
     other policy.

I put that in because people were asking questions like "well, what if
you send mail to a local user?  What does SPF do then?"  This section
basically answers "mu".
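
A minimal sketch of how a receiver might decide whether SPF applies at all, covering the special-recipient rule and Example 3.  The function names and the rough FQDN test are assumptions, not text from the spec.

```python
SPECIAL_RECIPIENTS = {"postmaster", "abuse"}


def is_fqdn(name):
    """Rough test: at least two non-empty labels, and not a bare IP address."""
    labels = name.rstrip(".").split(".")
    if len(labels) < 2 or labels[-1].isdigit():
        return False
    return all(l and l.replace("-", "").isalnum() for l in labels)


def domain_of(mail_from):
    """Domain part of a MAIL FROM address, or "" if there is none."""
    return mail_from.rpartition("@")[2] if "@" in mail_from else ""


def spf_applies(helo, mail_from, rcpt_to):
    """Decide whether to run SPF tests for this transaction."""
    # The exclusion is keyed on the recipient, never on the sender.
    if rcpt_to.partition("@")[0].lower() in SPECIAL_RECIPIENTS:
        return False
    # Example 3: no FQDN in either MAIL FROM or HELO means other policy decides.
    return is_fqdn(domain_of(mail_from)) or is_fqdn(helo)
```

Mail addressed to postmaster@example.net skips the test.  Mail claiming to be from postmaster@example.com does not, because only the recipient is examined for the exclusion.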

|       The suggestion that this scheme become default in April,
|       2004 (Section 9.4) is preposterous.  Even if the IESG were
|       to approve this document today -- and very few documents
|       are passed on first try -- it would take far longer than
|       four months to build, test, and deploy production-grade
|       clients and servers.

Yeah, I took out the date.  Still, I wouldn't be surprised if we did see
wide deployment by that time.

No servers need to be built for SPF.  SPF reuses existing DNS
infrastructure.  Clients have been written in a number of languages.

Just as a point of interest, the SPF Python client library took six
hours to write.

|       The security considerations section mentions IP address
|       spoofing, though the FAQ claims that they aren't real.  I
|       agree that classical spoofing, per Morris' 1984 memo, is
|       probably not a major threat here.  But spammers are using
|       BGP to steal entire address blocks -- that's a bigger
|       threat.  (The FAQ also points to RFC 2761 when it should
|       be 2671.)

Maybe we could introduce an ESMTP extension that requires compliant
clients to echo a random string back to the server.

| Uptake Model:
|       As Rick Adams has pointed out, there is no consensus yet
|       that this is the right way to go.  The major ISPs on the
|       net -- AOL, Yahoo, MSN, etc. -- have not bought into this
|       scheme.  Unless and until they do, it doesn't help much,
|       either for their customers (who make up a substantial
|       proportion of the user population) or for everyone else
|       (since their addresses could be forged).

In the last month, 4500 domains have registered at the SPF registry.
Many more have published records but have not registered.  This is based
on word of mouth alone, with no organized publicity campaign.  If an
industry consortium gets behind it, I expect to see even better numbers.

Some of the biggest names on the early adopter list include:

  AOL.com
  AltaVista.com
  DynDNS.org
  LiveJournal.com
  OReilly.com
  Oxford.ac.uk
  PhilZimmermann.com
  Perl.org
  w3.org

and, amusingly,

  foo.com

| Semantic Model:
|       In a strong sense, the part that requires the most debate
|       is the semantic model.  SPF strongly binds a sender to some
|       DNS records.  But that isn't always a good idea.  People
|       who use portable email addresses will now be constrained
|       to use the domain owner's SMTP sender, which may not even
|       exist.  (A more interesting model would permit delegation
|       of individual user names to particular sending machines.
|       But that would probably require too much public key
|       cryptography to be affordable.)
| 
|       The net effect will be to bind users more strongly to their
|       ISPs and/or their employers.  While big ISPs may like that,
|       it flies in the face of current (American) public policy
|       -- witness local telephone number portability.  Ironically,
|       it will also discourage a current anti-spam strategy used
|       by many: throw-away email addresses for particular purposes.

The binding between email address and domain was an SMTP design decision
which I cannot defend.  I wasn't there at the time.  It's created an
entire industry of vanity domains.  A lot of people, sellers and buyers
both, seem to be happy with that solution.

I would submit that now isn't the best time to redesign SMTP for number
portability.  People seem to be more interested in a different problem.

I don't see how SPF discourages throw-away email addresses.  Can you
illustrate?

|       It will also make life harder for people who regularly use
|       multiple sending email addresses.  For reasons of privacy,
|       my children generally use email addresses that are not readily
|       tied to their real names.  But for certain very important
|       kinds of communication -- sending email to teachers, for
|       example -- they use a family-linked email address.

I don't think SPF will prevent them from doing anything they
presently do.

|       It isn't always clear to people what SMTP server they're
|       actually using.  Over the last few years, I've noticed that
|       one major hotel chain intercepts outbound SMTP connections.
|       I don't know if they're trying to defend against check-in
|       spammers or if they're trying to help travelers whose
|       laptops are hard-wired to point to their company's or home
|       ISP's SMTP servers.  Granted, people should use VPNs or
|       SMTP/SASL for such things; too many don't and perhaps can't
|       -- if your ISP doesn't support it, for example, you can't
|       use it, and switching ISPs carries a non-trivial cost,
|       especially if you have only one choice of broadband ISP.
| 
|       I've underscored some of my points here by using a portable
|       email address of my own, rather than my usual email address.

I can't see the original message now, but I would wager that while your
From: address was @acm.org, your Return-Path was
smb@research.att.com.

But you are right; some things will break.  If we examined message
headers instead, we would have a different set of problems.  For
example, a system that tries to authenticate the From: header will start
out promising to protect

  From: service@paypal.com (Paypal Customer Service)

but end up vulnerable to an attack of the form

  From: bad@spammer.com (service@paypa1.com)

That is why SPF doesn't want to get anywhere near protecting the headers.
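
The gap between what a reader sees and what a header check would authenticate can be shown with parseaddr from Python's standard library.  This is only an illustration of the attack above; the function name is hypothetical.

```python
from email.utils import parseaddr


def visible_vs_actual(from_value):
    """Return (text a mail client may display, domain a check would authenticate)."""
    display, address = parseaddr(from_value)
    domain = address.rpartition("@")[2].lower()
    return display, domain
```

For "service@paypal.com (Paypal Customer Service)" this returns ("Paypal Customer Service", "paypal.com").  For "bad@spammer.com (service@paypa1.com)" it returns ("service@paypa1.com", "spammer.com"): the authenticated domain is the spammer's own, so the check passes honestly while the reader sees something that looks like PayPal.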

The best thing I can say about mailing from places that block port 25 is
this: in an ideal SPF world the need to block port 25 would go away.

|       The basic concept may or may not be a good idea.  The
|       authors themselves admit that it's only part of a total
|       anti-spam solution, and I'm not convinced that it's worth
|       the deployment effort.  It's strongest in dealing with "joe
|       jobs" -- spammers (and worms) impersonating real email
|       addresses -- but that's the part that most runs afoul of
|       my semantic concerns.

The major ISPs are convinced that some kind of sender authentication
scheme is worth the deployment effort.  Of the three alternatives on the
table, the SPF scheme requires the least deployment effort.

Because it alone promises to reduce the volume of joe-jobs, which is a
direct benefit to the publishing domain, people have been eager to adopt
it.

cheers
meng

-------
Sender Permitted From: http://spf.pobox.com/
Archives at http://archives.listbox.com/spf-discuss/current/
Latest draft at http://spf.pobox.com/draft-mengwong-spf-02.9.4.txt