ietf-mxcomp

Re: Make CSV backwards compatible with legacy SPF records?

2004-09-29 22:53:33

On Tue, 2004-09-28 at 21:10, Matthew Elvey wrote:
On 9/23/2004 12:15 AM, Matthew Elvey sent forth electrons to convey:

Aha!  I found the post with the relevant stats. They're from March,
though.  Also, my recollection was off - domains using the ten most
popular SPF records included only 1/2 of the total domains with
records.
http://www.imc.org/ietf-mxcomp/mail-archive/msg00644.html  Something
reasonably recent would be great.

Anyway, let's consider the top ten (which I've reordered a bit) from
wayne:
<misformatted mess>

Arrgh - here's that without the rewrapping:

Columns are:
Popularity rank of such records; # of such records; record value.

1  1097 v=spf1 mx -all
3   463 v=spf1 a mx ptr -all
4   429 v=spf1 a mx -all
9   131 v=spf1 a mx ?all
6   306 v=spf1 a -all

The above would generally take 1-3 UDP DNS queries to resolve.

This would take a minimum of 2 DNS lookups and require a script parser
which entails its own risks. Your goal of being able to use existing SPF
records overlooks the very important point that SPF does not identify the
administrator of the mail transfer agent.  These SPF records are only to
authorize the sending of messages for specific mailbox domains. These
SPF records may entail addresses beyond the control of the administrator of
the mailbox domain.  It is pointless to make proclamations regarding the
significance of the "+", the range of a label, or the scope of CIDR
notation and the mailbox domain administrator's implied acceptance of
liabilities.  You are the one guessing what is implied by the addresses.
You are the one alleging this implied agreement in supplying these
addresses.  You are the one that will take the full brunt of being
wrong.  At best, it remains a guess what is being declared by the
addresses.

SPF does not do this better. SPF does not do this in a safer way.  SPF
does not do this in a more expedient way.  These examples of SPF records
do nothing that cannot already be assumed and do nothing to overcome
the reason this approach failed with SMTP, and why there is an
exception.

5   325 v=spf1 -all
7   171 v=spf1 +exists:CL.%{i}.FR.%{s}.HE.%{h}.null.spf.example.com -all
8   131 v=spf1 include:example.org ~all
10  130 v=spf1 ?all

These are equivalent to the null set, for the purposes of my proposal.
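To make the "null set" claim concrete, here is a minimal sketch of the reduction as it reads from this thread. It assumes only "+"-qualified (or unqualified) a, mx, ptr and ip4 terms count, and everything else (include, exists, all, modifiers, and anything qualified with "-", "~" or "?") is dropped. The set USABLE_MECHANISMS and the function name usable_terms are hypothetical names for illustration, not part of any draft.

```python
# Assumption: mechanisms the proposal would treat as usable; not from any spec.
USABLE_MECHANISMS = {"a", "mx", "ptr", "ip4"}

def usable_terms(record: str) -> list[str]:
    """Return the terms of an SPF record that survive the reduction."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return []  # not an SPF version 1 record
    usable = []
    for term in terms[1:]:
        # A missing qualifier defaults to '+'.
        if term[0] in "+-~?":
            qualifier, body = term[0], term[1:]
        else:
            qualifier, body = "+", term
        # Mechanism name is whatever precedes ':' or '/'.
        name = body.split(":", 1)[0].split("/", 1)[0].lower()
        if qualifier == "+" and name in USABLE_MECHANISMS:
            usable.append(term)
    return usable
```

Under these assumptions, each of the four records listed above reduces to an empty list, while "v=spf1 mx -all" keeps its single mx term.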

What is the point of excluding an "include" but not a CIDR address?  What
is the point of suggesting the shorthand "mx" and "a" notation offers any
additional benefit?

2   804 v=spf1 ip4:a.b.c.d/32 ip4:a.b.c.d/32 a ptr mx -all

This is probably no longer in the top ten.

On Thu, 23 Sep 2004 03:39:03 -0700, "Douglas Otis" wrote:
On Thu, 2004-09-23 at 00:15, Matthew Elvey wrote:
<snip>

Huh?  The question is about the record given, which has none of these 
problems. (  foo.dom.    IN    TXT      "v=spf1 mx -all" )

There is a good overview here.
http://www.securityfocus.com/guest/17905
  
Yes, a good refresher.  Anyway it leads me to believe that the record 
only suffers from a problem that also afflicts CSV and that a good bind9 
or djbdns config can handle.

I would agree that someone making an effort could protect their DNS
servers, but the risk of parsing script in DNS TXT records while traipsing
from DNS server to DNS server does not compare to a single lookup of a
binary record.

I got that the first time you said it.
Matching against a fixed set of, say, a dozen allowed strings does not make
a script. That would allow a large fraction of SPF records to be used
for the purposes of my proposal.
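As a rough illustration of the fixed-string idea, the sketch below compares a TXT value verbatim against a small allowed set, with no term-by-term interpretation. The set contents (the five records in the first group quoted earlier in this thread) and the name is_allowed_record are assumptions for illustration only.

```python
# Hypothetical allowlist: record strings accepted verbatim after
# whitespace and case normalization. Contents are an assumption.
ALLOWED_RECORDS = frozenset({
    "v=spf1 mx -all",
    "v=spf1 a mx ptr -all",
    "v=spf1 a mx -all",
    "v=spf1 a mx ?all",
    "v=spf1 a -all",
})

def is_allowed_record(txt: str) -> bool:
    """Return True only if the TXT value matches an allowed string exactly."""
    normalized = " ".join(txt.split()).lower()
    return normalized in ALLOWED_RECORDS
```

Anything outside the set, such as a record with include, exists, or literal addresses, is simply not matched. Whether a set lookup still counts as "script" is the point under dispute here.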

The translation of a set of textual constructs with specific syntax
constitutes script in my book.  Using a name list based upon the results
of CSV allows for both a safe and simple scheme that accomplishes the
same ends as SPF, but without having to express a single address.  No
addresses are entered for CSV, and no addresses would need to be entered
to define the mail channel for a mailbox domain using a name list. 
None.

There would be no need to daisy-chain new headers to connect MailFrom to
MailFrom either.  The Received headers logging the authenticated EHLO
information actually provide more reliable information without changing
a single element of SMTP.  All that is needed is CSV and a name list. 
This would allow anyone to know when the message was possibly being
spoofed without inviting spoofing and still allow mail to be accepted
from outside the mail channel. 

By limiting the mechanism to a single DNS lookup to a single DNS server,
the exposure is much less.  

Ok, it is a little less.  The risk is also low to start with. (Birthday 
attacks aren't common, AFAIK; I've never logged one.)  It's even lower 
to nonexistent given recent/patched/secure DNS software and systems 
(e.g. a split-split configuration).

This is entering a new territory where DNS is used to authorize
transactions with a cost incentive to defeat these protections.  Yes,
one could use carefully configured DNS servers to minimize this risk. 
This should not overlook the fact that this risk was not necessary, however.

No, there are already reasonable scenarios where there are cost
incentives to defeat these protections.
Phishing, for example - if http://www.marktwainbank.com points to a
phisher's site.
 
For what it is worth, SPF "requires" more than a single record be read
perhaps from many different DNS servers.  This opens a sizable can of
worms.  

Clearly, I need to better explain my scheme, as it still seems unclear 
to you that what SPF "requires" isn't relevant to my proposal!
Please, can you give up using this as an opportunity to bash SPF?  It's 
OT. (Note the changed Subject:).

Using SPF records is not relevant to what is required of SPF?

If SPF were limited to CIDR notation and the MX record, then the value of
SPF would increase significantly as a white-listing tool.  Much of the
label construction macros and exists syntax reduce SPF's value and will
likely serve to obfuscate a ploy that hides zombies.

So therefore you'd like to not use useful information that might be
found in SPF records, just because it might be seen as legitimizing the 
components of SPF records that you feel are unhelpful to the effort.

You obviously do not understand the significant risk when making these
types of guesses or alleging acceptance of accountability. 

Example: Mailbox.tld has SPF records that define the entire address
space for three of their providers where they outsource some of their
mail, and their own.  I contend reputation should be based upon what you
have control over.  That would mean it would not be to your advantage to
include the machines of these other providers in a reputation
assessment.  
  
Ah, I did say ip4 (and therefore CIDR ranges) might be OK.
(I did suggest subsequently in mail to John Leslie that perhaps just 
+ip4:a.b.c.d/32 might be OK, note the /32!)
Perhaps even that isn't a good idea, as

mailbox.dom spf   ip4:[isp1's smtp]/32,ip4:[isp2's smtp]/32,ip4:[isp3's 
smtp] -all

enables me to inappropriately stake my domain's reputation on the 
ability of ISPs 1-3 to keep their smtp servers from being used to spam.
HOWEVER:  It's already the case that such SPF records are 
inappropriate.  If I don't have high confidence in the ISPs, then the 
appropriate record would be:

mailbox.dom spf   ?ip4:[isp1's smtp]/32, ?ip4:[isp2's smtp]/32, 
?ip4:[isp3's smtp] -all
and this record would be equivalent to the null set in my scheme:
my proposal ignores ~anything and ?anything.
Similarly, anyone creating a record like 

mailbox.dom spf cidr isp1, cidr isp2, cidr isp3, cidr for self.

I [DO NOT] think the motive for using SPF
records for this purpose is well founded.  If the MTA is not willing to
provide a significantly strong identifier, it does not make sense to
start guessing what is meant by the SPF records.

  
I argue that it does make sense to do so, and can be done safely.

What happens when the scheme cannot be defended beyond being used to
sort mail?  I doubt any reputation service would last very long making
these guesses and allegations.

The fall back would be using the address.
  
The IP address?  Sure, that's an option.  (I note that RBL entries in a 
sense are MARID - they indicate MAPS' opinion as to whether an MTA 
should be Authorized by MAPS' users to transmit mail to them, expressed 
as Records In the Dns.)

The mechanism used for the reputation service does not need to be based
upon DNS.  But the identity does need to be as strongly authenticated as
the IP address.

There can be no implied contract where a domain becomes accountable
for the actions of others trusted to send their mail.

<Insert previously provided argument that this is not what I'm asserting...>

Ah, but it is what you are asserting. : ) 

A provider lax in their security should not be allowed to hide behind
their customer's domains.  Should the provider wish to force the
customer to accept this risk, they can ask to have the customer set up a
CSV-CSA record that enables their servers.  You do not want this
accountability to be implied as you suggest.

I DO NOT SUGGEST THIS. WTF?  How many times do I have to say it?

You can attempt to reduce the scope of the SPF records by tossing out
those elements that appear to be under different administrators.  How
does this help?  When you create a "maybe" condition resulting from a
culling of the information, who is at fault either way?  This scheme
becomes damned by the gray.  CSV is purely black and white.  Why add
uncertainty and risk to a mechanism?
 
[mailbox.dom] will suffer the consequences of doing something so dumb,
if one of the 3 ISPs seriously fails to keep their smtp servers from
being used to spam using mailbox.dom. The domain will get blacklisted
(in other words, its reputation will fall), and the spoofing will
stop.

SPF requires the addresses of the entire mail channel.  It is odd that
you would call that dumb.

  
No, you still don't get it.  You ignore or don't understand the 
difference between + and ? in an SPF1 record, as defined in the I-D / 
explained a couple paragraphs down.

Once again, I think this would be dumb:

mailbox.dom spf   ip4:[isp1's smtp]/32,ip4:[isp2's smtp]/32,ip4:[isp3's 
smtp] -all

not this:
mailbox.dom spf   ?ip4:[isp1's smtp]/32, ?ip4:[isp2's smtp]/32,?ip4:[isp3's 
smtp] -all

If one of the goals is to still allow mail to enter outside the mail
channel, why would that be dumb?  Some would rather trade reliable
delivery and a signature for this "must fit" approach.  At least using a
name list still allows a safe way to note when a message is outside the
mail channel without inviting a world of spoofers taking a ride on your
reputation.

The rules that accept only closed SPF components still do not achieve a
goal of assessing only the administrator accountable.
  
The rules I am refining do achieve that goal. Arguably, they do so 
better than CSV, because they'll work for most domains that haven't 
bothered to set up a CSV record.  Heck, using the SPF 'best guess' may 
be appropriate where there's no SPF or CSV record - the A&R system will 
reflect reality if this guess allows much spam through.

I would hope it never comes to groping through various TXT script
records looking for a clue.  I understand what you are attempting, but
there are risks getting this wrong.  It is not worth the risk.

Let's agree to disagree and move on.  (Or at least let's move this 
thread to <http://mipassoc.org/ietf-clear/>.  Please reply there; I won't 
post further here on this thread.)

<snip>

Just removing open elements will not achieve the desired goal as it does
not provide the same set.
  
That's not a logical argument.  The ultimate desired goal is to address 
the spam problem.  It may or may not be material to achievement of that 
goal to get "the same set".

There are significant liabilities getting this wrong.    

Well, duh.  That statement is true whenever one is doing something 
important.  So we proceed thoughtfully and get it right.

 
SPF does not have the concept of administrative versus authorized. 

I would say it does. I would say that ? is for servers that are
tentatively authorized, and + is for servers that are administered by
the entity owning the domain.  + normally indicates a server that the
domain claims is administered responsibly; ? indicates a server that
it isn't sure is administered responsibly, but never [the] less does
not want to deny the ability to use the domain.

You are making assumptions about how this syntax can be applied. Obviously the
manner of application is not under the control of the sender.  The
sender will be obliged to use whatever it takes to achieve their goal of
having mail accepted.  I do not see how the recipient can infer levels
of accountability for the mailbox domain administrator.  Do they get a
free pass when they declare their SPF records with a different domain
name than the mailbox domain, or in the case of CSV, a different HELO
name?  Reusing SPF records allows EHLO spoofing to use these records in
unintended ways.  

There is nothing in the syntax that would allow the differentiation.

See above.

In addition, the EHLO name can express any label to select any
record.
  
a record for aol.com is still differentiable from one for 
dhcp234.234.dialup.aol.com!

But either label may appear within the EHLO response.  You want a label
that can only be used for the single purpose of indicating the
administration of the MTA.  From this information, it is a small matter
to establish the mailbox-domain/mail-channel association.

For many domains this info isn't available.  And until CSV takes off, it 
makes sense to use the available info in a responsible manner as I 
propose.  Especially if the pressure to publish SPF records continues to 
increase.

Once an authenticated HELO name is made visible to the recipient as a
type of postmark, there will be plenty of motivation to publish these
CSV records to earn greater trust.  Name based reputation will also
enable significant protection from filtering.  All of this while making
fewer DNS record lookups. : )

-Doug