spf-discuss

Suggest New Mechanism Prefix NUMBER to Accelerate SPF Adoption

2004-08-25 11:32:16
I am going to start a new thread, because IMHO this is too important for SPF's 
success.

I am proposing that adding the option to specify the probability that a message 
could be a forgery would:

(1)  Make SPF more useful to receivers, because anti-spam algorithms (e.g. 
SpamAssassin) can mathematically combine that probability with probabilities 
from other evidence in the e-mail and arrive at a much more accurate answer 
than if SpamAssassin has to assign an arbitrary probability that is the same 
for all domains.

(2)  Increase adoption by senders (especially large ISPs), because it would 
give them more certainty about how their SPF record would be handled, in terms 
of avoiding false positives and decreasing false negatives.  As it is now, I 
doubt any large ISP would risk setting "~all", because it is a fairly ambiguous 
result.  For example, a receiver ("implementor") could treat "~all" as either a 
95% chance of forgery or a 5% chance of forgery.  For an ISP that has 95% 
compliance, setting "-all" would result in 5% false positives, which is 
unacceptable.  Setting "~all" could be interpreted by some receivers as a 95% 
chance of forgery, thus 95% false positives.  So the ISP has to set "?all", 
which tells the recipient "I don't know, so do nothing".  This makes SPF less 
useful to both senders and receivers, and thus retards the adoption rate of 
SPF!

A possible syntax could be:

"-all0.9993", which would mean a 9993 in 10,000 chance that the message is a 
forgery and a 7 in 10,000 chance that it is not.
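
To make the suggested syntax concrete, here is a minimal Python sketch of how a 
receiver might read such a term.  This is only the syntax proposed in this 
post, not existing SPF syntax.  The function name parse_all_term and the exact 
grammar (a mandatory qualifier, then "all", then an optional plain decimal) are 
my assumptions for illustration.

```python
import re

# Hypothetical grammar for the proposal: qualifier, "all", optional probability.
# The qualifier is required here only to keep the sketch short.
_ALL_TERM = re.compile(r'^([+\-~?])all(\d+(?:\.\d+)?)?$')

def parse_all_term(term):
    """Return (qualifier, probability_of_forgery or None) for an "all" term."""
    match = _ALL_TERM.match(term)
    if match is None:
        raise ValueError("not an 'all' term: %r" % term)
    qualifier, number = match.groups()
    if number is None:
        # Plain "-all", "~all", "?all": no probability was published.
        return qualifier, None
    probability = float(number)
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1: %r" % term)
    return qualifier, probability
```

With this sketch, parse_all_term("-all0.9993") gives the qualifier "-" and the 
number 0.9993, while a plain "~all" gives "~" and no number, so a receiver 
could fall back to whatever it does today.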

I will respond to previous discussion below to further address why I think this 
is very, very useful and important.



Thanks but I wrote "probability" not "probably".

I mean an actual number between 0 and 1.

The owner of the domain may have the best information about what
this number should be.

It would be much easier for anti-spam software to combine that
probability with other factors if the probability were estimated by
the owner of the domain, since each domain will be different depending
on several factors, including demographics, type of business, rate
of SMTP AUTH adoption, etc.

Please consider this seriously.  I think this will have a very
big effect on the rate of SPF adoption!

Shelby,

Search the archives for the word "softpass".  I suspect that may be
instructive as to the response this proposal will get.

Scott Kitterman


I have found several posts, such as:

http://archives.listbox.com/spf-discuss(_at_)v2(_dot_)listbox(_dot_)com/200406/1197.html

My response is that I am not proposing a "softpass", so the criticisms in those 
posts are not relevant.  I agree that "softpass" is semantically similar to 
"softfail", except perhaps that a "softpass" would be on the other side of 50% 
from "softfail".  Please read the examples I wrote in #2 above to see why what 
I am proposing is not at all similar to "softpass".

However, there are only two valid reasons to add another possible spf 
result IMHO: 
1. A more finely-grained answer would be useful to recipients. 


See #1 above.


2. The existence of a more finely-grained answer helps convince 
senders to publish SPF records. 

See #2 above.


http://archives.listbox.com/spf-discuss(_at_)v2(_dot_)listbox(_dot_)com/200406/1298.html


And if I 
look into the new draft-ietf-marid-core-01.txt there are now 
seven values "none", "pass", "fail", "softFail", "neutral", 
"transientError", "hardError". 
That's confusing, maybe it's an overspecification. Why not 
simply "error", and let the implementation decide what to do 
with this situation ? 

Because the implementor cannot possibly know what rate of adoption the ISP for 
the domain has.  See the examples in #2 above for why this is really important 
to do.


http://archives.listbox.com/spf-discuss(_at_)v2(_dot_)listbox(_dot_)com/200406/1205.html

Written June 25, 2004
The more relevant fraction to look at is the fraction of 
incoming-mail-domains that either include a neutral result or have no 
spf record--and that percentage should continue to fall over time. 

It started at 100% and is only dropping

Is the # of large ISPs not returning a useful result (i.e. returning "?all") 
really dropping so fast?  How many large ISPs have published something other 
than "?all" since June 25?  I think our actual problem is that adoption is very 
slow.



I think what I am proposing is in essence what "Mark Shewmaker", the original 
poster of the idea for "softpass", needed.  He just asked for the wrong thing.  
You need the probability # (a number between 0 and 1) to make it useful.

If anyone wants to review the math for why this is very useful to receivers, I 
provide one method at the bottom of this post:

http://forum.icann.org/lists/stld-rfp-mail/msg00061.html

P(a @ b) = P(a) * P(b) / [P(a) * P(b) + (1 - P(a)) * (1 - P(b))]

The derivation is:

http://www.mathpages.com/home/kmath267.htm
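
As a rough illustration of the formula above, here is a short Python sketch.  
The function name combine is my own.  I am also assuming that P(a) and P(b) are 
each the probability of forgery (or spam) from one independent piece of 
evidence, e.g. a number published in an SPF record and a score from a content 
filter.

```python
def combine(p_a, p_b):
    """P(a @ b) = P(a)*P(b) / [P(a)*P(b) + (1 - P(a))*(1 - P(b))]"""
    numerator = p_a * p_b
    denominator = numerator + (1.0 - p_a) * (1.0 - p_b)
    if denominator == 0.0:
        # Happens only when one input is exactly 0 and the other exactly 1,
        # i.e. two certainties that contradict each other.
        raise ValueError("cannot combine contradictory certainties")
    return numerator / denominator

# Example inputs (made-up numbers, not measurements):
spf_probability = 0.9      # hypothetical value published by the domain owner
content_probability = 0.9  # hypothetical value from other evidence
combined = combine(spf_probability, content_probability)
```

Two pieces of evidence at 0.9 each combine to about 0.988, a value of 0.5 
leaves the other input unchanged, and 0.9 combined with 0.1 comes out at about 
0.5.  That is why a per-domain number is more useful to a receiver than one 
arbitrary value applied to every domain.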