spf-discuss

RE: Suggest New Mechanism Prefix NUMBER to Accelerate SPF Adoption

2004-08-25 13:55:02
Stuart D. Gathman wrote:
BINGO!  And not all receivers have the same probabilities for a given
domain either.  And actually, bayes automatically tracks probability
for each domain as well (since the domain appears in the Received-SPF
header), but that table would be way too big to post.

Unless you have the algorithm (it is possible) and the huge volume
of data to feed the algorithm (you will need a significant % of
internet email), you cannot reliably determine, mathematically, the
probabilities for EACH domain.

Wrong.  That is how Bayesian filters work.  An empirical measurement
is a lot more meaningful than some number made up on the spot.  All
you have to do is ensure that it gets fed meaningful tokens, and the
Bayesian algorithm does the rest automatically - actually measuring
the statistics instead of making them up.  Numbers supplied by the
sender would be worthless as tokens, unless quantized into a handful
of ranges (called, just for example,
NONE,PASS,FAIL,SOFTFAIL,NEUTRAL,UNKNOWN).
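
To make the "measure it, don't make it up" point concrete, here is a minimal
sketch of per-token bookkeeping. It is not the code of any particular filter:
the class TokenStats, the helper spf_token, the token format "spf:RESULT:domain"
and the smoothing parameters are all assumptions chosen for illustration.

```python
from collections import defaultdict


def spf_token(result, domain):
    """Build a hypothetical token from an SPF result and the sender domain."""
    return f"spf:{result.upper()}:{domain.lower()}"


class TokenStats:
    """Counts how often each token was seen in spam and in good mail."""

    def __init__(self):
        self.spam = defaultdict(int)
        self.ham = defaultdict(int)

    def train(self, token, is_spam):
        # Record one observed message carrying this token.
        if is_spam:
            self.spam[token] += 1
        else:
            self.ham[token] += 1

    def p_spam(self, token, prior=0.5, strength=1.0):
        # Smoothed empirical estimate of P(spam | token).
        # With no observations it returns the prior; with many
        # observations it approaches the observed spam fraction.
        s = self.spam.get(token, 0)
        n = s + self.ham.get(token, 0)
        return (strength * prior + s) / (strength + n)


stats = TokenStats()
tok = spf_token("pass", "free-viagra.example.com")
for is_spam in (True, True, True, False):
    stats.train(tok, is_spam)
print(tok, stats.p_spam(tok))
```

Each receiver that trains this way ends up with its own numbers for the same
domain, which is the point made above about receivers differing.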

If I get a lot of PASS email from a SPF-enabled domain that ends up being 
classified as spam anyway, then I can calculate my own probabilities of (spam 
given domain, by domain) through my own receiving-server experience.

I could see major ISPs publishing these things:
5% of SPF-verified email from free-viagra.example.com was classified as good
95% of SPF-verified email from major-software-vendor.example.com was
classified as good
etc., where the %s are absolute %s on a black-white scale or weighted
averages on a gray-scale.
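
A rough sketch of how a receiver might derive such per-domain figures from its
own mail log follows. The function good_rates, the (domain, spf_result,
spam_score) record layout, the 0-to-1 spam score and the 0.5 cutoff are all
assumptions for illustration; "hard" is the black-white percentage and
"weighted" is the gray-scale average.

```python
from collections import defaultdict


def good_rates(messages, threshold=0.5):
    """Percent of SPF PASS mail classified as good, per sender domain.

    messages: iterable of (domain, spf_result, spam_score), where
    spam_score is assumed to lie in [0, 1] (1.0 = certainly spam).
    """
    # Per domain: [message count, hard "good" count, summed goodness]
    totals = defaultdict(lambda: [0, 0, 0.0])
    for domain, result, score in messages:
        if result != "PASS":
            continue  # only SPF-verified mail is counted
        t = totals[domain]
        t[0] += 1
        t[1] += 1 if score < threshold else 0  # black-white decision
        t[2] += 1.0 - score                    # gray-scale contribution
    return {
        d: {"hard": 100.0 * good / n, "weighted": 100.0 * soft / n}
        for d, (n, good, soft) in totals.items()
    }


log = [
    ("free-viagra.example.com", "PASS", 0.9),
    ("free-viagra.example.com", "PASS", 0.8),
    ("major-software-vendor.example.com", "PASS", 0.25),
    ("major-software-vendor.example.com", "FAIL", 0.0),
]
for domain, rates in good_rates(log).items():
    print(domain, rates)
```

The sample log is made up; a real table would come from whatever
classification the receiving server already performs.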

So the numbers could be based on observed feedback rather than "made up."