Re: Suggest New Mechanism Prefix NUMBER to Accelerate SPF Adoption


----- Original Message ----- 
From: "AccuSpam" <support(_at_)accuspam(_dot_)com>
To: <spf-discuss(_at_)v2(_dot_)listbox(_dot_)com>
Sent: Wednesday, August 25, 2004 5:54 PM
Subject: Re: [spf-discuss] Suggest New Mechanism Prefix NUMBER to Accelerate SPF Adoption

At 03:41 PM 8/25/2004 -0400, Scott Kitterman wrote:

Before we plunge off on this tangent again, is there anyone out there
writing an SPF parser that would make any use of this "added information"?


Hoping for a "yes" answer, because I think it is "must do even if I am too
busy" because as you can see from previous post:


http://archives.listbox.com/spf-discuss(_at_)v2(_dot_)listbox(_dot_)com/200408/1072.html

That the large ISP has no choice but to set "?all" (useless setting),
because anything other than "-all" will get all kinds of incorrect
mathematical assumptions, which will lead to false positives.  If it takes
years for ISPs to transition to "-all" (100% confidence), then SPF is held
back for years in a useless state (for email from those ISPs).

This is not necessarily true. SpamAssassin 3.0, for example, includes scoring
the SPF fields. It lacks the computation and bandwidth benefits of rejecting
the spam from the start, but it's handy nonetheless. It can be particularly
used to offset the whitelist score for messages allegedly from your own
domain.
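
As a rough illustration of scoring rather than rejecting, here is a minimal
Python sketch. The score values, the threshold, and the function names are
made up for illustration and are not SpamAssassin's actual rules or defaults.
The point is only that an SPF result becomes one weighted input, and that a
failing result can withhold the whitelist bonus from mail claiming to be from
your own domain.

```python
# Hypothetical per-result scores; NOT SpamAssassin's real rule names or values.
SPF_SCORES = {
    "pass": -0.5,
    "neutral": 0.0,   # domain publishes "?all"
    "softfail": 1.5,  # domain publishes "~all"
    "fail": 4.0,      # domain publishes "-all"
    "none": 0.0,      # no SPF record
}

OWN_DOMAIN_WHITELIST_BONUS = -5.0  # assumed local whitelist score
SPAM_THRESHOLD = 5.0               # assumed local threshold


def score_message(spf_result, claims_own_domain, other_score):
    """Combine an SPF result with the score from all other filters."""
    score = other_score + SPF_SCORES.get(spf_result, 0.0)
    # Apply the whitelist bonus only when SPF does not contradict the claim.
    if claims_own_domain and spf_result not in ("softfail", "fail"):
        score += OWN_DOMAIN_WHITELIST_BONUS
    return score


def is_spam(spf_result, claims_own_domain, other_score):
    """True when the combined score reaches the assumed local threshold."""
    return score_message(spf_result, claims_own_domain, other_score) >= SPAM_THRESHOLD
```

With these assumed numbers, a message claiming your own domain keeps its
whitelist bonus on an SPF pass but loses it on a softfail, so the same content
score can land on either side of the threshold.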



You missed my point.

My point is that if even some receivers are going to assume that "~all"
means xx% chance of forgery for ALL domains, where for some domains it
really is yy% and others it is zz% and others...etc.

Your point is fairly irrelevant. ?all is not a useless state: other factors
can and do get used with the evidence of a softfail message or a pattern of
them for tuning your local Bayesian filters, or submitting the IP addresses
or actual hostnames to a blacklist, or a reputation system.

then what happens is that the large ISP (who can not risk the false
positives from an incorrect probability assumption of the receiver) can not
risk setting "~all".
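
To put the worry above in arithmetic terms, here is a minimal Python sketch.
Every number, domain name, and function name in it is hypothetical; none come
from this thread or from measurement. It assumes a receiver that rejects all
softfail mail whenever its single assumed forgery rate reaches a threshold,
which is only one of the ways a receiver might behave.

```python
def legit_mail_rejected(softfail_messages, actual_forgery_rate,
                        assumed_forgery_rate, reject_threshold):
    """Count legitimate softfail messages lost when a receiver decides on
    one assumed forgery rate instead of the domain's actual rate."""
    if assumed_forgery_rate < reject_threshold:
        return 0  # receiver does not reject on softfail at all
    # Receiver rejects every softfail; the non-forged share is lost.
    return round(softfail_messages * (1.0 - actual_forgery_rate))


# Hypothetical domains: name -> actual share of softfail mail that is forged.
domains = {"isp-a.example": 0.95, "isp-b.example": 0.40}

for name, actual in domains.items():
    lost = legit_mail_rejected(1000, actual,
                               assumed_forgery_rate=0.90,
                               reject_threshold=0.80)
    print(name, "legitimate messages rejected per 1000 softfails:", lost)
```

The sketch models outright rejection only. A receiver that scores softfail
alongside other evidence, rather than rejecting on it, would not lose mail
this way.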

And this is fairly silly, since that is what the heck tuning and other
filters are for. If using such filtering really, really does catch a lot of
the spam from a domain, especially virally submitted or zombie-submitted
spam from a big domain, they have every incentive to go to -all ASAP. By
leaving their filters at ~all, if they need a switchover stage, they allow
the opportunity to not actually block the mail short term but to allow the
mail to be individually scored or whitelisted. The filters and scores on the
receiving end have plenty of flexibility to allow the recipient to score
appropriately.

Go see my first post for example numbers of how the incorrect assumptions
could drastically cause false positives.


And ditto for "?all" if you will assign a percentage other than 0.5.

So then ISP is left only with "-all" or "+all" or do not do SPF.


And I can make up numbers not based on experience to produce fake results,
too.

Log analysis of the SMTP server can also reveal sites that have patterns of
forging email and should be banned outright, or are virus-laden and the
administrators should be contacted, or even reveal SPF violations of
outgoing email that should lead to contacting the on-site sender.
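
A minimal Python sketch of that kind of log analysis follows. It assumes the
log has already been parsed into (client IP, SPF result) pairs; the record
format, the thresholds, and the function name are illustrative assumptions,
not the output of any particular mail server.

```python
from collections import Counter


def flag_forging_sources(records, min_failures=20, min_fail_ratio=0.9):
    """Return client IPs whose mail mostly fails or softfails SPF.

    records: iterable of (client_ip, spf_result) tuples, assumed to be
    parsed from the SMTP log beforehand.
    """
    total = Counter()
    failed = Counter()
    for ip, result in records:
        total[ip] += 1
        if result in ("fail", "softfail"):
            failed[ip] += 1
    # Flag an IP only with enough failures AND a high failure ratio,
    # so an occasional misconfigured forwarder is not flagged.
    return sorted(
        ip for ip in total
        if failed[ip] >= min_failures
        and failed[ip] / total[ip] >= min_fail_ratio
    )
```

The flagged list is a starting point for a human decision (block, contact the
administrators, or feed a local blacklist), not an automatic verdict.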



If you have 10% of the internet email, then yes you might be able to make
a reasonable approximation.  But anything less than a statistical approach
(not bayesian content analysis) will not be meaningful on the aggregate.

??? OK, now you're wildly confused. Such spew is normally a big cluster of
spam or email worm messages sent in a very short order. Most of us simply do
not care if one in a dozen, or one in a thousand of messages from a single
IP are spam. We block it, or we let the spam filters score it and make the
correlations.

Yes you often can single out extreme cases with smaller sample (just look
at how standard deviation is defined), but this does not help you in terms
of the majority of cases.

The "extreme" cases *are* the majority of the cases. Take a look at a raw
mail log and analyze it with SPF before it gets blocked by a DNS blacklist.
You'll see that where SPF is in use, a softfail has, at least visually, a
very good correlation with a spam source or virus source that would be
blocked by other means, and with other factors such as no reverse DNS
name, bad target email addresses, forged and non-existent hostnames, etc.

I, for one, would like you to shush for a week and learn something instead



Why is there such resistance to someone else having a good idea?  Why
can't we just evaluate this on the merits?  Do you have any formal
background in probability theory?

MIT undergrad coursework, Professor Rota. I got an A. Considerable digital
signal analysis and systems work since then, although I tended to focus on
pathological cases where your systems are lying to you.

Please don't flame me.  I am not flaming you.


No, you're just blithering ill-founded analyses on a daily basis. Given your
original mischaracterizations of the nature of SPF, your mis-handling of
your own personally created anti-virus software dealing with this mailing
list, and your extremely odd ideas about how SMTP is normally administered,
I would frankly prefer a well-founded flame.

Take a week off and learn some things first.