spf-discuss

Re: My notes from FTC Summit with statistics (was: Sendmail white paper)

2004-11-26 05:36:46
On Fri, 26 Nov 2004, David Woodhouse wrote:

Thanks for posting your notes. I'm not sure what conclusions to draw
from them -- I can't really see whether these data confirm or refute my
assumptions. Which are that the added CPU load will be partly offset by
the reduced CPU load needed for other things like spam and virus
checking on the mails which the crypto scheme allows you to reject, and
that the resultant CPU load increase is easily accommodated by the fact
that these machines weren't highly CPU-bound in the first place.

I was looking ideally for something like a comparison of CPU
utilisation. Watch the machine for a day with a real workload, while it
does spam and virus checking. Then take a copy of everything it received
and DK-sign it as appropriate (sign all valid mail, attempt to sign a
tiny proportion of the spam with an incorrect signature). Presumably
leave most of the spam unsigned. Then play it back to an identically
configured machine and watch the CPU load on that as it does all the DK
stuff too.

If I remember correctly, Sendmail also said that they ran their own tests and 
that the increased CPU load on any single server due to signing all emails was 
about 25%. I think this may have been with smaller 384- or 512-bit keys.
We should probably assume that good-sized keys (at least 768 bits) would mean 
around a 50% increase in CPU load (and to me that is an acceptable number).
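
If anybody wants a rough feel for how signing cost scales with key size, here is 
a minimal sketch in Python using the 'cryptography' package. It is not the 
Sendmail test setup and it ignores canonicalization and DNS lookups; the message 
body, iteration count, and key sizes below are just placeholders (the library 
will not even generate 384-bit keys):

import time
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# a made-up message body, roughly the size of a small email
message = b"From: test@example.com\r\nSubject: hello\r\n\r\nbody line\r\n" * 20

for bits in (512, 768, 1024):   # 384 is below the library's minimum key size
    key = rsa.generate_private_key(public_exponent=65537, key_size=bits)
    start = time.perf_counter()
    for _ in range(200):        # sign the same message repeatedly and average
        key.sign(message, padding.PKCS1v15(), hashes.SHA1())
    elapsed = time.perf_counter() - start
    print(f"{bits}-bit key: {elapsed / 200 * 1000:.2f} ms per signature")

The per-signature time this prints is only the raw RSA cost on one box, so the 
percentages above still depend on what else the server is doing.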

I do not think that the cost of signing and verifying will be offset by not 
needing to do virus and spam checking, at least not immediately. I have to 
assume that spammers are probably not going to sign their emails with bad 
keys, but they may sign them with their own good keys, and that means you 
still need to do all the same virus and spam testing after the signature 
verification.

You also will not be able to reject email based on "I sign all my email" 
policy records for DomainKeys, because DK-signed emails will fail after
being processed by almost any mailing list, and that means people who want
to protect their domains from being spoofed really cannot do it with DK.
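
To make the mailing-list problem concrete, here is a toy sketch (again Python 
with the 'cryptography' package, not the real DomainKeys canonicalization or 
header signing; the key size, body, and footer text are invented): a signature 
computed over the original body no longer verifies once list software appends 
a footer.

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.exceptions import InvalidSignature

key = rsa.generate_private_key(public_exponent=65537, key_size=1024)
body = b"Hello list,\r\nhere is my message.\r\n"
signature = key.sign(body, padding.PKCS1v15(), hashes.SHA1())

# the list software appends a footer before redistributing the message
modified = body + b"--\r\nspf-discuss mailing list\r\n"

try:
    key.public_key().verify(signature, modified, padding.PKCS1v15(), hashes.SHA1())
    print("signature still verifies")
except InvalidSignature:
    print("signature broken by the list footer")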

But mail signatures are going to be good in the future for "hanging" 
accreditation onto them and for reputation systems (and they are a lot
better for that than SPF or SID, which can't be trusted for true identity
verification). That means that at some point in the future you may be able to 
use the information that an email was signed and verified, go to a third party 
that will tell you something about the party that signed the email, and
after that not need to do additional spam tests. But it may take
quite some time before we come to this point and then see benefits from
automated mail signatures in reducing extra filtering CPU load.

-------------------------------------------------------------------

And getting back to SPF, I took some notes on the statistics that
were presented as well. 

1. GoDaddy statistics (?)
 7% of emails to GoDaddy have SPF records
 18% of emails are rejected based on SPF
 14% of SPF emails are from known spammers

2. Earthlink numbers 
 90% of emails that pass SPF are spam
 90% of emails that fail SPF are spam

So 10% of the mail that fails SPF is valid? That's a lot. Or does 'spam'
in this context not include certain classes of unwanted stuff, like
viruses?

Not sure, ask Earthlink (look at the FTC website for the panelist from Earthlink; 
there was only one). My guess is that they meant that 10% of the emails 
that failed SPF were not identified as spam by some other spam 
testing system, which does not necessarily mean they are not spam (i.e. not
every spam email is identified as spam by automated means). 

I do know that Earthlink said they were not rejecting emails based on SPF
and that these tests were run only on a portion of their emails, for testing
purposes only. That is unlike GoDaddy, who are actually rejecting emails 
based on "-all" SPF records.
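
For anyone not following the "-all" part: it is the mechanism at the end of an 
SPF record that tells receivers to hard-fail mail from any host the domain did 
not list. A very simplified Python sketch of the receiver-side decision 
(hypothetical record and addresses, ignoring includes, macros, and DNS lookups):

# hypothetical published record for example.com:
#   "v=spf1 ip4:192.0.2.10 -all"
permitted = {"192.0.2.10"}           # hosts allowed to send for the domain

def spf_result(sending_ip: str) -> str:
    if sending_ip in permitted:
        return "pass"
    return "fail"                    # "-all": every other host hard-fails

print(spf_result("192.0.2.10"))      # pass -> accept
print(spf_result("203.0.113.7"))     # fail -> GoDaddy-style reject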

 40% of emails from domains that do not publish SPF are spam

What I'd really like to see in these statistics is the number of spam
mails which are rejected by SPF which _wouldn't_ have been rejected by
other means, vs. the number of false rejections, which is possibly the
10% quoted above, although as I said, that seems higher than I expected.

Yes, that would be good to see, but I would assume it requires a human to
look at every one of the emails from these 10% (rejected only by SPF)
and decide if it is spam or not. 

-------------------------------------------------------------------
On a separate note sheet I have the following from the 1st day:

I see the jesters are out in force again :)

They weren't the only ones; that guy just got me the most heated up after the 
first panel session (even more so than the MS lawyer on that panel).
 
No need to scan the paper of which you speak -- I think it's on their
web site.
Yep, here it is:
http://www.actonline.org/documents/041109%20ACT%20Whitepaper%20on%20OSS%20and%20Open%20Standards%20FINAL.pdf

BTW - the point about them is that you can't assume who they are and what they
stand for based on the name, i.e. if you base it on what they seem to stand 
for you might need to call them "Association for Anti-Competitive Technology"
:)

The reason I mention it is that some here were talking about renaming SPF
into some "safe mail standards association" (I rephrased on purpose). Well:
 1. don't, you might be laughed at in the same way as with actonline
 2. it's not about the name anyway, it's about what we do and how

-- 
William Leibzon
Elan Networks
william@elan.net