spf-discuss

Re: Performance issues

2004-02-16 18:01:12

On Feb 16, 2004, at 12:30 PM, Hector Santos wrote:
That costs much much more. Validating a RCPT TO can include a long list
of things including checking existence, account status, quotas, etc.
Checking SPF records requires almost 0 CPU and our evidence shows that
the average check requires about 6 DNS requests... That is 6 UDP packets
out and 6 in.

 [ ... snip ... ]

It's not a matter of CPU cycles, but total transaction time that is important in a high scale system. The faster you perform the transaction, the better it is
for scalability.

Yes and no. The faster you "perform" the transaction, the better the scalability. The truth is that while the transaction may take 1, 10 or 60 seconds, you are really only "performing" for about 10ms -- the rest of the time you can be performing other tasks. The total transaction time is (almost) irrelevant. Let's analyze:

Regardless of the approach, the Disk I/O is constant -- you invariably need to write the message to disk and later delete it or decide not to accept it in the first place.

The network I/O is a latency issue, not a saturation issue, as a single 100Mb/s connection can sustain around 24k DNS queries/second. So we are talking about much, much less here. The latency is much more of an issue due to the occasional dropped packet and the more frequent "missing" DNS server. Assuming that your app doesn't sit on its thumbs while it waits, it then falls back to CPU resources.
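To make the "doesn't sit on its thumbs" point concrete, here is a minimal sketch using Python's asyncio. The names fake_session, run_sessions and demo are made up for illustration, and asyncio.sleep stands in for waiting on DNS or the remote peer; this is not how any particular MTA is implemented.

```python
import asyncio
import time


async def fake_session(wait_seconds):
    # Hypothetical stand-in for one SMTP session: almost all of its
    # lifetime is spent waiting on the network, not using the CPU.
    await asyncio.sleep(wait_seconds)
    return 1


async def run_sessions(count, wait_seconds):
    # Start every session at once and wait for all of them to finish.
    start = time.monotonic()
    results = await asyncio.gather(
        *(fake_session(wait_seconds) for _ in range(count))
    )
    return sum(results), time.monotonic() - start


def demo(count=200, wait_seconds=0.2):
    # Returns (sessions completed, wall-clock seconds elapsed).
    return asyncio.run(run_sessions(count, wait_seconds))


if __name__ == "__main__":
    completed, elapsed = demo()
    print(f"{completed} sessions finished in {elapsed:.2f}s")
```

Handled one after another, 200 sessions of 0.2 seconds each would take about 40 seconds of wall-clock time; overlapped, they should finish in roughly the time of a single wait.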

Let's assume that a server handles 1 million messages per hour. If each transaction takes 1 second, I need an average of 278 sessions open at a time. Obviously, if each took 1 full second of CPU time, we'd be in a whole world of hurt. But the fact is, they don't. If you do the math backwards (assuming a dual CPU system), we process 278 messages/second on 2 CPUs or 139 messages/second/cpu or 7ms of CPU time per message. Now, it's clear that 7ms is much less than the 1 second "transaction" to receive the mail. So while a lot of clever stuff happens in that time frame (SPF1, DNS RBL, RFC 2822 validation, mime parsing and validation) it is still pretty clear that what is _really_ happening is a whole lot of waiting.

To extend this further: Why do I care if my transactions take 1 second or 60 seconds? They are unlikely to take more than 7ms of CPU time each. The only ramification of 60 second transaction times is that you will have 59993ms of lull instead of 993ms -- resulting in 16680 concurrent SMTP sessions... That's well within the bounds of commodity hardware: http://www.kegel.com/c10k.html
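For anyone who wants to replay the arithmetic in the last two paragraphs, a small sketch follows. The function names are made up for illustration; the only inputs are the figures already quoted (1 million messages/hour, 2 CPUs, 1 or 60 second transactions). Note that the 16680 figure comes from rounding to 278 messages/second first; the unrounded value is about 16667.

```python
def concurrent_sessions(messages_per_hour, seconds_per_transaction):
    # Average sessions open at once = arrival rate * time each one stays open.
    return messages_per_hour / 3600 * seconds_per_transaction


def cpu_ms_per_message(messages_per_hour, cpus):
    # CPU budget per message if the CPUs are kept fully busy.
    messages_per_second_per_cpu = messages_per_hour / 3600 / cpus
    return 1000 / messages_per_second_per_cpu


if __name__ == "__main__":
    print(round(concurrent_sessions(1_000_000, 1)))    # 1 second transactions
    print(round(concurrent_sessions(1_000_000, 60)))   # 60 second transactions
    print(round(cpu_ms_per_message(1_000_000, 2), 1))  # ms of CPU per message
```

The CPU budget per message does not depend on the transaction duration at all; only the number of open sessions does.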

To put all this in perspective, a million messages/hour taxes a storage system pretty heavily. Even with a nice storage setup, you would be hard pressed to sustain more than 4 or 5 million messages/hour throughput on a single Intel box. While you can't have unlimited concurrency on a system, you should be able to scale recent releases of Linux, FreeBSD and even Windows up in excess of 200k file descriptors (read: sessions). If we do the math here, we see 5 million messages/hour and 200k open sessions gives us a maximum transaction duration of 144 seconds.

In my previous mail I said that transaction time was irrelevant assuming it is bounded and reasonable. 144 seconds is a reasonable upper bound -- if you plan on pushing 5 million messages/hour. Of course, if your throughput is less, your bound is higher.
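The bound is just the session limit divided by the message rate. A small sketch, with a made-up function name and only the figures quoted above as inputs:

```python
def max_transaction_seconds(max_sessions, messages_per_hour):
    # Longest average transaction that still fits under the session limit:
    # sessions available / messages arriving per second.
    return max_sessions * 3600 / messages_per_hour


if __name__ == "__main__":
    print(max_transaction_seconds(200_000, 5_000_000))  # 5 million/hour
    print(max_transaction_seconds(200_000, 1_000_000))  # 1 million/hour
```

At 5 million messages/hour this gives 144 seconds; at 1 million messages/hour the same 200k sessions would allow 720 seconds.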

SPF will catch more fraudulent spam than those RBLs, so
many of those RBL requests will disappear -- as the SPF test will fail
first.

That would be nice to believe, but I personally doubt it. This is
only my opinion of course, but SPF is not going to make other protocol
level methods obsolete, and certainly SPF is not going to convince the site operators that they had better get another line of (life) "work." In the end, in
my view, SPF or LMAP based proposals are a short term kludge with a high
barrier of entry vs what they are trying to accomplish/address - a problem with a diminishing half life. If the ultimate result is not presumed, then we are
just beating a dead horse. :-)

You misunderstood. SPF won't obsolete other mechanisms. My point was that SPF will catch a lot of traffic that would otherwise be hit by a DNS RBL. So if I have 1000 client connections all from unique IPs, sending mail from @aol.com (which is very common), I have a nice result. With DNS RBL first, I need to look up each of these IP addresses. With SPF, I need to look up AOL once, then apply. If I do my SPF check first and use my cached AOL TXT record, I can refuse these connections with 0 DNS requests -- as opposed to two (one A and one TXT) per connection for DNS RBL.
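A rough sketch of that lookup count, for illustration only. CountingResolver, dnsbl_first, spf_first and the zone dnsbl.example are all made up; nothing here talks to real DNS. It also assumes the sender's SPF record can be evaluated from its TXT record alone and ignores TTL expiry, which is a simplification (records with include or mx mechanisms need further lookups).

```python
class CountingResolver:
    """Hypothetical caching resolver that only counts cache misses."""

    def __init__(self):
        self.queries = 0
        self.cache = {}

    def lookup(self, name, rtype):
        key = (name, rtype)
        if key not in self.cache:
            self.queries += 1  # a real resolver would send a DNS query here
            self.cache[key] = f"<{rtype} answer for {name}>"
        return self.cache[key]


def dnsbl_first(resolver, client_ips, zone="dnsbl.example"):
    # DNS RBL keys on the client IP, so every unique IP is a cache miss:
    # one A lookup plus one TXT lookup per connection.
    for ip in client_ips:
        name = ".".join(reversed(ip.split("."))) + "." + zone
        resolver.lookup(name, "A")
        resolver.lookup(name, "TXT")
    return resolver.queries


def spf_first(resolver, client_ips, sender_domain="aol.com"):
    # SPF keys on the sender domain, so all connections share one record.
    for _ in client_ips:
        resolver.lookup(sender_domain, "TXT")
    return resolver.queries


if __name__ == "__main__":
    ips = [f"10.0.{i // 250}.{i % 250 + 1}" for i in range(1000)]
    print(dnsbl_first(CountingResolver(), ips))  # 2 queries per unique IP
    print(spf_first(CountingResolver(), ips))    # 1 query, then cache hits
```

With the AOL TXT record already in the cache, the SPF pass issues no queries at all.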

Anyway thanks for your input. I was just providing my own insight in our
LMAP implementation - "Things to consider."

I completely agree with your point on making SPF records as efficient as possible. AOL's record, for example, is quite quick to process.

While pobox.com is much more expensive ;-)

// Theo Schlossnagle
// Principal Engineer -- http://www.omniti.com/~jesus/
// Postal Engine -- http://www.postalengine.com/
// Ecelerity: fastest MTA on earth

