Re: Performance issues


On Feb 16, 2004, at 12:30 PM, Hector Santos wrote:

That costs much, much more. Validating a RCPT TO can include a long list
of things including checking existence, account status, quotas, etc.
Checking SPF records requires almost 0 CPU and our evidence shows that
the average check requires about 6 DNS requests... That is 6 UDP packets
out and 6 in.
 [ ... snip ... ]
It's not a matter of CPU cycles, but total transaction time that is
important in a high-scale system. The faster you perform the transaction,
the better it is for scalability.

Yes and no. The faster you "perform" the transaction, the better the scalability. The truth is that while the transaction may take 1, 10 or 60 seconds, you are really only "performing" for about 10ms -- the rest of the time you can be performing other tasks. The total transaction time is (almost) irrelevant. Let's analyze:

Regardless of the approach, the disk I/O is constant -- you invariably need to write the message to disk and later delete it, or decide not to accept it in the first place.

The network I/O is a latency issue, not a saturation issue, as a single 100Mbps connection can sustain around 24k DNS queries/second. So, we are talking much, much less here. The latency is much more of an issue due to the occasional dropped packet and the more frequent "missing" DNS server. Assuming that your app doesn't sit on its thumbs while it waits, then it falls back to CPU resources.

Let's assume that a server handles 1 million messages per hour. If each transaction takes 1 second, I need an average of 278 sessions open at a time. Obviously, if each took 1 full second of CPU time, we'd be in a whole world of hurt. But the fact is, they don't. If you do the math backwards (assuming a dual CPU system), we process 278 messages/second on 2 CPUs, or 139 messages/second/CPU, or 7ms of CPU time per message. Now, it's clear that 7ms is much less than the 1 second "transaction" to receive the mail. So while a lot of clever stuff happens in that time frame (SPF1, DNS RBL, RFC 2822 validation, MIME parsing and validation), it is still pretty clear that what is _really_ happening is a whole lot of waiting.
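The arithmetic above can be checked with a few lines of Python. This is only a sketch of the back-of-the-envelope math in this mail; the function names are made up for illustration and are not part of any MTA.

```python
def concurrent_sessions(messages_per_hour, seconds_per_transaction):
    """Average number of sessions open at once (arrival rate * duration)."""
    arrival_rate = messages_per_hour / 3600.0  # messages per second
    return arrival_rate * seconds_per_transaction


def cpu_ms_per_message(messages_per_hour, cpus):
    """CPU budget per message, in ms, if the CPUs are kept fully busy."""
    # Computed as (cpu-milliseconds available per hour) / (messages per hour).
    return cpus * 3600.0 * 1000.0 / messages_per_hour


if __name__ == "__main__":
    print(round(concurrent_sessions(1_000_000, 1)))   # about 278 sessions
    print(cpu_ms_per_message(1_000_000, 2))           # about 7.2 ms
```

The 7ms figure is a ceiling, not a measurement: it is the most CPU time each message could use before two CPUs fall behind at that rate.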

To extend this further: Why do I care if my transactions take 1 second or 60 seconds? They are unlikely to take more than 7ms of CPU time each. The only ramification of 60 second transaction times is that you will have 59993ms of lull instead of 993ms -- resulting in 16680 concurrent SMTP sessions... That's well within the bounds of commodity hardware: http://www.kegel.com/c10k.html
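To illustrate the "whole lot of waiting" point, here is a toy sketch using Python's asyncio. It is not how any particular MTA is written; fake_session is a hypothetical stand-in that only sleeps, playing the role of a session stalled on DNS or the remote peer.

```python
import asyncio
import time


async def fake_session(wait_seconds):
    # Hypothetical stand-in for an SMTP session: almost all of its
    # lifetime is spent waiting, which costs no CPU in an event loop.
    await asyncio.sleep(wait_seconds)
    return 1


async def run_sessions(count, wait_seconds):
    # Start all sessions at once and wait for every one to finish.
    results = await asyncio.gather(
        *(fake_session(wait_seconds) for _ in range(count))
    )
    return sum(results)


def timed_run(count, wait_seconds):
    """Return (sessions handled, wall-clock seconds elapsed)."""
    start = time.monotonic()
    handled = asyncio.run(run_sessions(count, wait_seconds))
    return handled, time.monotonic() - start


if __name__ == "__main__":
    handled, elapsed = timed_run(1000, 0.2)
    print(handled, "sessions in", round(elapsed, 2), "seconds")
```

A thousand sessions that each "take" 0.2 seconds finish in roughly 0.2 seconds of wall-clock time, not 200, because the waits overlap. Lengthening the wait raises the number of sessions open at once, not the CPU work done.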

To put all this in perspective, a million messages/hour taxes a storage system pretty heavily. Even with a nice storage setup, you would be hard pressed to sustain more than 4 or 5 million messages/hour throughput on a single Intel box. While you can't have unlimited concurrency on a system, you should be able to scale recent releases of Linux, FreeBSD and even Windows up in excess of 200k file descriptors (read: sessions). If we do the math here, we see 5 million messages/hour and 200k open sessions gives us a maximum transaction duration of 144 seconds.

In my previous mail I said that transaction time was irrelevant assuming it is bounded and reasonable. 144 seconds is a reasonable upper bound -- if you plan on pushing 5 million messages/hour. Of course, if your throughput is less, your bound is higher.
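The bound works out as follows. This is a sketch of the same arithmetic, with an illustrative function name, taking the 200k session limit from above as given.

```python
def max_transaction_seconds(max_sessions, messages_per_hour):
    """Longest average transaction that still fits within the session limit."""
    # sessions = (messages_per_hour / 3600) * duration, solved for duration
    return max_sessions * 3600.0 / messages_per_hour


if __name__ == "__main__":
    print(max_transaction_seconds(200_000, 5_000_000))  # 144.0 seconds
    print(max_transaction_seconds(200_000, 1_000_000))  # 720.0 seconds
```

The second line shows the "less throughput, higher bound" case: at 1 million messages/hour the same 200k sessions allow transactions of up to 720 seconds.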

SPF will catch more fraudulent spam than those RBLs, so
many of those RBL requests will disappear -- as the SPF test will fail
first.
It would be nice to believe this, but I personally doubt it. This is
only my opinion of course, but SPF is not going to make other protocol
level methods obsolete, and certainly SPF is not going to convince the
site operators that they had better get another line of (life) "work."
In the end, in my view, SPF or LMAP based proposals are a short-term
kludge with a high barrier of entry vs. what they are trying to
accomplish/address - a problem with a diminishing half-life. If the
ultimate result is not presumed, then we are just beating a dead horse. :-)

You misunderstood. SPF won't obsolete other mechanisms. My point was that SPF will catch a lot of traffic that would otherwise be hit by a DNS RBL. So if I have 1000 client connections all from unique IPs, sending mail from @aol.com (which is very common), I have a nice result. With DNS RBL first, I need to look up each of these IP addresses. With SPF, I need to look up AOL once, then apply. If I do my SPF check first and use my cached AOL TXT record, I can refuse these connections with 0 DNS requests -- as opposed to two (one A and one TXT) for DNS RBL.
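A small counting sketch of that scenario, under simplifying assumptions: every connection fails SPF and is refused before any RBL check, each RBL check costs two queries per distinct IP, and fetching a domain's SPF record counts as one query (a real SPF evaluation may need more). The function names and addresses are invented for illustration.

```python
def rbl_first_lookups(connections, queries_per_ip=2):
    """DNS queries if every distinct client IP is checked against an RBL."""
    # Assumes one A and one TXT query per distinct IP.
    unique_ips = {ip for ip, _domain in connections}
    return queries_per_ip * len(unique_ips)


def spf_first_lookups(connections, cached_domains=()):
    """DNS queries if the sender domain's SPF record is fetched once and cached."""
    cache = set(cached_domains)
    lookups = 0
    for _ip, domain in connections:
        if domain not in cache:
            lookups += 1          # first sight of this domain: fetch its record
            cache.add(domain)     # later connections reuse the cached record
    return lookups


if __name__ == "__main__":
    # 1000 connections from unique (made-up) IPs, all claiming @aol.com
    conns = [(f"198.51.{i // 250}.{i % 250 + 1}", "aol.com") for i in range(1000)]
    print(rbl_first_lookups(conns))                 # 2000
    print(spf_first_lookups(conns))                 # 1
    print(spf_first_lookups(conns, {"aol.com"}))    # 0
```

The savings depend entirely on how many connections share a sender domain; with 1000 distinct domains the SPF-first count would be 1000 fetches instead of 1.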

Anyway, thanks for your input. I was just providing my own insight in our
LMAP implementation - "Things to consider."

I completely agree with your point on making SPF records as efficient as possible. AOL's record, for example, is quite quick to process.


While pobox.com is much more expensive ;-)

// Theo Schlossnagle
// Principal Engineer -- http://www.omniti.com/~jesus/
// Postal Engine -- http://www.postalengine.com/
// Ecelerity: fastest MTA on earth