Re: [Asrg] A New Plan for No Spam / Velocity Indicator

On Thu, 1 May 2003 14:07:39 -0600 (MDT) 
Vernon Schryver <vjs(_at_)calcite(_dot_)rhyolite(_dot_)com> wrote:

From: J C Lawrence <claw(_at_)kanga(_dot_)nu>

...  Loose summary as I recall them: The optimal number of RCPT TOs
per message for best performance outbound delivery is approximately
5, and is most definitely below 10.

Ok, requiring separate transactions would increase the load by 5
times.


No.  The arithmetic isn't that trivial, though it's close.  Think about it a
bit.  While it is a power curve, there's still a heck of a lot of MXes
in the thin end of the wedge, and even more tellingly, the majority of
MX collections are not a multiple of 5 (surprisingly, it's pretty close
to 80% that aren't mod 5).  Without doing the full math (I'll leave that
to the enterprising reader), I'd say the actual factor should come out
somewhere between 2.5 and 3.  

My own non-scientific experiments with VERP with smaller lists support
that estimate.
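
A rough way to see why the factor lands below 5: a transaction that may carry up to 5 RCPT TOs only saves anything when an MX actually has that many recipients queued for the message. The sketch below counts transactions both ways. The recipient distribution is entirely made up for illustration (it is not measured list data), so the resulting ratio only shows the shape of the argument, not the 2.5 to 3 estimate itself.

```python
def transactions(recipients_per_mx, max_rcpt):
    """Count SMTP transactions for one message when each transaction
    may carry at most max_rcpt RCPT TOs to a single MX."""
    # -(-n // k) is integer ceiling division.
    return sum(-(-n // max_rcpt) for n in recipients_per_mx)

# Hypothetical list: number of recipients behind each destination MX.
# Heavy on the thin end of the curve: most MXes have one or two recipients.
recipients_per_mx = [1] * 40 + [2] * 15 + [3] * 8 + [7] * 4 + [12] * 2 + [50]

batched = transactions(recipients_per_mx, 5)   # up to 5 RCPT TOs each
single = transactions(recipients_per_mx, 1)    # one RCPT TO each
load_factor = single / batched

print(batched, single, round(load_factor, 2))
```

With any distribution where many MXes hold fewer than 5 recipients, or a count that is not a multiple of 5, the batched case wastes slots and the load factor falls short of 5.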

...  Given a $2,000US commodity machine it's just not that hard to
sustain 2,500 outbound deliveries a minute these days.

Agreed, today the CPU and disk costs are far lower.  However, if you
can run 2,500 outbound deliveries/minute at 5 addressees/transaction,
you'll only be able to run 500/minute at 1 addressee/transaction.


Nope.  The same math applies again.  2,500 deliveries a minute is not
quite 12,500 addresses delivered per minute due to the same power curve
and modulo problems.  Additionally it ignores the gain from chained SMTP
transactions over the same connection (TCP setup/teardown being a
non-trivial portion of the TCP overhead).
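
As an illustration of chaining, here is a minimal sketch that sends several transactions to the same host over one connection using Python's standard `smtplib`. The function names, the batching, and the host argument are assumptions made for the example; real MTAs do this internally with their own queueing and error handling, which is omitted here.

```python
import smtplib

def deliver_chained(conn, sender, batches, message):
    """Send one SMTP transaction per batch over an already-open connection.

    conn is anything with smtplib.SMTP's sendmail(from_addr, to_addrs, msg)
    method; each call issues its own MAIL FROM / RCPT TO / DATA sequence.
    """
    sent = 0
    for rcpts in batches:
        conn.sendmail(sender, rcpts, message)
        sent += 1
    return sent

def deliver_to_mx(host, sender, batches, message):
    """Open one connection to host and chain all batches over it."""
    # One TCP setup/teardown covers every batch bound for this host.
    with smtplib.SMTP(host) as conn:
        return deliver_chained(conn, sender, batches, message)
```

Even with 1 addressee per transaction, the connection cost is paid once per host rather than once per addressee.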

If each transaction involves a 5 KByte body and an ignored ~200 bytes
of SMTP commands and TCP/IP headers, then a T1 can carry about 38
deliveries/second or 2280 deliveries/minute.  That does not quite jibe
with 2500 deliveries/minute at 5 addressees/delivery, but it's the
right order of magnitude.  Perhaps the fact that I ignored DNS traffic
as well as the TCP/IP headers and SMTP commands and responses is part of
the discrepancy.
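
The T1 figure above can be reproduced with a few lines of arithmetic. This sketch assumes the nominal T1 line rate of 1.544 Mbit/s and reads "5 KByte" as 5,000 bytes; both are assumptions, since the message does not state them. It rounds down to whole deliveries per second, as the text does.

```python
T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate (assumption)
BODY_BYTES = 5_000              # "5 KByte" read as 5,000 bytes (assumption)
OVERHEAD_BYTES = 200            # SMTP commands and TCP/IP headers, per the text

def deliveries_per_minute(bytes_per_delivery):
    """Whole deliveries per second that fit in a T1, scaled to a minute."""
    per_second = (T1_BITS_PER_SECOND // 8) // bytes_per_delivery
    return per_second * 60

ignoring_overhead = deliveries_per_minute(BODY_BYTES)
with_overhead = deliveries_per_minute(BODY_BYTES + OVERHEAD_BYTES)

print(ignoring_overhead)  # 2280, matching the figure in the text
print(with_overhead)      # 2220 once the ~200 bytes are counted
```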


Most of my systems are connected via multiple T3s to a local MAE, PAIX,
or other similarly well-connected nexuses.  Local bandwidth is not a
problem or significant factor for high sustained delivery rates.  A
solid localhost DNS cache with enforced longer minimum TTLs (if your
repeat rate is low) keeps DNS latencies and traffic down.  Add some
tuning of your spool strategies, slow MX domain routing and a few other
tricks and you can keep things running at capacity.
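
The minimum-TTL idea can be sketched as a small cache that clamps every answer's lifetime to a floor. This is only an illustration: the class name, the `resolve` callable (returning an answer and its TTL in seconds), and the one-hour floor are all hypothetical, and a real deployment would configure this in the caching resolver itself rather than in application code.

```python
import time

class MinTTLCache:
    """Cache lookups, keeping each answer for at least min_ttl seconds."""

    def __init__(self, resolve, min_ttl=3600, clock=time.monotonic):
        self._resolve = resolve    # hypothetical: name -> (answer, ttl_seconds)
        self._min_ttl = min_ttl    # enforced floor on entry lifetime
        self._clock = clock
        self._entries = {}         # name -> (answer, expiry_time)

    def lookup(self, name):
        now = self._clock()
        hit = self._entries.get(name)
        if hit is not None and hit[1] > now:
            return hit[0]          # still fresh: no DNS traffic
        answer, ttl = self._resolve(name)
        # Clamp short TTLs up to the floor before storing.
        self._entries[name] = (answer, now + max(ttl, self._min_ttl))
        return answer
```

The trade-off is staleness: a record that changes before the enforced floor expires will be missed, which is why this suits a low repeat rate.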

When you get to those sorts of ranges, and especially when you start
trying to push much past 3,500 deliveries per minute on a commodity box
(~$2K) it's the design and management of the local IO chain that is your
killer.  You very quickly hit a point where throwing more RAM and CPU
into the mix doesn't gain much, but popping IO latency from your DASD is
the big gainer, and that tends to be expensive unless you play fast and
loose with the commit rules.  Wietse Venema demonstrated and charted
this space quite clearly a couple years back.

However, we're getting considerably off topic here.

-- 
J C Lawrence                
---------(*)                Satan, oscillate my metallic sonatas. 
claw(_at_)kanga(_dot_)nu               He lived as a devil, eh?           
http://www.kanga.nu/~claw/  Evil is a name of a foeman, as I live.
_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg