Re: Deliver-qmails-ondemand

On Tue, 28 Oct 2003 19:35:01 +0530, Madan Ganesh Velayudham said:

      Yes, I understand RFC 1985, talks about Server and Client 
      Interaction. And It is not clear when the client should 
      initiate ETRN.


For SMTP, "client" is always the end starting the connection, and "server" is
the end receiving.  This means that the machine you call your "SMTP server" is
actually operating in both modes all the time - when it accepts inbound mail it
runs in server mode, and (the case you are interested in) when it sends mail
out, it's in client mode.

To avoid this confusion, we usually talk about Mail User Agents (MUAs - the
code that actually talks to the user, usually on a desktop/laptop/etc), Message
Submission Agents (MSAs - the code that accepts mail from an MUA),
Message Transfer Agents (MTAs - the code that moves mail from system
to system), and Message Delivery Agents (MDAs - the code that actually gets
mail into a user's mailbox).  It is quite common for one piece of software to
do multiple parts of this process.

      My comment is about when the SMTP server brings up, it should
      inform its neighbour SMTP servers.


When your MTA comes up, it may want to connect to its neighbor MTAs in client
mode and issue an ETRN to tell them that it is now up, and to initiate a queue
run on the neighboring boxes now rather than at the next scheduled time.
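
As a rough illustration of that startup step, here is a minimal Python sketch
using the standard smtplib module.  The function name request_queue_run, the
smtp_factory parameter, and the host and domain names in the usage note are
made up for the example; a real MTA would do this internally, so treat it as a
sketch of the client-side exchange and nothing more.

```python
import smtplib


def request_queue_run(neighbor, domain, smtp_factory=smtplib.SMTP):
    """Ask `neighbor` to start a queue run for `domain` using ETRN (RFC 1985).

    Returns the (code, message) reply to the ETRN command, or None if the
    neighbor does not advertise ETRN in its EHLO response.
    `smtp_factory` is a hypothetical hook so the function can be exercised
    without a network; it defaults to the real smtplib.SMTP class.
    """
    # We are the *client* here, even though this box is "the mail server".
    with smtp_factory(neighbor, 25, timeout=30) as conn:
        conn.ehlo()
        if not conn.has_extn("etrn"):
            # Neighbor does not offer ETRN; its queue runs on its own schedule.
            return None
        # Sends "ETRN <domain>" and returns the server's reply.
        return conn.docmd("ETRN", domain)
```

Something like request_queue_run("mx.neighbor.example", "mydomain.example")
would be called once per neighbor after the local MTA is ready to accept mail.
A 2xx reply code means the neighbor accepted the request; anything else means
it was refused or not understood.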

As an operational note, this can be *very* dangerous in high-volume
environments if you have multiple neighbors with large queues, particularly
with high-speed network connections and SMTP PIPELINING enabled at both ends.

The basic problem is that most high-volume mail systems are Unix or Linux
based, and often they include the system load average in the "am I too busy?"
calculation. Unfortunately, said average is time-smoothed, and represents the
average load over the last 1/5/15 minutes.  So the server end receives a client
connection, which then starts sending the mail *very* quickly (it's possible to
get well into the hundreds/second if you try hard), while fork()/exec()ing in
the background to do final delivery.  By the time the load average starts
reflecting the problem, the server has already done dozens/hundreds/thousands
of fork/execs and the load average then proceeds to spike.  I've personally
seen an IBM RS6K-250 (one 133MHz 601 processor) (our Listserv box at one time)
bring a large 8-CPU Sun (our main mail hub) to its knees - the fact that the
*average* number of recipients per message was in the hundreds didn't help...
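
To put rough numbers on the lag, here is a small Python sketch of an
exponentially smoothed load average.  It assumes the Linux-style scheme (a
sample of the run queue every 5 seconds, decayed over a 60-second window for
the 1-minute figure); other Unix variants compute it differently.  The function
name smoothed_load and the burst of 200 runnable processes are made up for the
illustration.

```python
import math


def smoothed_load(runnable_per_sample, window_s=60.0, sample_s=5.0):
    """Return the smoothed load average after each sample.

    `runnable_per_sample` is the number of runnable processes seen at each
    sampling tick.  The 5 s tick and 60 s window are assumptions modelled on
    the Linux 1-minute load average, not a claim about any particular system.
    """
    decay = math.exp(-sample_s / window_s)
    load = 0.0
    history = []
    for runnable in runnable_per_sample:
        # Old value fades by `decay`; the new sample gets the remaining weight.
        load = load * decay + runnable * (1.0 - decay)
        history.append(load)
    return history


# Hypothetical burst: an idle box suddenly has 200 runnable delivery
# processes, and they stay runnable for a full minute (12 samples).
burst = smoothed_load([200] * 12)
```

Under those assumptions the reported load is only about 16 at the first sample
and about 31 at the second, while the box is really juggling 200 processes.  An
"am I too busy?" check keyed to that number keeps accepting mail for many
seconds after it should have stopped.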
