
Re: Mail Data termination

2011-08-21 15:29:11

--On Sunday, August 21, 2011 14:54 +0200 Paul Smith
<paul@pscs.co.uk> wrote:

> If the receiver does load limiting by restricting the number
> of open connections, then it is selfish for the sender to
> presume that it can keep hold of one of these connections for
> longer than necessary.
>
> I agree that a few seconds is probably not significant, but if
> 5 seconds grows to 15 seconds, then 60 seconds etc., then it
> could become a big issue for some servers. OK, so the senders
> also have larger numbers of open connections, but this is
> under their control, not the receiver's.
>
> The sender also has the privilege of being able to decide when
> and what to send, whereas the receiver has to respond in a
> timely manner to the sender's requests - thus the sender is
> under much less pressure than the receiver. The sender can
> keep 1000 connections open and know that it will only send to
> one at a time, but if the receiver has 1000 connections open it
> has to be able to respond to all of them quickly, even if all
> 1000 decide to send a message at the same time. The receiver
> has to respond quickly, to stop the sender (which generally
> uses very short timeouts, regardless of what 5321 says) from
> timing out prematurely and retrying unnecessarily.


First, no one has really suggested "grows to 15 seconds, then 60
seconds etc.".  As far as I can tell from this thread, there is
general agreement that deliberately waiting a minute or more
would be abusive.

Second, since you have already moved us into the territory of
"... regardless of what 5321 says", let me suggest something
for the design of such receivers.  Note that, while I suggest
doing it, I'd be strongly opposed to suggestions to incorporate
it into 5321bis.  As you have probably gathered from my earlier
note, I'm rather a big fan of basing operational decisions on
statistical prediction from recent history.  It would seem
perfectly rational to me for a receiver to measure and keep
track of how long it takes a sender within a given SMTP session
to issue another command after the receiver sends back a reply.
For an SMTP session, that gives the receiver an absolute minimum
of five measurements before EOD arrives (EHLO, MAIL, RCPT, DATA,
beginning of data stream after the 354 is sent).  For a
multiple-recipient message or multiple messages in the same
session, the number of measurements gets bigger.   Now, suppose
the median of the measurements taken is X and the first upper
hinge* is Y.  I'd think that a receiver, after sending a 250
response to EOD, would be perfectly rational to assume that the
sender had either lost touch with the connection or was screwing
with it after waiting an order of magnitude or so more than Y.

* Tukey's term for, roughly, the 75th percentile of the
measurements.
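
That measure-and-threshold idea could be sketched roughly as
follows (a hedged illustration only, not anything proposed for
5321bis; the function name, the factor of ten, and using Q3 from
`statistics.quantiles` as a stand-in for the upper hinge are all
my assumptions):

```python
import statistics

def idle_threshold(inter_command_times, factor=10.0):
    """Idle cutoff for one SMTP session: an order of magnitude
    (`factor`) beyond the upper hinge of the sender's measured
    response times.  Q3 from statistics.quantiles() stands in
    for Tukey's upper hinge here.
    """
    _q1, _median, q3 = statistics.quantiles(inter_command_times, n=4)
    return factor * q3

# The minimum five measurements: sender's delay after the replies to
# EHLO, MAIL, RCPT, DATA, and the start of data after the 354.
times = [0.45, 0.50, 0.48, 0.55, 0.52]   # seconds; hypothetical session
print(idle_threshold(times))             # ~5.35 s for this sender
```

With stable half-second responses, the cutoff this produces lands
right around the five or six seconds discussed elsewhere in the
thread.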

Note that, if the sender took a half-second to respond with a
new command (or data stream) after receiving a response code and
its response times were fairly stable, the five or six seconds
people have been talking about would be pretty much on target
with that model of analysis.   Whether one used that sort of
formula or not, the point is that, if one were a receiver
concerned enough about the interaction of per-sender behavior
with attacks or abuse, it would be pretty easy to differentiate
between a sender that was holding a connection open deliberately
and for an excessive time to see if more traffic came in and one
that was just slow.  What to do with that knowledge would
obviously be a design or operational issue for the receiver.
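
One way such a receiver might sketch that differentiation (the
helper name, the factor of ten, and treating Q3 from
`statistics.quantiles` as the upper hinge are my assumptions, not
anything from the thread or from 5321):

```python
import statistics

def looks_deliberate(inter_command_times, idle_seconds, factor=10.0):
    # Hypothetical helper: True if the idle time after the 250 reply
    # to EOD is an order of magnitude beyond the upper hinge (~Q3) of
    # this session's measured sender response times -- i.e. the sender
    # is probably holding the connection open on purpose (or has lost
    # it), rather than merely being slow.
    _q1, _median, upper_hinge = statistics.quantiles(inter_command_times, n=4)
    return idle_seconds > factor * upper_hinge
```

What the receiver does when this returns True (close the
connection, deprioritize the sender, just log it) remains, as
noted, a design or operational choice.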

This is _not_ a standards problem, at least IMO.

