Re: draft-atkins-smtp-traffic-control




--On Wednesday, November 02, 2011 19:25 +0800 Tim Kehres
<tim(_at_)kehres(_dot_)com> wrote:

From: "John C Klensin" <john+smtp(_at_)jck(_dot_)com>

The "retry strategy" text in RFC 5321 was written more on the
assumption that most cases would be connection failures,
rather than 4yz-induced retries.  It does not differentiate
between the two because, when 4yz really meant "system down
or going down" or "no service right now" on a global, rather
than sender-selective basis, it was reasonable to assume that a
retry within a short period of time would get a connection
failure.


Understood, however the use of 4yz during the RCPT TO: phase
has been common for a long time now.   5321 seems to also
acknowledge this in stating the following definition for the
450 response:

   450  Requested mail action not taken: mailbox unavailable (e.g.,
        mailbox busy or temporarily blocked for policy reasons)


Yes.  I know.  I wrote that text.  I'm not suggesting that this
newer usage is invalid, or even wrong.  I'm just suggesting that
(i) it is not precisely in line with the original model (which
may or may not be a problem) and (ii) that we did not rewrite
the discussion of the retry/queuing model, nor the maximum and
minimum timeout discussions, to reflect uses radically different
from "machine going down".  Perhaps we should have done that,
but we didn't.  From one point of view, the problem with this
practice vis-a-vis 5321 isn't that the statement you cite is
there but that the WG didn't follow up with a reanalysis of
waiting times.

Am I being too simplistic in thinking we could formalize a
retry= extension that could be used here (as well as other
contexts)?   In practice many MTAs simply tag each queue item
with a timestamp after which a retry is allowed, with actual
queue runs being scheduled on a regular basis (skipping over
messages not yet available for retry).  For these types of
systems, having the client read the response and adjust the
retry time tag should not be that difficult, and should not
rely upon any queuing system changes.  This could apply for greylisting,
server not accepting mail (for whatever reasons), pizza and
beer break, or other unanticipated outages (either planned or
automated).   On the protocol side, no need for extending the
basic response codes either.
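
To make the idea concrete, here is a rough sketch of how a client
might apply such a hint to a timestamp-tagged queue.  Nothing here is
standardized: the "retry=<seconds>" syntax in the reply text, the
names (RETRY_HINT, QueueItem, retry_delay, defer, runnable) and the
24-hour cap are all assumptions made up for illustration.  The
30-minute fallback reflects the minimum retry interval that RFC 5321
recommends.

```python
import re
import time
from dataclasses import dataclass

# Hypothetical syntax: "retry=<seconds>" anywhere in the reply text.
# No such extension is defined; this is only an illustration.
RETRY_HINT = re.compile(r"\bretry=(\d+)\b", re.IGNORECASE)

DEFAULT_RETRY = 30 * 60    # fallback: RFC 5321 suggests at least 30 minutes
MAX_HINT = 24 * 60 * 60    # assumed local cap so a hint cannot park mail forever


@dataclass
class QueueItem:
    message_id: str
    not_before: float = 0.0  # epoch seconds; queue runs skip the item until then


def retry_delay(reply: str) -> int:
    """Return the number of seconds to wait after a transient (4yz) reply."""
    code = reply[:3]
    if not (len(code) == 3 and code.isdigit() and code[0] == "4"):
        raise ValueError("not a 4yz reply: %r" % reply)
    match = RETRY_HINT.search(reply)
    if match is None:
        return DEFAULT_RETRY  # server gave no hint
    return min(int(match.group(1)), MAX_HINT)


def defer(item: QueueItem, reply: str, now=None) -> None:
    """Adjust only the item's retry time tag; the queue runner is unchanged."""
    now = time.time() if now is None else now
    item.not_before = now + retry_delay(reply)


def runnable(queue, now):
    """One scheduled queue run: pick the items whose retry time has arrived."""
    return [item for item in queue if item.not_before <= now]
```

For example, after defer(item, "450 4.7.1 Greylisted retry=300"), the
item is skipped by queue runs for the next five minutes, while a
reply with no hint falls back to the default interval.  The queue
runner itself never looks at the reply, which is the point made
above about not needing queuing system changes.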


Let me try to be clear.   Again, I'm not advocating anything (or
pushing back on anything).  I just want to encourage people to
examine all of the options and tradeoffs.  So...

(1) Could we formalize such an extension?  Yes.  No problem.
Well within the extension model.  Whether we can reach consensus
on the details is a separate question but I believe the answer
is probably "yes", especially if it isn't purely an anti-spam
measure.

(2) Is it necessary to extend the basic response codes?  No.
There might be some advantages (as well as disadvantages) to
doing so.  IMO, someone needs to work out the use cases, the
relationships among various combinations of extended and
unextended servers, and so on and sort that out.

(3) Would such an extension be useful enough in practice, and
implemented and deployed enough in practice, to be worth the
trouble?  I don't know and don't have an opinion.

best regards,
    john