Re: After a 450, queue or try next MX?



From: "Alex van den Bogaerdt"

On Wed, Aug 30, 2006 at 03:49:40PM -0400, Hector Santos wrote:

In our system, we only go to the next record when there is a
connection failure. Otherwise we follow the wishes of 45x or 550.
45x means try again LATER, not within the same transaction attempt
where you have a list of MX to try.


Which definitely sounds reasonable, but would cause unnecessary
delays when the MX itself has a problem.

Consider "450  I am having a problem reaching the DMZ"


Isn't a 421 or 554 more appropriate when there are "operational" issues?

Agreed, but they are still open to differences in interpretation.

two MX hosts:
  0 mail1.example.com
  1 mail2.example.com

I could argue that one single delivery attempt involves trying both hosts.


Right, we also used to do this in the past and it didn't work too well :-)

You are assuming one failed connection to mail1.example.com means one
failed delivery attempt.  I see no such evidence.


No, in our current default setup model, 1 attempt is the entire expanded MX
list.  But we still do have the old option

        [_] Count each MX as 1 attempt. (off by default)

which was tied to when it used to try all of the MX list, once upon a time.
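
To illustrate how that option changes the retry accounting, here is a
minimal sketch. It is not our actual code; the function and parameter
names are hypothetical.

```python
def attempts_used(mx_hosts_tried, count_each_mx=False):
    """Return how many delivery attempts one pass over the MX list consumes.

    count_each_mx mirrors the "Count each MX as 1 attempt" option
    (hypothetical name); it is off by default.
    """
    if count_each_mx:
        # Old behaviour: every MX host contacted counts as its own attempt.
        return mx_hosts_tried
    # Default: the entire expanded MX list counts as a single attempt.
    return 1 if mx_hosts_tried > 0 else 0
```

With two MX hosts tried, the default consumes 1 attempt and the old option
consumes 2.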

RFC1123 again repeats the statement about trying all MXes, in order,
until a delivery attempt succeeds.

It does not say "until a connection to the SMTP daemon succeeds".


No doubt, there is no rule. So I can only express our own experience, which
was finely tuned over decades. I think we have settled on a default setup
that works best for our customers.

I also think it is "common sense" that if a system is down, it is going to
expose this at the connection level or as a response to the connection.
Once the session is established, 99% of the time either the mail is accepted
or not. 45x or 55x is best used for that.  If the session failed at some
state beyond HELO with something other than 45x, 55x, then I can see maybe
trying the next MX.   Our current model reflects this logic.

two MX hosts:
  0 mail1.example.com
  1 mail2.example.com

Suppose mail1.example.com returns "450 failed for whatever reason".  If
you now decide to wait 5 minutes (or 1, or 10) you still MUST(!) try mail2
the next time.  This is the only way I can interpret RFC1123 section 5.3.4
(reliable mail transmission).


That is why I think it is easier to just code it based on operational
FAILURE (not a 45x, 55x session failure).  I still think the more
appropriate response above is a 421, not 450, if you want the other end to
try the next MX.   I think our code will treat the 42x response as a signal
to continue with the next MX, if any.  I have to check that.  Let me see...
Yup, the 421 will make it go to the next MX.
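
The decision logic described in this message can be sketched roughly as
follows. This is only an illustration, not our actual code: the names
Action, next_action, deliver and try_host are hypothetical, and the
fallback for reply codes outside 2xx, 42x, 45x and 55x is an assumption
based on the "maybe try the next MX" remark above.

```python
from enum import Enum


class Action(Enum):
    DELIVERED = "mail accepted"
    NEXT_MX = "try the next MX now"
    DEFER = "queue and retry later"
    BOUNCE = "permanent failure"


def next_action(reply_code):
    """Map the final outcome of one SMTP session to an action.

    reply_code is None for a connection failure, otherwise the
    three-digit SMTP reply that ended the transaction.
    """
    if reply_code is None:
        return Action.NEXT_MX          # connection failure
    if 200 <= reply_code < 300:
        return Action.DELIVERED
    if 420 <= reply_code < 430:
        return Action.NEXT_MX          # 42x: operational problem at this host
    if 450 <= reply_code < 460:
        return Action.DEFER            # 45x: try again LATER
    if 550 <= reply_code < 560:
        return Action.BOUNCE           # 55x: rejected
    # Assumption: any other failure moves on to the next MX.
    return Action.NEXT_MX


def deliver(mx_hosts, try_host):
    """Walk the MX list in preference order.

    try_host(host) is a hypothetical callable returning a reply code or None.
    Returns (host, action); host is None when the list is exhausted.
    """
    for host in mx_hosts:
        action = next_action(try_host(host))
        if action is not Action.NEXT_MX:
            return host, action
    # Every MX failed operationally: queue the message for a later attempt.
    return None, Action.DEFER
```

With mail1.example.com answering 421 and mail2.example.com answering 250,
deliver() moves past mail1 and reports delivery at mail2. With a 450 from
mail1, it stops there and defers without contacting mail2.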

--
Hector Santos, Santronics Software, Inc.
http://www.santronics.com