Peter J. Holzer wrote:
Half an hour is ok if the reason for the temporary failure is a problem
which requires human intervention. It is already questionable if the
reason is that the server is too busy (in my experience load spikes usually
last only a few minutes, but YMMV).
My data shows that too. The spikes come in random waves, with no pattern
that I can measure by eye.
It is definitely too long for greylisting:
In contrast, consider the case that Hector's server (which uses an
initial blocking time of 55 seconds, as he wrote) can tell the client
"try again in 56 seconds" and that the client can do this. Then the
greylisted mail will get through after 56 seconds (+ probably a bit of
jitter added by the queue runner), and Hector's users will be happy
because they get the information they requested on the phone while they
are still talking instead of half an hour later. It also allows mail
admins to keep the default retry interval at a relatively high value and
avoid pounding servers which are already overloaded.
Right. And yet, even with the 55-second delay, we don't know what the
second retry time of the customer's MTA will be. Are they using the SMTP
recommendations, or did they adjust to the greylisting market and set a
1-5 minute interval for at least the second retry? We don't know.
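To make the client side of this concrete, here is a minimal sketch of how a
queue runner might pick its next retry time when a hint is available. It is
only an illustration: no standard mechanism for the hint is assumed, and the
names next_retry_delay, DEFAULT_RETRY and MAX_JITTER are hypothetical. The
cap at the default interval is my own assumption, so that a bogus hint can
never delay mail longer than it would be delayed today.

```python
import random

DEFAULT_RETRY = 30 * 60  # the half-hour default retry interval discussed above
MAX_JITTER = 10          # assumed upper bound for queue-runner jitter, in seconds


def next_retry_delay(hint_seconds=None, default=DEFAULT_RETRY,
                     max_jitter=MAX_JITTER, rng=random):
    """Return the number of seconds to wait before the next delivery attempt.

    hint_seconds is the (hypothetical) delay suggested by the server,
    or None if the server gave no hint.
    """
    if hint_seconds is None or hint_seconds < 0:
        # No usable hint: fall back to the locally configured interval.
        base = default
    else:
        # Honour the hint, but never wait longer than the default.
        base = min(hint_seconds, default)
    # Add a little jitter so queued messages do not all retry at once.
    return base + rng.uniform(0, max_jitter)
```

With a hint of 56 seconds the result falls between 56 and 66 seconds. Without
a hint the client keeps its conservative default, which is the property that
lets admins leave the default high.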
I also see some applications outside of greylisting, too:
For example, I occasionally have to return a 4xx error to some
recipients of a multi-recipient message, because they couldn't be
processed together in the same transaction. In this case it would be
nice if I could tell the client that it is ok to immediately retry for
Another reason for unexpected mail delays is if you try to open more
connections than the receiving MX allows. The mails on some connections
get rejected with a temporary error and go back into the queue (possibly
several times). They could almost certainly be sent immediately over one
of the existing connections, but the client doesn't know that. If the
server could convey that information to the client, a wasteful delay
could be avoided.
I found it to be rare, but I have logs showing remote servers issuing
a 421 with greylisting information in the text, including informal
time hints, as opposed to just saying "too many connections."
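As a rough sketch of what a client could do with such informal hints today,
the function below pulls a delay out of the free text of a 4xx reply. The
wording it looks for ("in N seconds", "after N minutes", and so on) is an
assumption on my part, not a standard, and parse_retry_hint is a hypothetical
helper. Real reply text varies, so anything it cannot parse is treated as
"no hint".

```python
import re

# Assumed phrasing: "in 56 seconds", "after 5 min", "in 1 hour", ...
_HINT_RE = re.compile(
    r"\b(?:in|after)\s+(\d+)\s*"
    r"(seconds?|secs?|s|minutes?|mins?|m|hours?|hrs?|h)\b",
    re.IGNORECASE,
)

# Seconds per unit, keyed by the first letter of the matched unit.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_retry_hint(reply_text):
    """Return the hinted delay in seconds, or None if no hint is found."""
    match = _HINT_RE.search(reply_text)
    if match is None:
        return None
    amount = int(match.group(1))
    unit = match.group(2)[0].lower()
    return amount * _UNIT_SECONDS[unit]
```

For example, a reply such as "421 Greylisted, please try again in 56 seconds"
would yield 56, while "421 too many connections" would yield None and leave
the client on its normal retry schedule.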