Tim Kehres wrote:
Progressive backoff strategies
were put in place precisely for this reason - but some care needs to be
taken to choose reasonable timing values.
Greylisting sites that configure timeout windows that greatly exceed the
norm for remote MTA retry times (hours or more for instance) are also
improperly configured and will not play well with MTA's with reasonable
backoff times.
A BCP that describes various queuing strategies in terms of retry times,
and the relationship between MTA timing and Greylist delay windows, might
be a good thing. It's really not all that difficult; however,
sometimes putting these things into writing can help new administrators.
The BCP can then formalize what the community considers "reasonable"
values for MTA's and GL running servers.
Anything else I'm really at a loss to understand what we are trying to
fix. Greylisting for the most part works and other than the queue timing
issues above seems largely to be a non-issue.
Best Regards,
This group seems to make things more complicated than necessary.
You're right. It isn't that difficult. I called for a BCP that would
provide the guidelines we would expect to be reasonable, with a goal
of alleviating the growing overhead. You described your points as poor
choices on the client and server side. A BCP should help, especially,
as you noted, with new GL implementers.
Let's note that there is redundant overhead when this becomes a
"guessing" game and the Progressive Backoff strategies don't always
apply. The higher volume systems will see this. A large mail system
doesn't need to support GL itself, but it still feels it in its
outbound mail.
It might just come down to: what is a "reasonable initial delay"?
1 minute?
5 minutes?
10 minutes?
20 minutes?
etc.
There are many GL servers that provide a "retry time hint" in some
form, hence we should consider whether this can help clients reduce
their wasted attempts and also help server availability.
Suppose the MTA's default retry schedule is:
  1st  immediate
  2nd  1 minute
  3rd  4 mins
  4th  15 mins
  ...
and the server issues this response on the first attempt:
  451 4.7.1 Greylisting in action, please come back in 00:10:00
then the MTA is not going to get through until the 4th attempt, 20
minutes later (1 + 4 + 15, each delay counted from the previous attempt).
Sure, one may suggest that this is not unreasonable, and one may suggest
that the 1 minute 2nd attempt is too short, but it is overhead, and in
volume it is much greater.
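To put numbers on the example above, here is a small sketch that walks a retry schedule against a greylist window and reports which attempt is first accepted. It assumes each delay is measured from the previous attempt and that the server accepts any attempt made once the window has elapsed; the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

// Result of walking a retry schedule against a greylist window.
typedef struct {
    int attempt;     // 1-based number of the first accepted attempt, 0 if none
    int elapsedSecs; // seconds since the first attempt when it was accepted
} RetryResult;

// delays[i] is the wait in seconds before attempt i+1, measured from the
// previous attempt; delays[0] is 0 for the immediate first attempt.
// Assumption: the server accepts any attempt made once windowSecs have
// passed since the first (greylisted) attempt.
RetryResult FirstAcceptedAttempt(const int *delays, size_t n, int windowSecs)
{
    RetryResult r = {0, 0};
    int elapsed = 0;
    assert(delays != NULL);
    for (size_t i = 0; i < n; i++) {
        elapsed += delays[i];
        // the very first attempt is always the one that gets greylisted
        if (i > 0 && elapsed >= windowSecs) {
            r.attempt = (int)(i + 1);
            r.elapsedSecs = elapsed;
            return r;
        }
    }
    return r; // schedule ran out before the window elapsed
}
```

With delays of {0, 60, 240, 900} and a 600 second window, this reports attempt 4 at 1200 seconds, so attempts 2 and 3 are the wasted ones.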
So yesterday, I put logic into our MTA to extract the retry time hints
from the various forms of GL responses that carry them, and so far it
seems to work. I hit the same remote that gave the above response; the
MTA extracted the 10 minutes and rescheduled the 2nd attempt for 10
minutes later. I waited the 10 mins (plus a few seconds to start the
engine) and poof! The 2nd attempt was accepted.
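The rescheduling step could look roughly like the sketch below. This is not the code in our MTA; NextRetryDelay, the slack and the cap are assumptions for illustration. The idea is to use the server's hint when one was extracted, add a few seconds of slack, and fall back to the normal backoff delay when there is no hint or the hint is longer than a cap, so a badly configured server cannot push the retry out by hours.

```c
#include <assert.h>

// Hypothetical tuning values, not taken from any real MTA.
#define HINT_SLACK_SECS 5         // small margin so we do not arrive early
#define HINT_MAX_SECS   (60 * 60) // ignore hints longer than one hour

// hintSecs:    retry hint in seconds, 0 when the response carried none
// defaultSecs: delay the normal backoff schedule would have used
int NextRetryDelay(int hintSecs, int defaultSecs)
{
    assert(defaultSecs >= 0);
    if (hintSecs <= 0 || hintSecs > HINT_MAX_SECS) {
        return defaultSecs; // no usable hint: keep the normal schedule
    }
    return hintSecs + HINT_SLACK_SECS;
}
```

For the 00:10:00 response above, NextRetryDelay(600, 60) gives 605 seconds instead of the default 60.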
I'm using this C code (with test and expected output) to parse the 4yz
response:
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

//-----------------------------------------------------------
// int GetRetrySecs(const char *psz)
//
// return # of seconds from formats
//
//    HH:MM:SS
//    # seconds
//    # minutes
//    # hours
//
//-----------------------------------------------------------

int GetRetrySecs(const char *psz)
{
    char *pOut = NULL;
    int secs = (int)strtol(psz, &pOut, 10);
    if (*pOut == ':') {
        // HH:MM with optional :SS
        int mins = (int)strtol(pOut+1, &pOut, 10);
        secs = secs*3600 + mins*60;
        if (*pOut == ':') secs += atoi(pOut+1);
        return secs;
    }
    if (secs) {
        // optional unit after the number; seconds are the default
        if (*pOut == ' ') pOut++;
        if (*pOut == 'm' || *pOut == 'M') secs = secs*60;
        if (*pOut == 'h' || *pOut == 'H') secs = secs*3600;
    }
    return secs;
}

//-----------------------------------------------------------
// char *stristr(const char *string1, const char *string2)
//
// Case-insensitive strstr(). Returns a pointer into string1,
// or NULL if string2 is not found.
//-----------------------------------------------------------

char *stristr(const char *string1, const char *string2)
{
    if (!string1 || !string2) {
        return NULL;
    }
    size_t n = strlen(string2);
    if (n == 0) return (char *)string1;
    for (; *string1; string1++) {
        size_t i = 0;
        while (i < n && toupper((unsigned char)string1[i]) ==
                        toupper((unsigned char)string2[i])) {
            i++;
        }
        if (i == n) return (char *)string1;
    }
    return NULL;
}

//-----------------------------------------------------------
// int ExtractRetryHint(const char *psz)
//
// For given SMTP 4yz response, return any known retry time
// hint in seconds. Return 0 for no time hint or non 4yz
// response.
//-----------------------------------------------------------

int ExtractRetryHint(const char *psz)
{
    if (!psz || !psz[0] || ((atoi(psz)/100*100) != 400)) {
        return 0;
    }
    const char *tags[] = {
        "retry=",
        "Greylisted for ",
        "try again in ",
        "please come back in ",
        NULL};
    const char **p = tags;
    while (*p) {
        char *pos = stristr(psz, *p);
        if (pos) return GetRetrySecs(pos+strlen(*p));
        p++;
    }
    return 0;
}

void test(const char *sz)
{
    int secs = ExtractRetryHint(sz);
    printf("secs: %5d | %s\n", secs, sz);
}

int main(void)
{
    test("421 This server implements greylisting, please try again in 90 seconds");
    test("450 4.7.1 <RCPT>: Recipient address rejected: Greylisted for 5 minutes");
    test("450 4.7.1 <RCPT>: Recipient address rejected: Greylisted for 120 seconds");
    test("451 4.7.1 Greylisting in action, please come back in 00:30:00");
    test("451 4.7.1 Greylisting in action, PLEASE COME BACK in 01:05:20");
    test("451 Greylisted for 55 seconds");
    test("451 Greylisted, please try again in 167 seconds");
    test("451 Greylisting enabled, try again in 22 minutes");
    test("451 Greylisting. Try again later.");
    test("451 Greylisting. Try again later. retry=5 mins");
    test("471 Greylisting. Try again later. retry=300 secs");
    test("471 Greylisting. Try again later. retry=2 hours");
    test("471 Greylisting. Try again later. retry=2h");
    test("471 Greylisting. Try again later. retry=120");
    test("471 Greylisting. Try again later. retry=01:01:02");
    test("471 retry=01:01:02 Greylisting. Try again later.");
    test("471 Greylist: RETRY=1M Try again later.");
    test("550 Permament Reject");
    return 0;
}
Expected output:

secs:    90 | 421 This server implements greylisting, please try again in 90 seconds
secs:   300 | 450 4.7.1 <RCPT>: Recipient address rejected: Greylisted for 5 minutes
secs:   120 | 450 4.7.1 <RCPT>: Recipient address rejected: Greylisted for 120 seconds
secs:  1800 | 451 4.7.1 Greylisting in action, please come back in 00:30:00
secs:  3920 | 451 4.7.1 Greylisting in action, PLEASE COME BACK in 01:05:20
secs:    55 | 451 Greylisted for 55 seconds
secs:   167 | 451 Greylisted, please try again in 167 seconds
secs:  1320 | 451 Greylisting enabled, try again in 22 minutes
secs:     0 | 451 Greylisting. Try again later.
secs:   300 | 451 Greylisting. Try again later. retry=5 mins
secs:   300 | 471 Greylisting. Try again later. retry=300 secs
secs:  7200 | 471 Greylisting. Try again later. retry=2 hours
secs:  7200 | 471 Greylisting. Try again later. retry=2h
secs:   120 | 471 Greylisting. Try again later. retry=120
secs:  3662 | 471 Greylisting. Try again later. retry=01:01:02
secs:  3662 | 471 retry=01:01:02 Greylisting. Try again later.
secs:    60 | 471 Greylist: RETRY=1M Try again later.
secs:     0 | 550 Permament Reject

I plan to do at least 1 week of internal testing before I select some
of the larger volume field testers that are getting hit harder with GL
servers.
There are many responses with hints, such as:
S: 421 This server implements greylisting, please try again in 300 seconds
S: 451 Greylisted, please try again in 600 seconds
S: 451 Greylisted, please try again in 511 seconds
S: 451 Greylisted, please try again in 304 seconds
S: 451 Greylisted, please try again in 900 seconds
S: 451 Greylisted, please try again in 700 seconds
S: 451 Greylisted, please try again in 600 seconds
S: 451 Greylisted, please try again in 900 seconds
S: 451 Greylisted, please try again in 491 seconds
S: 451 Greylisted, please try again in 511 seconds
S: 451 Greylisted, please try again in 873 seconds
S: 451 Greylisted, please try again in 298 seconds
S: 451 Greylisted, please try again in 628 seconds
S: 451 Greylisting enabled, try again in 1 minutes
S: 451 Greylisting enabled, try again in 5 minutes
S: 450 4.7.1 <xxxxxxx(_at_)xxx(_dot_)com>: Recipient address rejected: Greylisted for 5 minutes
S: 451 Greylisting enabled, try again in 5 minutes
S: 451 Greylisting enabled, try again in 2 minutes
Many have hints between 1 and 5 minutes, so one can say that guesses
of 1 min and 4 mins are not unreasonable, and these will be best for
"faster," timely delivery. A smaller system can possibly afford some
wasted attempts. I think it becomes more overhead with larger volume.
Overall, I prefer not to guess and waste attempts if the GL server can
assist with a more "appropriate" blocking time, and if I can add the GL
MTA feature for our customers to enable, then I believe they will benefit.
Offhand, 5 minutes seems to be common, but as you can see, there are
those with 15 minute initial delays.
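To check that impression against the sample, the hints listed above can be converted to seconds and counted. The array below is just those 18 values transcribed by hand, and the helper name is made up; treat it as a quick sketch, not a survey.

```c
#include <assert.h>
#include <stddef.h>

// Hints from the sample responses above, converted to seconds.
static const int kHints[] = {
    300, 600, 511, 304, 900, 700, 600, 900, 491,
    511, 873, 298, 628,  60, 300, 300, 300, 120
};
static const size_t kHintCount = sizeof kHints / sizeof kHints[0];

// Count how many hints are at or below limitSecs.
size_t CountHintsUpTo(const int *hints, size_t n, int limitSecs)
{
    size_t count = 0;
    assert(hints != NULL);
    for (size_t i = 0; i < n; i++) {
        if (hints[i] <= limitSecs) count++;
    }
    return count;
}
```

CountHintsUpTo(kHints, kHintCount, 300) comes to 7 of the 18, and all 18 are at or below 900 seconds. With the 1/4/15 minute schedule, where the 3rd attempt lands at 5 minutes, the other 11 would not get through until the 4th attempt at 20 minutes.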