ietf-smtp

Re: We need an IETF BCP for GREY LISTING

2011-10-18 09:02:49

Tim Kehres wrote:

Progressive backoff strategies were put in place precisely for this reason - but some care needs to be taken to choose reasonable timing values.

Greylisting sites that configure timeout windows that greatly exceed the norm for remote MTA retry times (hours or more for instance) are also improperly configured and will not play well with MTA's with reasonable backoff times.

A BCP that describes various queuing strategies in terms of retry times, and the relationship between MTA timing and Greylist delay windows might be a good thing for a BCP. It's really not all that difficult, however sometimes putting these things into writing can help new administrators. The BCP can then formalize what the community considers "reasonable" values for MTA's and GL running servers.

Anything else I'm really at a loss to understand what we are trying to fix. Greylisting for the most part works and other than the queue timing issues above seems largely to be a non-issue.

Best Regards,

This group seems to make things more complicated than necessary. You're right. It isn't that difficult. I called for a BCP that would provide the guidelines we would expect to be reasonable, with the goal of alleviating the growing overhead. You described your points as poor choices on the client and server side. A BCP should help, especially, as you noted, with new GL implementers.

Let's note that there is redundant overhead when it becomes a "guessing" game and the progressive backoff strategies don't always apply. The higher volume systems will see this. A large mail system doesn't need to support GL itself, but it still feels it in its outbound mail.

It might just come down to: what is a "reasonable initial delay"?

    1 minute?
    5 minutes?
   10 minutes?
   20 minutes?

etc.

Many GL servers provide a "retry time hint" in some form, so we should consider whether this can help clients reduce their wasted attempts and also help server availability.

Suppose the MTA's default retry schedule is:

   1st immediate
   2nd 1 min
   3rd 4 mins
   4th 15 mins
   ...

and the server issues a response on the first attempt:

   451 4.7.1 Greylisting in action, please come back in 00:10:00

then the MTA is not going to get through until the 4th attempt, 20 minutes after the first (1 + 4 + 15 minutes).
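
The arithmetic can be checked with a small sketch. The function name FirstAcceptedAttempt is hypothetical, and it assumes the listed delays are gaps between consecutive attempts (so attempts fall at 0, 1, 5 and 20 minutes) and that the server accepts any attempt made at or after the hinted delay.

```c
#include <stddef.h>

// Illustrative sketch only, not taken from any MTA.
// gaps:  seconds between consecutive attempts (attempt 1 is at time 0)
// delay: greylist delay in seconds, as hinted by the server
// Returns the 1-based number of the first attempt made at or after the
// delay has elapsed, or 0 if the schedule runs out first. If elapsed_out
// is not NULL, it receives the time (in seconds) of that attempt.
int FirstAcceptedAttempt(const int *gaps, size_t ngaps, int delay,
                         int *elapsed_out)
{
    int elapsed = 0;
    if (delay <= 0) {                 // no delay: first attempt succeeds
        if (elapsed_out) *elapsed_out = 0;
        return 1;
    }
    for (size_t i = 0; i < ngaps; i++) {
        elapsed += gaps[i];           // time of attempt number i + 2
        if (elapsed >= delay) {
            if (elapsed_out) *elapsed_out = elapsed;
            return (int)i + 2;
        }
    }
    return 0;                         // never reached the delay
}
```

With gaps of {60, 240, 900} seconds and a 600 second delay, this returns attempt 4 at 1200 seconds, so attempts 2 and 3 are wasted.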

Sure, one may suggest that this is not unreasonable, and one may suggest that the 1 min 2nd attempt is too short, but it is overhead, and at volume it is much greater.

So yesterday I put logic into our MTA to extract the retry time hints from the various forms of GL responses that carry them, and so far it seems to work. I hit the same remote that provided the above response; the MTA extracted the 10 minutes and rescheduled the 2nd attempt for 10 minutes later. I waited the 10 mins (plus a few seconds to start the engine) and poof! The 2nd attempt was accepted.
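
The rescheduling decision itself could look roughly like the sketch below. It is not the MTA's actual logic: the name NextRetryDelay, the cap on hints that will be honored, and the few seconds of slack are all assumptions for illustration.

```c
// Hypothetical helper: choose the delay (in seconds) before the next attempt.
// hint_secs:     hint parsed from the 4yz reply, 0 if none was found
// default_secs:  delay the normal backoff schedule would have used
// max_hint_secs: largest hint to honor (assumed policy, not from any spec)
// slack_secs:    small margin added so the retry does not arrive early
int NextRetryDelay(int hint_secs, int default_secs,
                   int max_hint_secs, int slack_secs)
{
    // No hint, or a hint beyond the cap: stay on the default schedule.
    if (hint_secs <= 0 || hint_secs > max_hint_secs) {
        return default_secs;
    }
    return hint_secs + slack_secs;
}
```

For the 10 minute hint above, NextRetryDelay(600, 60, 3600, 5) gives 605 seconds instead of the default 60.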

I'm using this C code (with test and expected output) to parse the 4yz response:

#include <stdio.h>
#include <stdlib.h>   // strtol, atoi, free
#include <string.h>   // strlen, strstr

//-----------------------------------------------------------
// int GetRetrySecs(const char *psz)
//
// return # of seconds from formats
//
//  HH:MM:SS
//  # seconds
//  # minutes
//  # hours
//
//-----------------------------------------------------------

int GetRetrySecs(const char *psz)
{
    char *pOut = NULL;
    int secs = strtol(psz,&pOut,10);
    if (*pOut == ':') {
       secs = secs*3600+strtol(pOut+1,&pOut,10)*60;
       if (*pOut == ':') secs += atoi(pOut+1);
       return secs;
    }
    if (secs) {
       if (*pOut == ' ') pOut++;
       if (*pOut == 'm' || *pOut == 'M') secs=secs*60;
       if (*pOut == 'h' || *pOut == 'H') secs=secs*3600;
    }
    return secs;
}

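//-----------------------------------------------------------
// char *stristr(const char *string1, const char *string2)
//
// Case-insensitive strstr() (ASCII only). Return a pointer
// into string1, or NULL if string2 is not found.
//-----------------------------------------------------------

char *stristr(const char *string1, const char *string2)
{
    if (!string1 || !string2) {
        return NULL;
    }
    if (!*string2) return (char *)string1;
    for (; *string1; string1++) {
        const char *a = string1, *b = string2;
        while (*b) {
            char ca = *a, cb = *b;
            if (ca >= 'a' && ca <= 'z') ca -= 'a' - 'A';
            if (cb >= 'a' && cb <= 'z') cb -= 'a' - 'A';
            if (ca != cb) break;
            a++; b++;
        }
        if (!*b) return (char *)string1;
    }
    return NULL;
}

//-----------------------------------------------------------
// int ExtractRetryHint(const char *psz)
//
// For given SMTP 4yz response, return any known retry time
// hint in seconds. Return 0 for no time hint or non 4yz
// response.
//-----------------------------------------------------------

int ExtractRetryHint(const char *psz)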
{
    if (!psz || !psz[0] || ((atoi(psz)/100*100) != 400)) {
        return 0;
    }
    const char *tags[] = {
        "retry=",
        "Greylisted for ",
        "try again in ",
        "please come back in ",
        NULL};
    const char **p = tags;
    while (*p) {
      char *pos = stristr(psz,*p);
      if (pos) return GetRetrySecs(pos+strlen(*p));
      p++;
    }
    return 0;
}

void test(const char *sz)
{
    int secs = ExtractRetryHint(sz);
    printf("secs: %5d | %s\n",secs, sz);
}

int main(void)
{
    test("421 This server implements greylisting, please try again in 90 seconds");
    test("450 4.7.1 <RCPT>: Recipient address rejected: Greylisted for 5 minutes");
    test("450 4.7.1 <RCPT>: Recipient address rejected: Greylisted for 120 seconds");
    test("451 4.7.1 Greylisting in action, please come back in 00:30:00");
    test("451 4.7.1 Greylisting in action, PLEASE COME BACK in 01:05:20");
    test("451 Greylisted for 55 seconds");
    test("451 Greylisted, please try again in 167 seconds");
    test("451 Greylisting enabled, try again in 22 minutes");
    test("451 Greylisting. Try again later.");
    test("451 Greylisting. Try again later. retry=5 mins");
    test("471 Greylisting. Try again later. retry=300 secs");
    test("471 Greylisting. Try again later. retry=2 hours");
    test("471 Greylisting. Try again later. retry=2h");
    test("471 Greylisting. Try again later. retry=120");
    test("471 Greylisting. Try again later. retry=01:01:02");
    test("471 retry=01:01:02 Greylisting. Try again later.");
    test("471 Greylist: RETRY=1M Try again later.");
    test("550 Permament Reject");
}

Expected output:

secs:    90 | 421 This server implements greylisting, please try again in 90 seconds
secs:   300 | 450 4.7.1 <RCPT>: Recipient address rejected: Greylisted for 5 minutes
secs:   120 | 450 4.7.1 <RCPT>: Recipient address rejected: Greylisted for 120 seconds
secs:  1800 | 451 4.7.1 Greylisting in action, please come back in 00:30:00
secs:  3920 | 451 4.7.1 Greylisting in action, PLEASE COME BACK in 01:05:20
secs:    55 | 451 Greylisted for 55 seconds
secs:   167 | 451 Greylisted, please try again in 167 seconds
secs:  1320 | 451 Greylisting enabled, try again in 22 minutes
secs:     0 | 451 Greylisting. Try again later.
secs:   300 | 451 Greylisting. Try again later. retry=5 mins
secs:   300 | 471 Greylisting. Try again later. retry=300 secs
secs:  7200 | 471 Greylisting. Try again later. retry=2 hours
secs:  7200 | 471 Greylisting. Try again later. retry=2h
secs:   120 | 471 Greylisting. Try again later. retry=120
secs:  3662 | 471 Greylisting. Try again later. retry=01:01:02
secs:  3662 | 471 retry=01:01:02 Greylisting. Try again later.
secs:    60 | 471 Greylist: RETRY=1M Try again later.
secs:     0 | 550 Permament Reject

I plan to do at least 1 week of internal testing before I select some of the larger volume field testers that are getting hit harder by GL servers.

Many responses carry hints, for example:

 S: 421 This server implements greylisting, please try again in 300 seconds
 S: 451 Greylisted, please try again in 600 seconds
 S: 451 Greylisted, please try again in 511 seconds
 S: 451 Greylisted, please try again in 304 seconds
 S: 451 Greylisted, please try again in 900 seconds
 S: 451 Greylisted, please try again in 700 seconds
 S: 451 Greylisted, please try again in 600 seconds
 S: 451 Greylisted, please try again in 900 seconds
 S: 451 Greylisted, please try again in 491 seconds
 S: 451 Greylisted, please try again in 511 seconds
 S: 451 Greylisted, please try again in 873 seconds
 S: 451 Greylisted, please try again in 298 seconds
 S: 451 Greylisted, please try again in 628 seconds
 S: 451 Greylisting enabled, try again in 1 minutes
 S: 451 Greylisting enabled, try again in 5 minutes
 S: 450 4.7.1 <xxxxxxx@xxx.com>: Recipient address rejected: Greylisted for 5 minutes
 S: 451 Greylisting enabled, try again in 5 minutes
 S: 451 Greylisting enabled, try again in 2 minutes

Many have hints between 1 and 5 minutes, so one can say that guesses of 1 min and 4 mins are not "unreasonable," and these will be best for "faster," timely delivery. A smaller system can possibly afford some wasted attempts. I think it becomes more overhead with larger volume.
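
For a rough sense of the spread, the sample hints above can be tallied. This is only a sketch: the array is transcribed by hand from the listed responses (minutes converted to seconds), and the names kObservedHints and CountHintsAtMost are made up for illustration.

```c
#include <stddef.h>

// Hint values in seconds, transcribed from the sample responses above
// ("N minutes" converted to N * 60). Illustrative data only.
static const int kObservedHints[] = {
    300, 600, 511, 304, 900, 700, 600, 900, 491,
    511, 873, 298, 628,  60, 300, 300, 300, 120
};
static const size_t kObservedCount =
    sizeof kObservedHints / sizeof kObservedHints[0];

// Count how many hints are at most limit_secs seconds.
size_t CountHintsAtMost(const int *hints, size_t n, int limit_secs)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (hints[i] <= limit_secs) count++;
    }
    return count;
}
```

On this small sample, CountHintsAtMost(kObservedHints, kObservedCount, 300) gives 7 of the 18 hints at 5 minutes or less, and the rest run up to 15 minutes.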

Overall, I prefer not to guess and waste attempts if the GL server can assist with a more "appropriate" blocking time. If I can add this as an MTA feature for our customers to enable, then I believe they will benefit.

Offhand, 5 minutes seems to be common, but as you can see, there are those with 15 minute initial delays.