Re: We need an IETF BCP for GREY LISTING


Tim Kehres wrote:

Progressive backoff strategies were put in place precisely for this reason - but some care needs to be taken to choose reasonable timing values.
Greylisting sites that configure timeout windows that greatly exceed the norm for remote MTA retry times (hours or more for instance) are also improperly configured and will not play well with MTA's with reasonable backoff times.
A BCP that describes various queuing strategies in terms of retry times, and the relationship between MTA timing and Greylist delay windows might be a good thing for a BCP. It's really not all that difficult, however sometimes putting these things into writing can help new administrators. The BCP can then formalize what the community considers "reasonable" values for MTA's and GL running servers.
Anything else I'm really at a loss to understand what we are trying to fix. Greylisting for the most part works and other than the queue timing issues above seems largely to be a non-issue.
Best Regards,

This group seems to make things more complicated than necessary. You're right. It isn't that difficult. I called for a BCP that would provide the guidelines we would expect to be reasonable, with a goal to alleviate the growing overhead. You stated your points as poor client and server choices. A BCP should help, especially, as you noted, with new GL implementers.

Let's note that there is redundant overhead when it becomes a "guessing" game and the progressive backoff strategies don't always apply. The higher volume systems will see this. A large mail system doesn't need to support GL itself, but it still feels it in its outbound mail.


It might just become what is a "reasonable initial delay?"

    1 minute?
    5 minutes?
   10 minutes?
   20 minutes?

etc.

There are many GL servers that provide a "retry time hint" in some form, hence we should consider whether this can help clients reduce their wasteful attempts and also help server availability.


Suppose the MTA's default retry schedule is (each delay counted from the previous attempt):

   1st immediate
   2nd 1 minute
   3rd 4 mins
   4th 15 mins
   ...

and the server issues a response on the first attempt:

   451 4.7.1 Greylisting in action, please come back in 00:10:00

then the MTA is not going to make it until the 4th attempt, 20 minutes later.

Sure, one may suggest that this is not unreasonable, and one may suggest that the 1 min 2nd attempt is too short, but it is overhead, and in volume it is much greater.

So yesterday I put logic into our MTA to extract the retry time hints from the various forms of GL responses that carry them, and so far it seems to work. I hit the same remote that provided the above response; the MTA extracted the 10 minutes and rescheduled the 2nd attempt 10 minutes later. I waited the 10 mins (plus a few seconds to start the engine) and poof! The 2nd attempt was accepted.

I'm using this C code (with test and expected output) to parse the 4yz response:


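#include <stdio.h>
#include <stdlib.h>   // strtol, atoi, free
#include <string.h>   // strstr, strlen; MSVC also declares _strdup, _strupr here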

//-----------------------------------------------------------
// int GetRetrySecs(const char *psz)
//
// return # of seconds from formats
//
//  HH:MM:SS
//  # seconds
//  # minutes
//  # hours
//
//-----------------------------------------------------------

int GetRetrySecs(const char *psz)
{
    char *pOut = NULL;
    int secs = strtol(psz,&pOut,10);
    if (*pOut == ':') {
       secs = secs*3600+strtol(pOut+1,&pOut,10)*60;
       if (*pOut == ':') secs += atoi(pOut+1);
       return secs;
    }
    if (secs) {
       if (*pOut == ' ') pOut++;
       if (*pOut == 'm' || *pOut == 'M') secs=secs*60;
       if (*pOut == 'h' || *pOut == 'H') secs=secs*3600;
    }
    return secs;
}

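//-----------------------------------------------------------
// char *stristr(const char *string1, const char *string2)
//
// Case-insensitive strstr(). Returns a pointer into string1,
// or NULL if there is no match. Uses the MSVC-specific
// _strdup() and _strupr().
//-----------------------------------------------------------

char *stristr(const char *string1, const char *string2)
{
    if (!string1 || !string2) {
        return NULL;
    }
    char *copy1 = _strdup(string1);
    char *copy2 = _strdup(string2);
    char *result = NULL;
    if (copy1 && copy2) {   // skip the search if either copy failed
        result = strstr(_strupr(copy1), _strupr(copy2));
        if (result) result = (char *)string1 + (result - copy1);
    }
    free(copy1);
    free(copy2);
    return result;
}

//-----------------------------------------------------------
// int ExtractRetryHint(const char *psz)
//
// For given SMTP 4yz response, return any known retry time
// hint in seconds. Return 0 for no time hint or non 4yz
// response.
//-----------------------------------------------------------
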
int ExtractRetryHint(const char *psz)
{
    if (!psz || !psz[0] || ((atoi(psz)/100*100) != 400)) {
        return 0;
    }
    const char *tags[] = {
        "retry=",
        "Greylisted for ",
        "try again in ",
        "please come back in ",
        NULL};
    const char **p = tags;
    while (*p) {
      char *pos = stristr(psz,*p);
      if (pos) return GetRetrySecs(pos+strlen(*p));
      p++;
    }
    return 0;
}

void test(const char *sz)
{
    int secs = ExtractRetryHint(sz);
    printf("secs: %5d | %s\n",secs, sz);
}

int main(void)
{
    test("421 This server implements greylisting, please try again in 90 seconds");
    test("450 4.7.1 <RCPT>: Recipient address rejected: Greylisted for 5 minutes");
    test("450 4.7.1 <RCPT>: Recipient address rejected: Greylisted for 120 seconds");
    test("451 4.7.1 Greylisting in action, please come back in 00:30:00");
    test("451 4.7.1 Greylisting in action, PLEASE COME BACK in 01:05:20");
    test("451 Greylisted for 55 seconds");
    test("451 Greylisted, please try again in 167 seconds");
    test("451 Greylisting enabled, try again in 22 minutes");
    test("451 Greylisting. Try again later.");
    test("451 Greylisting. Try again later. retry=5 mins");
    test("471 Greylisting. Try again later. retry=300 secs");
    test("471 Greylisting. Try again later. retry=2 hours");
    test("471 Greylisting. Try again later. retry=2h");
    test("471 Greylisting. Try again later. retry=120");
    test("471 Greylisting. Try again later. retry=01:01:02");
    test("471 retry=01:01:02 Greylisting. Try again later.");
    test("471 Greylist: RETRY=1M Try again later.");
    test("550 Permament Reject");
}

Expected output:

secs:    90 | 421 This server implements greylisting, please try again in 90 seconds
secs:   300 | 450 4.7.1 <RCPT>: Recipient address rejected: Greylisted for 5 minutes
secs:   120 | 450 4.7.1 <RCPT>: Recipient address rejected: Greylisted for 120 seconds
secs:  1800 | 451 4.7.1 Greylisting in action, please come back in 00:30:00
secs:  3920 | 451 4.7.1 Greylisting in action, PLEASE COME BACK in 01:05:20
secs:    55 | 451 Greylisted for 55 seconds
secs:   167 | 451 Greylisted, please try again in 167 seconds
secs:  1320 | 451 Greylisting enabled, try again in 22 minutes
secs:     0 | 451 Greylisting. Try again later.
secs:   300 | 451 Greylisting. Try again later. retry=5 mins
secs:   300 | 471 Greylisting. Try again later. retry=300 secs
secs:  7200 | 471 Greylisting. Try again later. retry=2 hours
secs:  7200 | 471 Greylisting. Try again later. retry=2h
secs:   120 | 471 Greylisting. Try again later. retry=120
secs:  3662 | 471 Greylisting. Try again later. retry=01:01:02
secs:  3662 | 471 retry=01:01:02 Greylisting. Try again later.
secs:    60 | 471 Greylist: RETRY=1M Try again later.
secs:     0 | 550 Permament Reject

I plan to do at least 1 week of internal testing before I select some of the larger volume field testers that are getting hit harder with GL servers.


There are many responses with hints, such as:

 S: 421 This server implements greylisting, please try again in 300 seconds

 S: 451 Greylisted, please try again in 600 seconds
 S: 451 Greylisted, please try again in 511 seconds
 S: 451 Greylisted, please try again in 304 seconds
 S: 451 Greylisted, please try again in 900 seconds
 S: 451 Greylisted, please try again in 700 seconds
 S: 451 Greylisted, please try again in 600 seconds
 S: 451 Greylisted, please try again in 900 seconds
 S: 451 Greylisted, please try again in 491 seconds
 S: 451 Greylisted, please try again in 511 seconds
 S: 451 Greylisted, please try again in 873 seconds
 S: 451 Greylisted, please try again in 298 seconds
 S: 451 Greylisted, please try again in 628 seconds
 S: 451 Greylisting enabled, try again in 1 minutes
 S: 451 Greylisting enabled, try again in 5 minutes

 S: 450 4.7.1 <xxxxxxx(_at_)xxx(_dot_)com>: Recipient address rejected: Greylisted for 5 minutes

 S: 451 Greylisting enabled, try again in 5 minutes
 S: 451 Greylisting enabled, try again in 2 minutes

Many have hints between 1 and 5 minutes, so one can say that a guess of 1 min and 4 mins is not "unreasonable," and these will be best for "faster" timely delivery. A smaller system can possibly afford some wasted attempts. I think it becomes more overhead with larger volume.

Overall, I prefer not to guess and waste attempts if the GL server can assist with a more "appropriate" blocking time, and if I can add the GL MTA feature for our customers to enable, then I believe they will benefit.

Offhand, 5 minutes seems to be common, but as you can see, there are those with 15 minute initial delays.