ietf-mxcomp

TECH: use fetchmail algorithm to select header address to verify

2004-09-06 12:30:33

The widely used fetchmail package includes a section of code that
determines the most likely responsible party for an e-mail message.
Its algorithm is similar to PRA but not identical.

This part of fetchmail was added around 1996 and has been in daily
use by thousands of users all over the world for the eight years
since.  The fetchmail code is under the GPL, but as far as
I know, nobody claims proprietary rights over its algorithms.  Eric
Raymond, whose name is on the 2001 copyright notice in the file,
doesn't.  Since it was published so long ago it's unlikely that any
claims are yet to surface.

It checks headers in the following order:

Return-Path:  (envelope MAIL FROM address, as noted in code below)
Resent-Sender:
Sender:
Resent-From:
From:
Reply-To:
Apparently-From:

My proposal is to use this scheme to determine the address to check,
perhaps deleting the Apparently-From: check since it's not a header
that's ever appeared in an RFC.  One can argue whether this is the
best possible set and order of headers to check, but since it's worked
in practice for so long, I see no reason to mess with it.
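
For illustration, the order above could be sketched in C roughly as
follows.  This is not fetchmail code: struct hdr_addrs and
pick_responsible() are hypothetical names, the sketch assumes the
headers have already been parsed down to one address each, it leaves
out Apparently-From: as proposed, and it applies the multi-line check
to every candidate rather than only to the fallback headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for already-parsed header addresses.
 * A NULL or empty field means the header was absent.
 * This is an illustrative type, not a fetchmail structure. */
struct hdr_addrs {
    const char *return_path;    /* envelope MAIL FROM, if recorded */
    const char *resent_sender;
    const char *sender;
    const char *resent_from;
    const char *from;
    const char *reply_to;
};

/* Return the first address present, checking headers in the order
 * listed above, or NULL if nothing usable is found. */
const char *pick_responsible(const struct hdr_addrs *h)
{
    assert(h != NULL);

    /* The order of this table is the whole algorithm. */
    const char *candidates[] = {
        h->return_path, h->resent_sender, h->sender,
        h->resent_from, h->from, h->reply_to,
    };

    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        const char *a = candidates[i];
        if (a && a[0]) {
            /* Like the fetchmail excerpt, refuse a multi-line address
             * instead of falling through to the next header. */
            return strchr(a, '\n') ? NULL : a;
        }
    }
    return NULL;
}
```

With only From: and Reply-To: present this picks the From: address;
adding a Sender: header makes it pick Sender: instead, and a recorded
Return-Path: wins over everything else.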

If there's interest in this, I'd be happy to help write it up as an
I-D.  Personally, I don't think that any sort of sender domain vs. IP
address check is particularly useful as an anti-spam or anti-forgery
measure, but since MARID seems determined to publish something along
those lines, it might as well use the best proven technology
available.

Regards,
John Levine, johnl(_at_)iecc(_dot_)com, Primary Perpetrator of
"The Internet for Dummies",
Information Superhighwayman wanna-be, http://www.johnlevine.com, Mayor
"I shook hands with Senators Dole and Inouye," said Tom, disarmingly.




--- lines 918-946 of transact.c from fetchmail 6.2.0 ---
    /*
     * If there is a Return-Path address on the message, this was
     * almost certainly the MAIL FROM address given the originating
     * sendmail.  This is the best thing to use for logging the
     * message origin (it sets up the right behavior for bounces and
     * mailing lists).  Otherwise, fall down to the next available 
     * envelope address (which is the most probable real sender).
     * *** The order is important! ***
     * This is especially useful when receiving mailing list
     * messages in multidrop mode.  if a local address doesn't
     * exist, the bounce message won't be returned blindly to the 
     * author or to the list itself but rather to the list manager
     * (ex: specified by "Sender:") which is much less annoying.  This 
     * is true for most mailing list packages.
     */
    if( !msgblk.return_path[0] ){
        char *ap = NULL;
        if (resent_sender_offs >= 0 && (ap = nxtaddr(msgblk.headers + resent_sender_offs)));
        else if (sender_offs >= 0 && (ap = nxtaddr(msgblk.headers + sender_offs)));
        else if (resent_from_offs >= 0 && (ap = nxtaddr(msgblk.headers + resent_from_offs)));
        else if (from_offs >= 0 && (ap = nxtaddr(msgblk.headers + from_offs)));
        else if (reply_to_offs >= 0 && (ap = nxtaddr(msgblk.headers + reply_to_offs)));
        else if (app_from_offs >= 0 && (ap = nxtaddr(msgblk.headers + app_from_offs)));
        /* multi-line MAIL FROM addresses confuse SMTP terribly */
        if (ap && !strchr(ap, '\n')) {
            strncpy(msgblk.return_path, ap, sizeof(msgblk.return_path));
            msgblk.return_path[sizeof(msgblk.return_path)-1] = '\0';
        }
    }