procmail

Re: Listserv recipe for bounced messages?

2003-04-09 09:41:23
At 08:48 2003-04-09 -0400, Louis Proyect did say:
> Unfortunately you can't simply reply to the bounce because it is addressed owner-mylist(_at_)mylist(_dot_)lists(_dot_)

Configure your MTA to insert envelope data into the message headers, most notably X-Envelope-From:. The MTA is actually where the sender gets tweaked to be owner-listname, because an owner-listname alias was found corresponding to listname.

> The sender's email address can be extracted from the bounced message, but it has to be distinguished from the list owner's email address that becomes part of the bounce.

Unnecessary if you utilize the X-Envelope-From.
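
To illustrate why the header makes this simpler, here is a minimal Python sketch that reads the envelope sender straight out of a stored message. It assumes the MTA has already been configured to add X-Envelope-From:; the function name and the sample message are hypothetical and not part of my filter set.

```python
from email import message_from_string
from email.utils import parseaddr


def envelope_sender(raw_message):
    """Return the bare address from X-Envelope-From:, or None if absent."""
    msg = message_from_string(raw_message)
    value = msg.get("X-Envelope-From")
    if not value:
        return None
    # parseaddr copes with both "user@host" and "<user@host>" forms.
    address = parseaddr(value)[1]
    return address or None


# Hypothetical message as it might look after the MTA added the header.
sample = (
    "X-Envelope-From: <poster@example.com>\n"
    "To: mylist@lists.example.org\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)
```

With the header present there is no need to dig through the bounce body to tell the poster's address apart from the listowner's.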

> I know that all this seems complicated, but hope that somebody else has figured it out. Some of the bounces are spam, but others are from subscribers who post in html, use attachments, exceed the limit, etc.

You're referring to submission bounces (generated by the list server), not delivery bounces.

I've written an extensive procmail front end for MAJORDOMO (note emphasis) which manages the following sort of things:

        * centralized taboo and troll filters with real logic (for instance,
                catching actual "test" messages rather than simply tripping
                on, say, "test" in the subject), because the majordomo taboo
                filters are too simple and don't allow for logic such as
                booleans and weighting.  The troll filters insert an
                explanation header into the advisory which is sent to the
                admin, so the admin can see why a message was rejected (for
                instance, the shuttle tragedy earlier this year was
                completely OT for my lists, so I composed several regexps
                to blintz that using scoring).
        * attachment rejection, with explanations (separate for AOL html
                mail, HTML mail, vc*ard, PGP, and generic file attachments)
        * oversize message rejection
        * overquoting (with lots of logic and control)
        * digest resubmission checking (at a minimum, checking the subject
                for digest subject tokens - hosers who reply to digests and
                don't set the subject appropriately are a PITA).
        * support for stripping freemail advertising footers
        * support for condensing large runs of blank lines
        * stripping of "receipt requested" (various formats), and explanation
                to the user who sent the message.
        * filtering for executable attachments (body is discarded, headers
                only are forwarded to the admin, which spares them from
                virus submissions - this is BEFORE the generic attachment
                checks - no need to send a message to the sender when it is
                unlikely that a human actually sent it, and the From:/Sender
                are forged on the current crop of viruses)
        * loop detection (raises alarm to listadmin, doesn't permit message
                to be resubmitted).  You know - those idiots running some
                cheezeball NT mail package that looks at the To: (which is
                almost invariably the list), and then resubmits the message
                to that address, with all of the original headers.  Once I
                put this filter into place on the lists I'm involved with
                administering, the occasional loop condition was totally
                handled - no more phantom resubmissions.
        * backups (messages submitted to the lists are backed up in an
                archive file, including a header with the actual submission
                line, so that messages can be reprocessed through a script
                later if necessary).
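
The "booleans and weighting" idea behind the taboo/troll filters can be sketched roughly as below. This is Python rather than procmail's own scoring syntax, and the rules, weights and threshold are made-up examples, not the ones in my filters. The point is that no single match rejects a message; several weak signals have to add up, and the list of reasons is what would go into the explanation header.

```python
import re

# Hypothetical weighted rules: (pattern, field it applies to, weight).
RULES = [
    (re.compile(r"^\s*test\s*$", re.I), "subject", 60),
    (re.compile(r"^\s*(this is (just |only )?a )?test\W*$", re.I), "body", 60),
    # Negative weight: on-topic use of the word "test" counts against rejection.
    (re.compile(r"\bunit test|\btest (suite|case)\b", re.I), "subject", -80),
]
THRESHOLD = 100  # assumed cut-off; a message is rejected at or above this


def score_message(subject, body):
    """Return (total score, list of reasons) for one message."""
    total = 0
    reasons = []
    fields = {"subject": subject, "body": body.strip()}
    for pattern, field, weight in RULES:
        if pattern.search(fields[field]):
            total += weight
            reasons.append(f"{field}:{pattern.pattern} ({weight:+d})")
    return total, reasons


def is_rejected(subject, body):
    return score_message(subject, body)[0] >= THRESHOLD
```

A subject of "test" on its own scores 60 and passes; only when the body is also just a test line does the total reach the threshold.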

Other things in development:
        * crosspost rejection (both cleartext crossposts and based on
                messageid) to limit crossposting on multi-list sites (where
                the lists are interrelated, and there's the occasional nit
                who insists on posting their message to multiple lists).
        * repost rejection (based on message CRC/MD5/etc) - identify messages
                which have been posted to the list(s) in the recent past and
                refuse to repost them.  Admins can bypass this, but its chief
                reason is to deal with people who have the habit of resending
                something because they think they didn't send it the first
                time because they didn't receive a copy instantaneously.
        * subscription probing.  Majordomo doesn't send messages with unique
                envelope senders (encoded to the recipient id), so sometimes,
                users with forwarding services, or just plain crappy ISPs
                generate bounces from which the actual subscriber cannot be
                ascertained.  With this filter, I hope to be able to send
                subscription probes either to the entire subscriber list, or
                just to a select group, and _automatically_ identify
                problem addresses.
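
Since the repost rejection is still in development, take the following as a sketch of the general approach only: hash the message body, remember the hash for a while, and refuse anything that hashes the same within that window. The class name, the whitespace normalisation and the 24 hour default are all assumptions, and the admin bypass is left out.

```python
import hashlib
import time


class RepostDetector:
    """Remember body digests for a limited time and flag repeats."""

    def __init__(self, window_seconds=86400):
        self.window_seconds = window_seconds
        self._seen = {}  # digest -> time the body was first seen

    @staticmethod
    def digest(body):
        # Collapse whitespace so trivial reformatting does not defeat the check.
        normalised = " ".join(body.split())
        return hashlib.md5(normalised.encode("utf-8")).hexdigest()

    def is_repost(self, body, now=None):
        now = time.time() if now is None else now
        # Forget entries that have aged out of the window.
        self._seen = {
            d: t for d, t in self._seen.items()
            if now - t <= self.window_seconds
        }
        key = self.digest(body)
        if key in self._seen:
            return True
        self._seen[key] = now
        return False
```

A real filter would keep the digests in a file (so they survive between procmail invocations) rather than in memory.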

The above filters, running on the submission side of the list (BEFORE passing the message to majordomo), work in conjunction with a sister set of filters installed on the owner-listname aliases (in our case, right on the list host; this filter in turn forwards the results to the individual admins):

        * truncate digests (you _really_ don't want to receive all the bounced
                digest bodies on large, traffic-intensive lists)
        * other bounce cleanup (AOL "braindead" explanation block at the top
                of their bounces, some noise in majordomo/archiving stuff)
        * bounce categorization (MBFULL, MAILBLOCK, HARDBOUNCE, etc) based on
                characteristics of delivery bounces
        * non-subscriber explanations (majordomo kicks the non-sub bounce to
                the listowner).
        * virus filtering (as above).
        * non-matching subscription auth message (someone subscribed with
                one address, but the auth they reply with isn't sent from
                the same address - it's a sure thing that if they can't
                auth from the same actual address, they'll never manage to
                post to the list, so it should be addressed up front - the
                user is notified that they should subscribe from an address
                they can actually post from).
        * centralized discarding of bounces matching certain criteria, which
                I call "EVILBOUNCE".  Stuff like the dreaded "5 day bounce",
                where even if the listadmin uns*bs the user when they get
                the first bounce, they'll still receive bounces for the next
                5 days, one for each message which was delivered to that
                subscriber.  This isn't automated - it's a filter which can be
                edited to deal with certain messages.  When @HOME went TU a
                year ago, this was extremely handy, since the bounces hammered
                many admins with hundreds of messages for several days - this
                filter allowed those messages to be spotted and discarded
                without each of the admins having to do anything.
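
Bounce categorization boils down to matching characteristic phrases in the delivery bounce and tagging the message accordingly. A rough Python sketch follows; the category names are the ones listed above, but the patterns and the UNCLASSIFIED fallback are illustrative guesses, not the expressions my filters use.

```python
import re

# First matching category wins, so order the list from most to least specific.
CATEGORIES = [
    ("MBFULL", re.compile(r"mailbox (is )?full|over quota|quota exceeded", re.I)),
    ("MAILBLOCK", re.compile(r"blocked|refused|denied", re.I)),
    ("HARDBOUNCE", re.compile(r"user unknown|no such user|unknown recipient", re.I)),
]


def categorize_bounce(text):
    """Return the first category whose pattern appears in the bounce text."""
    for name, pattern in CATEGORIES:
        if pattern.search(text):
            return name
    return "UNCLASSIFIED"  # hypothetical label for anything unmatched
```

The category can then be put in the subject or a header of what gets forwarded to the admin, so they can sort on it.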

Individual filters can be composed as needed for specific lists (the LISTNAME is in a variable which the filters can match against), but the above handles most of the list administrivia. The submission side filters (before majordomo) are called "Seneschal", and could be modified to work with almost any listserv package which is invoked from a mail alias prog definition. Many of the listowner filters are also generic, though several are majordomo-specific (nonmember and subnonmatch for instance).

Each list which is managed through these filters can have different sets of options enabled. ALL of the above are handled as options - even the overquoting stuff allows the parameters to be defined. Each list is represented by a single line in a configuration text file, which the filters extract and then match config tokens within. On our servers, all the lists have at least some options (such as VIRUS) globally enabled because it's the right thing to do.
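
The one-line-per-list options lookup is simple enough to sketch in a few lines of Python. The layout assumed here (list id first, then whitespace-separated option tokens) matches the FILTER_OPTIONS string format; the "#" comment handling and the function names are my own additions for the example.

```python
def load_options(config_text):
    """Parse lines of the form 'listid OPTION1 OPTION2 ...' into a dict."""
    options = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # comment support is an assumption
            continue
        list_id, *tokens = line.split()
        options[list_id] = set(tokens)
    return options


def option_enabled(options, list_id, token):
    """True if the given option token is enabled for the given list."""
    return token in options.get(list_id, set())


sample_config = """\
# listid   options...
mylistid   VIRUS NONMEMBER BCCADVISE
otherlist  VIRUS
"""
```

Matching whole tokens (rather than substrings) is what the [ ]$FILTER_ID\> style conditions in the recipes accomplish.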

Gotchas:
* to install, you need to be an admin on the server, because it involves changing the aliases for the lists and the listowners.

* currently, the per-list options are managed within the filters themselves, rather than say, extracted from the majordomo config. That is to say, the admin charged with managing the filters needs to be the one changing the per-list options for other admins.

* (the big one) - this isn't polished up for primetime. For instance, the bounce message templates have text and URLs specific to the host and lists where the filters are being used, rather than having tokens which are replaced at runtime. Right now, I don't have the time to deal with updating them to do this, and I won't be releasing the filter set until I do. I started work on the filters less than 16 months ago, put them into regular service about 13 months ago, and have been fine-tuning them periodically since. In that time, the list filters have been active on a couple dozen discussion lists, with circa 100,000 _submitted_ messages to the discussion lists (not including the listowner handling of bounces, which is even higher than that). They have significantly reduced the workload for admins, who no longer need to explain to users why their posts never made it to the discussion list - a plain English advisory with pointers on how to correct the problem is automatically mailed to the user. I can't give an estimate on when this will be available for public consumption as a package - I need to flesh a few things out more, and make the helptext use replaceable tokens.

> I want to send out a generalized message telling the bouncers that if they are spammers to get lost and if they are subscribers to check posting rules on our website.

The MAJORDOMO bounce for nonmember submission contains the email address in the body. That is easy to parse for.

From within an rcfile running on the listowner mail, you can do the following (this is all detached code, extracted in fragments from the filter set I describe above):


:0
* $ ^TO_(owner-$LISTNAME(-digest)?|$LISTNAME(-digest)?-approval)@$\OURDOMAIN
* ^Subject:[    ]*BOUNCE
{
        # (note, this braced construct actually checks for a number of other
        # bounces directed to the listowner, but I'm only presenting the
        # code for the nonmember one).

        # need to extract the offender's address from the message contained
        # in the BODY
        :0bi
        REPLYTO=|sed -e '1d'|formail -b -rtzxTo:

        FILTER_ID="NONMEMBER"

        # nonmember submission
        # note that this DOES NOT include "Re:", though the bounce
        # notification would not have this same subject anyway...
        :0
        * $ FILTER_OPTIONS ?? [         ]$FILTER_ID\>
        * $ ^Subject:[ ]*BOUNCE $LISTNAME(-digest)?@$\OURDOMAIN:[ ]*.*Non-member submission from
        {
                BOUNCEMSG=nonmember.msg
                BOUNCESUBJ="Non-member list posting rejected"
                BODYBOUNCE=YES

                INCLUDERC=$PMDIR/bouncer.rc
        }
}
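
For anyone who finds the REPLYTO extraction in the recipe above opaque, here is roughly what that pipeline is after, sketched in Python: drop the first line of the bounce body, treat the rest as the original message, and pick a reply address from its headers. The header preference order below is a simplification and an assumption on my part, not formail's exact weighting, and the function name is hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr


def reply_address(bounce_body):
    """Return an address to reply to, taken from the embedded message."""
    # Drop the first line of the body, as sed -e '1d' does in the recipe.
    _, _, embedded = bounce_body.partition("\n")
    msg = message_from_string(embedded)
    # Assumed preference order; formail -rt applies its own weighting.
    for header in ("Reply-To", "From", "Sender"):
        value = msg.get(header)
        if value:
            address = parseaddr(value)[1]
            if address:
                return address
    return None


# Hypothetical body of a nonmember bounce sent to the listowner.
sample_body = (
    ">From poster@example.com Wed Apr  9 08:00:00 2003\n"
    "From: Some Poster <poster@example.com>\n"
    "To: mylist@lists.example.org\n"
    "Subject: hello\n"
    "\n"
    "text of the post\n"
)
```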




FILTER_OPTIONS is a string containing the options which were extracted for the given list from an options file. You could simply manually set it like so:

FILTER_OPTIONS="mylistid OPTION1 OPTION2 OPTION3 SUBNONMATCH"


LISTNAME is something extracted as a parameter to the invoked procmail in the aliases (though in _most_ cases, it can be discerned through some parsing logic). You could put the above in an rcfile which you include, then:

OURDOMAIN="your_mail_host.and_domain.tld"
LISTNAME="mylistid"
FILTER_OPTIONS="mylistid OPTION1 OPTION2 OPTION3 SUBNONMATCH"
INCLUDERC=nonmember.rc

and repeat for each of the different lists you manage. Note though that as written, the domain portion still contains reference to the specific mail host. I run these filters ON the mail host where the lists are handled, not in the account of the admin (possibly elsewhere) that manages that list.


The bouncer.rc script referenced above:

#---------------------------------------------------------------
# bouncer.rc
# 20020310/1036 SBS     Created - centralized bounce processing code.
# 20020317/1958 SBS     Strips the topmost line from the bounced message
#                       (this is a useless and misleading From_)
# 20020317/2112 SBS     Added optional "bodybounce" flag
# 20020405/1357 SBS     Added option for whether bounce is copied to admin.
# 20020406/1006 SBS     Added BOUNCENOTES header
# 20020406/1012 SBS     Use BODYBOUNCE for flag - simplifies
# 20020412/1237 SBS     fail bounces to localhost addr
# 20021024/1116 SBS     Add local archiving of bounced ORIGINAL messages.
# 20030207/1638 SBS     Changed X-Loop to _additive_ header (existing X-Loop
#                       is preserved).
#

# This script is intended to be INCLUDERC'd into the handler for different
# bounce conditions.  This simplifies the other scripts and ensures a
# consistent bounce action.

# This is NOT used by the looptest filter, which has different procedures
# for sending bounces.  Also, trashmouth doesn't use this code.

# If the reply address isn't something to avoid, then
# bounce the submission to it.  We expect the following variables:

# BOUNCEMSG = message file for bounce message (just the filename, we
#       expect to find it in $AUTOREPLY/)
# BOUNCESUBJ = subject text for the bounce message.
# BODYBOUNCE = if non-null, this causes the BODY of the message to be
#       bounced, but not the outer message (usually used for admin bounces
#       such as nonmember submissions)
# BOUNCENOTES = (optional) if non-null, the text is added to the header
#       of the bounced message as X-Seneschal-Notes:

LOG="BOUNCE: $FILTER_ID - $BOUNCESUBJ
BOUNCE: To: $REPLYTO
BOUNCE: List: $LISTNAME$NL"

# defines whether the admin should receive copies of the advisories sent
# to members.
:0
* $ FILTER_OPTIONS ?? [ ]BCCADVISE\>
{
        BCCADVISE=TRUE
}

# Make a backup of the message (so filter admin can check things that other
# listadmins may not convey completely).
:0c:
| formail -I "X-majordomo-delivery: $DELIVERY" -I "X-seneschal-bounced: $BOUNCESUBJ" >> /etc/procmailrcs/bounced.mbx

# if BODYBOUNCE isn't empty, set the option to 'b' for processing the BODY
:0
* ! BODYBOUNCE ?? ^^^^
{
        BODYBOUNCE=b
}

:0$BODYBOUNCE
* $! REPLYTO ?? $\OURDOMAIN
* $! REPLYTO ?? @(127\.0\.0\.1|localhost)\>
* $! REPLYTO ?? ^[      ]*(mailer-daemon|postmaster)@
| (sed -e '1d' | cat $AUTOREPLY/$BOUNCEMSG - | \
        formail -I "Subject: [BOUNCE ADVISORY] $BOUNCESUBJ" \
        -I "To: $REPLYTO" ${BCCADVISE:+-I "Bcc: $BOUNCER"} \
        -A "X-Loop: $LOOPALERT" \
        ${BOUNCENOTES:+-I "X-Seneschal-Notes: $BOUNCENOTES"} \
        -I "From: $BOUNCER") | $SENDMAIL -t -f $BOUNCER

LOG="BOUNCE: Sending BOUNCE FAILURE to listowner$NL"

# Execution will fall here only if preceding wasn't run
# (due to issue with reply address)
# encapsulate the entire original message (headers and all) into
# the body of a new message, with an alert subject.
:0$BODYBOUNCE
| (sed -e '1d' | cat $AUTOREPLY/$BOUNCEMSG - | \
        formail -I "Subject: [BOUNCE FAILURE] $BOUNCESUBJ: $REPLYTO" \
        -A "X-Loop: $LOOPALERT" \
        -I "To: $BOUNCER" \
        ${BOUNCENOTES:+-I "X-Seneschal-Notes: $BOUNCENOTES"} \
        -I "From: $BOUNCER") | $SENDMAIL -f mailer-daemon@$OURDOMAIN $BOUNCER

#---------------------------------------------------------------





BOUNCER is separately defined as owner-$LISTNAME@$OURDOMAIN, and LOOPALERT is an address for sending mail loop alerts to (used in some other filters). BCCADVISE is extracted from the options, and defines whether the listadmin should be Bcc'd on each of the advisories sent by the list.

The bounce subject: ADVISORY means the user has been addressed, FAILURE means the listadmin is the only person addressed on the message, for whatever reason (for instance, VIRUS doesn't reply to the apparent sender, and in my case, the advisories are not sent to other addresses within the list domain itself as a baseline guard against malicious attempts at getting the system to loop upon itself).
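
The ADVISORY versus FAILURE decision mirrors the three REPLYTO conditions in bouncer.rc. A minimal Python sketch of that decision is below; the function name is hypothetical, and treating an empty address as a FAILURE is an extra guard I've added for the example rather than something the recipe does.

```python
import re

LOCAL_RE = re.compile(r"@(127\.0\.0\.1|localhost)\b", re.I)
DAEMON_RE = re.compile(r"^\s*(mailer-daemon|postmaster)@", re.I)


def bounce_kind(replyto, ourdomain):
    """Return 'ADVISORY' if the user can be mailed, else 'FAILURE'."""
    if not replyto.strip():
        return "FAILURE"  # extra guard, not in the original recipe
    # Never send advisories to addresses within the list domain itself.
    if ourdomain.lower() in replyto.lower():
        return "FAILURE"
    # Never reply to loopback addresses or to daemon/postmaster accounts.
    if LOCAL_RE.search(replyto) or DAEMON_RE.search(replyto):
        return "FAILURE"
    return "ADVISORY"
```

Anything that comes back as FAILURE goes only to the listadmin, which is what keeps the system from mailing itself in circles.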


The text message for the nonmember bounce includes "common reasons for this error include", outlining replies to crossposts, and users who may have been unsubscribed due to recurring problems with delivery to their mail account.


Now, I've got to return to the work I was supposed to be doing this morning before a business call that I've got < 1 hour to go on. Run the above in a sandbox (see my .sig), and experiment by throwing saved listowner messages at it. I've probably omitted explanation of a few variables used here (they ARE documented in the filter set, but as I've extracted fragments...), but examination of the recipes should provide a sufficient explanation as to what they should be.

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
