Re: Please help with questions about Procmail as a filter...

1998-05-18 18:24:20
At 04:43 PM 5/18/98 -0700, George Marshall wrote:

I've carboned you on this reply because I'm not sure if you're actually a
subscriber of the Procmail list or if you posted to it from externally.  If
you are subscribed, please just say so, and I'll not cc you.

1.     SEND MAIL TO DEFAULT FILE WITH HEADER INFORMATION AS NORMAL,
      AND/OR GROUP FORWARD WITHOUT HEADERS

Well, the default is easy enough - don't filter, and it will end up in your
default mailbox intact.

As for forwarding without headers: I don't use the following recipe myself,
I just scribbled it up in response to your situation, but it should work:

:0: $TEMP/$listname$LOCKEXT
* criteria1
* criteria2
{
        # As written, this strips all the extraneous headers.  In practice,
        # apart from -ITo:, you should drop the other -I options (keep
        # -iFrom:, though), since those headers bear on the threading and
        # anti-looping of the message.
        :0c
        | $FORMAIL -rkb -iFrom: -ITo: -IX-Loop: -IReferences: \
                -IIn-Reply-To: | $SENDMAIL addresses-to-forward-to

        # deliver an unmangled copy to yourself (full headers)
        :0:
        $DEFAULT
}

2.     FILTER MAIL FROM SELECTED SENDER OR HOST *EXCEPT* WHEN THE SUBJECT
      LINE CONTAINS SELECTED CHARACTERS

Can I simply alter my .procmailrc file so that mail from *@abc.com, for
example, will go to my default mail directory as normal, UNLESS the
subject line contains offending characters, such as FREE OFFER or MONEY
MAKER, in which case the mail would be sent to another designated file?

Well, you could filter on those phrases FIRST, then filter on the from
addresses.

Otherwise, using your criteria:

:0:
* ^FROM:.*@abc\.com
* !^Subject:.*(FREE OFFER|MONEY MAKER)
abcfile

# Otherwise, if it is from abc.com and the previous recipe didn't execute
# (E), then it must have matched one of the junk phrases above.  Note that
# this is different from simply asking "does the junk text match", because
# the sender may not be the same.  OTOH, since the previous recipe already
# checked for the absence of the junk text, matching the sender alone here
# catches exactly what wasn't delivered above - the junk.
:0E:
* ^FROM:.*@abc\.com
abcjunk

(I'd still write it as catch junk generically, THEN filter for specific
addresses).
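
As a rough sketch of that ordering, reusing the phrases and domain from the
question (the folder names junkfile and abcfile are placeholders of mine, not
anything from the original setup):

```procmail
# Catch the junk phrases first, whoever the sender is.
# Procmail matches case-insensitively unless the D flag is given.
:0:
* ^Subject:.*(FREE OFFER|MONEY MAKER)
junkfile

# Anything from abc.com that gets this far did not carry a junk subject.
:0:
* ^From:.*@abc\.com
abcfile
```

Because the first recipe delivers the message, the second one never sees the
junk, so it needs no exception condition and no E flag.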


3.     AUTOMATIC FILTER TO DESIGNATED FILE *PLUS* RETURN TO SENDER'S ISP
      WITH A GENERIC MESSAGE FROM ME?

I would discourage this.  The majority of spam is forged, and what isn't
forged generally comes from a spam-friendly ISP, so your complaints either
fall on deaf ears or end up confirming your address as working.

But, yes, you can forward offending messages into a folder, and if you
wanted, use formail to generate a return message:

$FORMAIL is the path to the formail executable, $FGREP is the path to GNU
fgrep, $TWITLIST is the path to a raw line-oriented list of twits
(similarly, I have a recipe for domains and subject line keywords),
$TWITVER is a text message defining the version of this file (appears in
the logs), $MAILDIR is the path to the directory where my sorted mail files
appear. $MAILBOT is a mailbox address alias I use (it is caught or sunk
elsewhere to identify spam replies), $AUTOREPLY is the path to the
autoreply texts, and $SENDMAIL is the path to the sendmail executable.  Any
of these can be simply hardcoded in your case.  The autoreply function here
simply replies to the sender, and does not include the original text (though
it does maintain the subject line).  Search the procmail archives for a
similar filter that returns the original body, with injected comment text at
the top.  You'll have to figure out your own logic for WHO to reply to if
you're sending it to the ISP as a complaint - the variety of addresses used
for abuse reporting (and often the complete lack of such an address)
ensures that this will be a big task.


# If a SPECIFIC ADDRESS from the twitlist appears anywhere within the
# headers, minus subject, and addressees, toss it.

#* $? $FORMAIL -ISubject: -ITo: -ICc: -IResent-To: -IResent-Cc: | $FGREP -i -f $TWITLIST
:0
* $? $FORMAIL -ISubject: | $FGREP -i -f $TWITLIST
{
        LOG="SPAM: Match against twitlist$TWITVER"

        :0:
        |gzip -9fc>>$MAILDIR/twits.gz
}

# The following recipe can be used to auto-reply a twit message
# to the sender (provided that it is a valid address).
#
# To enable this, add the 'c' flag to the preceding recipe (otherwise, this
# won't execute at all).  Arguably, if the preceding recipe sends the
# message to /dev/null then you can simply concatenate the action lines of
# this recipe with the above recipe, replacing the /dev/null action, and
# eliminating the need for a 'c' on that recipe.
#
# Note that I don't have this enabled - spammers simply don't read this
# stuff, so why should I waste my bandwidth sending it (and probably getting
# a bounced-back return)?

:0 Aw
  | ( $FORMAIL -rt -I "Precedence: notification" -I "From: $MAILBOT" ;\
   cat $AUTOREPLY/twit.msg ) | $SENDMAIL -t

# ==========================================================================



4.     THE PURPOSE OF THE SECOND COLON (:) IN THE .PROCMAILRC RECIPES

What is the purpose of the second (:) ? My Internet Service Provider
suggests that my recipes should include a second colon (:) in the first
line, like this:

:0:
* ^TO.*newnet-list@.*eskimo.com
nn-list

When delivering to a folder, you want to use file locking, so that another
recipe doesn't come along and attempt to modify that file at the same time
(remember, MULTIPLE copies of procmail may be running on your behalf if you
get several messages in your inbox).  It is not necessary on program
executions or forwarding to other addresses (which is essentially a program
execution of sendmail or somesuch).
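
To make the distinction concrete, here is a small sketch of both cases side by
side. The forwarding address someone@example.com is a made-up placeholder; the
list condition is the one from the question.

```procmail
# Forward a copy (c flag) to another address.  Nothing is written to a
# local file, so no lockfile is requested.
:0c
* ^TO.*newnet-list@.*eskimo\.com
! someone@example.com

# Deliver to a folder.  The trailing colon asks for a local lockfile;
# procmail derives its name from the folder (nn-list plus $LOCKEXT,
# which defaults to ".lock").
:0:
* ^TO.*newnet-list@.*eskimo\.com
nn-list
```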

However, the same recipe in my .procmailrc file would look like this:

:0
^TO.*newnet-list@.*eskimo.com
nn-list

Should have the locking colon.

And the only recipes I use with the second (:) are like this:

:1Hw:
^Subject.*\$\$
/dev/null

Unnecessary here, since /dev/null doesn't need locking.

5.     DIRECTING MAIL THAT IS ADDRESSED TO ME GENERICALLY RATHER THAN
      SPECIFICALLY

Occasionally I receive junk e-mail that is not addressed to me 
specifically, but instead is addressed to something like 
friend@myhost.com. Currently, the way I deal with this is:

If you're only getting this stuff occasionally, you're doing well, or just
haven't been around long enough to get placed on a bunch of lists.

You're going to run into problems with mailing lists which deliver using
BCC.  The workaround is to filter for known mailing lists BEFORE doing
something broad like redirecting something that simply isn't addressed to
you.  However, the following generally should do the trick:

:0:
* ! ^TOgeorgem@eskimo\.com
$MAILDIR/junkfile

The ! says "INVERT this condition" so instead of matching "yes, this is
addressed TO George," it says "this is not addressed to George."
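
Putting the two pieces together, a sketch of the ordering might look like the
following. It is written with procmail's ^TO macro, which already includes the
header's colon; the list recipe is the one from question 4, and the folder
names are only examples.

```procmail
# Known mailing lists first: list traffic usually does not name you
# as a recipient, so it must be caught before the broad rule below.
:0:
* ^TO.*newnet-list@.*eskimo\.com
nn-list

# Whatever is left and is not addressed to you lands in the junk folder.
:0:
* ! ^TOgeorgem@eskimo\.com
$MAILDIR/junkfile
```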

SUGGESTION: when experimenting, try dumping messages into files rather than
/dev/null - if you screw some logic up, you won't end up sending them to the
ether.
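
One possible test setup along those lines, assuming $MAILDIR is already set;
the log and folder names are arbitrary choices for illustration:

```procmail
# While testing: record what procmail decides and why.
LOGFILE=$MAILDIR/procmail.log
VERBOSE=on

# Keep what the recipe catches instead of discarding it.  Once the
# folder contains only what you expected, switch the action to /dev/null
# (and drop the locking colon) if you really want the mail gone.
:0:
* ^Subject:.*\$\$
$MAILDIR/test-junk
```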

However, I would like to make a recipe that is more specific, to say that
if my actual e-mail address is not either the addressee, the CC or BCC
addressee of the e-mail, the mail is then deposited in the alternate

Got news for you, if you got a message and you weren't TO or CC, then you
are BCC.  And this would be more "general" than "specific", as specifying a
specific address (like friend@public.com) would be.


6.     THE MYSTERY OF THE LEADING *ASTERISK*

Asterisks precede ALL recipe *CONDITION* lines.  That is, the lines which
determine if the recipe *ACTION* is supposed to take place.

My ISP says this leading (*) may not be required for Procmail to
function, but that it may be required with future updated versions of

Whoever told you that is apparently not a user of procmail.
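
For reference, the general shape of a current-style recipe is sketched below;
the patterns and folder name are placeholders.

```procmail
# First line:  ":0" starts the recipe; flags may follow, and a trailing
#              colon requests a local lockfile.
# "*" lines:   the conditions, zero or more; all of them must match.
# Last line:   exactly one action (a folder, a "|" pipe, a "!" forward,
#              or a "{" nesting block).
:0:
* ^Subject:.*something
* ! ^From:.*someone
foldername
```

Without the leading asterisk, a line in that position is not read as a
condition by the ":0" form of a recipe.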


---
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

 Sean B. Straw / Professional Software Engineering
 Post Box 2395 / San Rafael, CA  94912-2395