Re: How to Not Scan attachments when filtering

2002-04-09 08:19:52
On  8 Apr, Rick Wagner wrote:
| 
| Resend No Answer last time.
| Can anyone make suggestions?
| 
| Subject: How to Not Scan attachments when filtering
| 
| I have procmail working just fine on my system. I have a text file with
| one word per line called .bodywords
| The problem is occasionally I will receive an attachment for example a
| jpeg and the filter reads the decoded jpeg file and if it happens to
| match random text in the .bodywords file it is matched and sent to my
| spam folder.
| 
| Is there a way to modify this filter to NOT read the decoded text of an
| attachment? 
| 
| SHELL=/bin/sh
| PATH=/usr/local/bin:/usr/contrib/bin:/usr/bin:/bin:/usr/sbin
| MAILDIR=$HOME/mail  
| DEFAULT=/var/mail/$LOGNAME
| LOGFILE=$MAILDIR/log.`date +%m%d`
| VERBOSE=yes
| SPAM=suspected_spam.`date +%m%d`
| LOCKFILE=lock
| COMSAT=no
| BODYWORDS=.bodywords
| 
| :0 B
| * ? (formail | fgrep -iqf $BODYWORDS)
| $SPAM   
| 
| [...]

There are at least three general ways you could go about this.

1. Add a recipe before this one to deliver messages with attachments
acceptable to you, and another to quarantine messages with attachments
unacceptable to you. That way neither type gets as far as this recipe.

2. Add condition(s) to this one so that it skips any message carrying
anything other than text (and probably html) attachments.

3. Add a size condition (e.g. "* < 32768") to prevent this recipe from
running on any messages greater than some threshold.

I don't use either of the first two myself, so I have no tested
examples of my own, but the list archives at
<http://www.xray.mpe.mpg.de/mailing-lists/procmail> are replete with
working examples of recipes to process attachments. The third is less
than perfect, since it skips every large message whether or not it has
an attachment, but it may be better than nothing until you get
something better working.
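
For what it's worth, here is an untested sketch of how the second and
third approaches might look when applied to your recipe. The list of
MIME types and the 32768 byte threshold are assumptions to adjust, not
recommendations. Note that a message skipped by either variant is not
checked against $BODYWORDS at all.

# 2. skip any message that has a non-text MIME part
#    (the type list is an assumption; HB searches headers and body)
:0 B:
* ! HB ?? ^Content-Type: *(image|audio|video|application)/
* ? fgrep -iqf $BODYWORDS
$SPAM

# 3. only scan messages smaller than 32768 bytes in total
:0 B:
* < 32768
* ? fgrep -iqf $BODYWORDS
$SPAM

Use one or the other, or put both conditions in a single recipe.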

Unsolicited advice...

I might be missing something, but piping the body to formail seems
nonsensical.  Is there some reason you're not simply doing:

:0 B:
* ? fgrep -iqf $BODYWORDS
$SPAM

I've also added a lock to this recipe with the trailing ":".
 
There are ways to extract the time stamp from the envelope without
forking date. Feel free to ignore everything beyond the next couple
lines, but at least consider eliminating the second execution of the
date program.

LOGFILE=`date +%m%d`
SPAM=suspected_spam.$LOGFILE
LOGFILE=$MAILDIR/log.$LOGFILE
# assuming your VERBOSE=yes was positioned to log this assignment
LOG="Assigning SPAM=$SPAM
"
VERBOSE=yes

Arguably, better still...

The envelope format may be system/MTA/otherwise dependent, so treat this
as a concept and not bullet-proof. (This is somewhat stripped down from
something I use to set about a half dozen time stamp related variables
in different formats.) As usual, the "[         ]" character classes
include a <space> and a <tab>.

My personal preference is for variable assignments to avoid long
wrapped conditions, especially if the variables can be re-used. Others
may think differently.

WEEKDAYS='(S(un|at)|Mon|T(ue|hu)|Wed|Fri)'
MONTHS='(J(an|u[ln])|Feb|Ma[ry]|A(pr|ug)|Sep|Oct|Nov|Dec)'
MONTH2NUM="Jan:01:Feb:02:Mar:03:Apr:04:May:05:Jun:06:Jul:07:Aug:08:\
Sep:09:Oct:10:Nov:11:Dec:12"
ENVELOPE_STAMP="$WEEKDAYS[      ]+$MONTHS[      ]+([ 0][1-9]|[12]\
[0-9]|3[01])[   ]+([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9][     ]+\
(19|20)[0-9][0-9]"

:0
* $ ^^From[     ]+.*\/$ENVELOPE_STAMP
{
  E_DATE = "$MATCH"
  :0
  * $ E_DATE ?? ^^$WEEKDAYS[    ]+\/$MONTHS
  * $ MONTH2NUM ?? $MATCH:\/..
  { xMM = "$MATCH" }
  :0
  * $ E_DATE ?? ^^$WEEKDAYS[    ]+$MONTHS[      ]+\/[0-9]+
  {
    :0
    * MATCH ?? ^^[0-9]^^
    { xDD = 0$MATCH }
    :0 E
    { xDD = $MATCH }
  }
  :0
  * xMM ?? ^^(0[1-9]|1[0-2])^^
  * xDD ?? ^^(0[1-9]|[12][0-9]|3[01])^^
  { MMDD = "$xMM$xDD" }
  xMM  xDD
}

# just in case the envelope did not parse, fall back to forking date
:0
* MMDD ?? ^^^^
{ MMDD=`date +%m%d` }

LOGFILE=$MAILDIR/log.$MMDD
VERBOSE=yes
SPAM=suspected_spam.$MMDD
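
As a sanity check, this is how I would expect the recipes above to work
through a hypothetical envelope line. The sender address is made up, and
the trace is derived by hand rather than copied from a log, so treat it
as an illustration only.

# From rick@example.com  Tue Apr  9 08:19:52 2002
#
# E_DATE  = "Tue Apr  9 08:19:52 2002"
# xMM     = "04"   ("Apr" looked up in MONTH2NUM)
# xDD     = "09"   (single digit "9" gets a leading zero)
# MMDD    = "0409"
# LOGFILE = $MAILDIR/log.0409
# SPAM    = suspected_spam.0409

If the envelope does not match ENVELOPE_STAMP, MMDD stays empty and the
"just in case" recipe falls back to forking date once.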

-- 
Reply to list please, or append "6" to "procmail" in address if you must.
Spammers' unrelenting address harvesting forces me to this...reluctantly.


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
