
Re: procmail spam identifying problem

2002-01-22 01:50:35

On Tue, Jan 22, 2002 at 12:12:17AM -0600, Jeff Lamb wrote:

My header identifying is working fine, but the body one isn't.

Variables:
FGREP=/bin/fgrep
SUBJECT=`/usr/local/bin/formail -x Subject:`
BODY='/usr/local/bin/formail -I ""'

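The quick answer is that your quotes here are wrong: BODY is assigned
with single quotes (') instead of backticks (`), so it holds the literal
command text rather than the command's output.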
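
A minimal sh sketch of the difference, assuming procmail's assignment
quoting behaves like the shell's in this respect.  show_body is a
hypothetical stand-in for the formail call, not a real tool:

```shell
#!/bin/sh
# Stand-in for the formail call: any command that writes text to stdout.
# (show_body is a hypothetical helper, not part of procmail or formail.)
show_body() { printf 'cheap pills here\n'; }

# Single quotes: the variable holds the command text itself; nothing runs.
literal='show_body'

# Backticks: the command runs and the variable holds its output.
captured=`show_body`

echo "single quotes -> $literal"
echo "backticks     -> $captured"
```

With single quotes, a later "echo $BODY | fgrep ..." only ever searches
the command string, which would explain why the body test never matches.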

REJECTS=$HOME/pass/header_rejects
BODY_REJECTS=$HOME/pass/body_rejects

where REJECTS and BODY_REJECTS point to lists of junkmail keywords

#This works fine:
:0E:
* ? (echo $SUBJECT | $FGREP -i -f $REJECTS)

#However this one NEVER matches.
:0E:
* ? (echo $BODY | $FGREP -i -f $BODY_REJECTS)

You should note that much of what you seem to want to do can be done
in ways that are less resource intensive.  For example:

 :0:
 * ^Subject:[ ]*\/.*
 * ? echo "$MATCH" | $FGREP -i -f $REJECTS
 spamfolder

will avoid launching formail merely to gather SUBJECT, and:

 :0 B:
 * ? $FGREP -i -f $BODY_REJECTS
 spamfolder

will avoid launching it to grab BODY, since a "?" condition pipes the
message to the command's standard input anyway.

(Come to think of it, I don't really know whether a B will affect what
data is piped to condition command lines.  David?  Sean?  Martin?  Is
it necessary to prepend the command line with "sed '1,/^$/d' |" to strip
off the header, or does B take care of that?)
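
For reference, here is what that sed filter does on its own, as a small
sh sketch.  The sample message and the helper names are made up for
illustration, and this does not settle whether B makes the filter
unnecessary:

```shell
#!/bin/sh
# Print a small sample message: two header lines, a blank line, two body lines.
sample_message() {
    printf 'From: someone@example.com\nSubject: hello\n\nfirst body line\nsecond body line\n'
}

# Delete from line 1 through the first empty line, leaving only the body.
body_only() { sed '1,/^$/d'; }

sample_message | body_only
```

Only the first blank line ends the deleted range, so blank lines inside
the body are kept.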

But all things considered, on a heavy mail server it'll be *much* better
if you can turn your *_rejects files into procmail recipes themselves,
and avoid launching an external program or five for every single email
that comes in to your system.  Another thing you should probably try to
do is eliminate from consideration any email which you can say with
confidence *is* legitimate.  So you won't bother setting the variables
if the email matches the signature of a mailing list you're on, etc.
I have some prototype list-matching code I've been working on which
could save you some time at http://www.it.ca/software/procmail-listid .
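
One possible way to turn a keyword file into recipes is to generate them
with a short script.  This is only a sketch under assumptions: it emits
one body recipe per keyword, reuses the "spamfolder" destination from the
examples above, and relies on procmail conditions being case-insensitive
by default (as fgrep -i is).  The function name is hypothetical.

```shell
#!/bin/sh
# rejects_to_recipes FILE: write one body-matching recipe per keyword line.
# (Hypothetical helper; the recipe layout mirrors the examples above.)
rejects_to_recipes() {
    # Skip empty lines, then backslash-escape regex metacharacters so each
    # keyword is matched literally, as fgrep would match it.
    grep -v '^$' "$1" |
    sed 's/[][\.*+?()^$|]/\\&/g' |
    while IFS= read -r keyword; do
        # The parentheses keep a keyword that starts with !, $, ?, < or >
        # from being read as a special condition prefix.
        printf ':0 B:\n* (%s)\nspamfolder\n\n' "$keyword"
    done
}
```

You could run something like
rejects_to_recipes $HOME/pass/body_rejects > $HOME/pass/body_rejects.rc
whenever the list changes, and pull the result in with an INCLUDERC line
in your .procmailrc.  The body_rejects.rc name is just an example.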

Also have a look at http://www.it.ca/software/procmail-spamtrap to see
what I'm doing for spam identification.  I have a lot of my own body
content checks in there; I'd love to compare regexps if you're willing.

p


-- 
  Paul Chvostek                                  <paul(_at_)it(_dot_)ca>
  Operations / Development / Abuse / Whatever       vox: +1 416 598-0000
  it.canada                                            http://www.it.ca/

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail