procmail

RE: Spam filter anomaly

2005-09-02 09:00:38
Louis, your web page at
http://www.columbia.edu/~lnp3/PennyStocks.htm
didn't show the original e-mail message, complete with 
headers.  It showed the human-readable form.  If you'll 
re-post the actual text of the e-mail in question, complete 
with headers, it'd be helpful.

I reset my Panix account to use IMAP, so that I will be able to retain the
spam for closer examination. I thought that I was dealing with
run-of-the-mill email, not base64 encoding.


If you go to the archives,
http://www.xray.mpe.mpg.de/mailing-lists/procmail/
and enter "base64", "reformime", or "munpack" as a search 
string you may find some ideas on how to solve your problem 
of dealing with encoded e-mails.

Thanks. The information below is very useful. I have been a programmer since
1968 and have been working in Perl for over 13 years now. I don't want to
hurt anybody's feelings, but procmail seems counter-intuitive. For simple
filtering of plain-vanilla spam and for piping to a majordomo archive that I
am responsible for, there's no problem. But when I look at some of the more
advanced recipes on this mailing list or on various websites, I have to
scratch my head. I just ordered "The Procmail Companion" by Martin McCarthy
and will try to get up to speed.


You can enhance your script that scans the body of base64 
messages to look for (and presumably score) hits on 
particular encodings.  Here's a quick/easy way to get the 
encodings of "st0cks" for example.

echo -n 'st0cks' | perl -MMIME::Base64 -0777 -ne 'print encode_base64($_)'

which prints: c3QwY2tz
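
To generate several encodings in one go, the one-liner can be wrapped
in a small shell function. This is only a sketch: b64_word is a made-up
name, and it assumes the base64 utility from GNU coreutils is installed
(the Perl one-liner above does the same job).

```shell
#!/bin/sh
# b64_word is a hypothetical helper, not part of procmail.
# Assumes the coreutils `base64` command is on the PATH.
b64_word() {
    # printf '%s' avoids a trailing newline, which would change the encoding.
    printf '%s' "$1" | base64
}

# Print each candidate spam word next to its base64 form.
for word in 'Penny-stocks' 'Penny stocks' 'st0ck' 'St0ck' 'st0cks'; do
    printf '%-14s %s\n' "$word" "$(b64_word "$word")"
done
```

Keep in mind that base64 works on groups of three bytes, so a fixed
string such as c3QwY2tz only shows up when the word happens to start on
a three-byte boundary of the decoded body. Running b64_word ' st0cks'
(one leading space) gives a different string, so a literal match of
this kind will catch only some occurrences.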

You could add something like:

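# We have a base64 encoded body.
# Look for at least 2 hits on base64 encoded spam words.
# The encodings below are for:
# Penny-stocks, Penny stocks, st0ck, St0ck respectively.
# The padded forms c3QwY2s= and U3QwY2s= would match only at the
# very end of the encoded data, so the last character is written
# as a class that also covers a following byte.
# Scoring: start at -1 and add 1 per hit, so the total is
# positive only with 2 or more hits.
#
:0 B
* ^Content-Type: text/html
* ^Content-Transfer-Encoding: base64
* -1^0
*  1^1  (UGVubnktc3RvY2tz|\
              UGVubnkgc3RvY2tz|\
              c3QwY2[s-v]|\
              U3QwY2[s-v])
/users/lnp3/mail/base64-spam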

Note that in your checks for Content-Type and
Content-Transfer-Encoding above, you allow only one space
after the field name and you expect one particular form of that
line.  There are probably other MIME-compliant ways of writing
those header fields that a mail client will honor but that
will make it past your filter.  Spammers will take advantage
of such things.

(I didn't test the example above.  Hopefully, others here 
will correct any errors they see.)
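
To illustrate the point about header variants, here is a rough shell
sketch that checks a few spellings of the same header line against a
looser pattern. is_base64_cte is a made-up name, and grep -Ei merely
stands in for procmail's own matcher; in a procmail condition the
whitespace part would be written as a bracketed space and tab followed
by a star. It only covers spacing and letter case, not every form that
MIME permits.

```shell
#!/bin/sh
# is_base64_cte is a hypothetical helper; grep stands in for procmail's matcher.
is_base64_cte() {
    # -i ignores case; [[:space:]]* allows any run of spaces or tabs
    # (including none) after the colon.
    printf '%s\n' "$1" |
        grep -Eiq '^Content-Transfer-Encoding:[[:space:]]*base64'
}

# Try a few spellings of the header, plus one that should not match.
for line in \
    'Content-Transfer-Encoding: base64' \
    'Content-Transfer-Encoding:base64' \
    'content-transfer-encoding:    BASE64' \
    'Content-Transfer-Encoding: quoted-printable'
do
    if is_base64_cte "$line"; then
        printf 'match     %s\n' "$line"
    else
        printf 'no match  %s\n' "$line"
    fi
done
```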

SpamAssassin will let you add your own tests against either
the raw message or the decoded message.  Check out
RulesDuJour as one method of enhancing your SA experience:
http://www.exit0.us/index.php?pagename=RulesDuJour
and an example of custom body rules (note these are applied
to the body after it is decoded by SA):
http://www.exit0.us/index.php?pagename=BodyRules




____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail


