At 10:52 2002-12-11 -0700, Dave Cook did say:
I use the following recipe to filter out file attachments that may carry a
virus. For some reason, this also filters out file attachments of type .msg
and .html. Does anyone know why this recipe is doing this?
Have you tried setting VERBOSE=ON and checking the procmail log? Have you
inspected the messages to ensure that there are not multiple attachments or
possibly matches WITHIN the content?
:0B
* ^[ \t]*name.*\.(vbs|exe|hta|scr|pif|js|bat|com|wma|chm)|\
^.*name.*\".*\.(exe|vbs|hta|scr|pif|js|bat|com|wma|chm)\"|\
^Content-.*\".*\.(hta|vbs|exe|scr|pif|js|bat|com|wma|chm)\"|\
^filename=.*\".*\.(hta|vbs|exe|scr|pif|js|bat|com|wma|chm)\"|\
^name=.*\".*\.(hta|vbs|exe|scr|pif|bat|mp3|com|wma|chm)\"|\
^name=.*.*\.(hta|vbs|exe|scr|pif|bat|mp3|com|wma|chm)|\
^name=*.\.(hta|vbs|exe|scr|pif|bat|mp3|wma|chm)|\
^.*name=.*\.(vbs|exe|hta|scr|pif|bat|mp3|wma|chm)|\
^filename=.*\"worms.zip\"
{
Note that if you're not doing anything else, bracing a no-condition recipe
doesn't buy you anything except for added complexity. Also, when writing
to a file, you should use a LOCKFILE (the trailing ':' on the flags line).
I'm not quite sure why you omit some extensions from some of the
conditions. I suspect the erratic use of extensions stems from having
multiple lines with a bunch of extensions listed on them, and you might not
be propagating the extensions to all of them.
I trust that you're using \t to represent a tab for providing your recipe
for review, even though procmail doesn't support that syntax - a hard tab
would actually exist in the rc file. As I indicate in my disclaimer (see
.sig), unless expressly indicated otherwise, any time you see a [ ]
in a recipe I've written, it can generally be assumed that it contains a
space and a hard tab (even if the tab is not rendered in your mail client
as such), because there's little logic to bracing a _single_ character, or
bracing multiple occurrences of the same character (since it defines a
class). If, in fact, you use the \t believing that it works as in C and
Perl, you need to break yourself of the habit.
Now, let me rewrite this, in an easier to read (and maintain) format. I'm
not going to go hog-wild about analysing it to compare each condition
against something which would actually be encountered. Re-ordering the
extensions provides us with the consistent portion of the extensions list:
^[ \t]*name.*\.(hta|vbs|exe|scr|pif|wma|chm|bat|com|js)|\
^.*name.*\".*\.(hta|vbs|exe|scr|pif|wma|chm|bat|com|js)\"|\
^Content-.*\".*\.(hta|vbs|exe|scr|pif|wma|chm|bat|com|js)\"|\
^filename=.*\".*\.(hta|vbs|exe|scr|pif|wma|chm|bat|com|js)\"|\
^name=.*\".*\.(hta|vbs|exe|scr|pif|wma|chm|bat|com|mp3)\"|\
^name=.*.*\.(hta|vbs|exe|scr|pif|wma|chm|bat|com|mp3)|\
^name=*.\.(hta|vbs|exe|scr|pif|wma|chm|bat|mp3)|\
^.*name=.*\.(hta|vbs|exe|scr|pif|wma|chm|bat|mp3)|\
^filename=.*\"worms.zip\"
With the common extensions factored into a variable, the conditions shrink
dramatically, even before any redundant conditions are removed:
EXTS="hta|vbs|exe|scr|pif|wma|chm|bat"
1 ^[ \t]*name.*\.(${EXTS}|com|js)|\
2 ^.*name.*\".*\.(${EXTS}|com|js)\"|\
3 ^Content-.*\".*\.(${EXTS}|com|js)\"|\
4 ^filename=.*\".*\.(${EXTS}|com|js)\"|\
5 ^name=.*\".*\.(${EXTS}|com|mp3)\"|\
6 ^name=.*.*\.(${EXTS}|com|mp3)|\
7 ^name=*.\.(${EXTS}|mp3)|\
8 ^.*name=.*\.(${EXTS}|mp3)|\
9 ^filename=.*\"worms.zip\"
(lines are numbered for below reference)
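
As a rough sanity check that factoring the extensions into a variable loses
nothing, here is a small Python sketch (not procmail) comparing the extension
groups of three of the original conditions against their EXTS-based
rewrites. The dictionary and function names are made up for illustration.

```python
# Common extensions, mirroring the EXTS variable in the recipe above.
EXTS = "hta|vbs|exe|scr|pif|wma|chm|bat"

# Extension groups copied from the first, fifth and seventh original conditions.
ORIGINAL_GROUPS = {
    1: "vbs|exe|hta|scr|pif|js|bat|com|wma|chm",
    5: "hta|vbs|exe|scr|pif|bat|mp3|com|wma|chm",
    7: "hta|vbs|exe|scr|pif|bat|mp3|wma|chm",
}

# The same groups as written with the variable (numbered lines 1, 5 and 7).
FACTORED_GROUPS = {
    1: EXTS + "|com|js",
    5: EXTS + "|com|mp3",
    7: EXTS + "|mp3",
}


def same_extensions(original, factored):
    """True when both alternations list the same set of extensions."""
    return set(original.split("|")) == set(factored.split("|"))


for line_no in ORIGINAL_GROUPS:
    print(line_no, same_extensions(ORIGINAL_GROUPS[line_no], FACTORED_GROUPS[line_no]))
```

Only the order of the alternatives differs, which does not change what an
alternation can match.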
On the sixth line, I don't understand why you have a double ".*". Surely
this is a typo? Also, on the seventh line you have "=*.\.": zero or more
equals signs, any character, then a dot? Again, this appears to be a typo
(the . and * reversed). Correcting these two lines makes the two expressions
overlap: the first encompasses everything the second would match, making
the second unnecessary. The expression on line 8 ALSO overlaps, except that
it doesn't include .com. I'm not sure why you wouldn't want that executable
extension included; I suspect any difference in extensions is an oversight
caused by the jumble of expressions you're using, which is why placing the
common extensions into a variable simplifies the expression so much. Zero
or more of anything in front of the expression also matches when there's
nothing in front of it. Thus lines 6 and 7 can be removed and "com" added
to the extensions list on line 8 (assuming you didn't exclude it from that
condition for a reason; otherwise, retain line 6).
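
To illustrate the overlap, here is a Python sketch (an approximation, since
Python's re module is not procmail's matcher) using the corrected forms of
lines 6 and 7 and line 8 with "com" added. The sample strings and the helper
name are invented for the example.

```python
import re

EXTS = "hta|vbs|exe|scr|pif|wma|chm|bat"
# procmail matches case-insensitively by default and ^ anchors at line starts.
FLAGS = re.IGNORECASE | re.MULTILINE

line6 = re.compile(r"^name=.*\.(" + EXTS + r"|com|mp3)", FLAGS)    # typo corrected
line7 = re.compile(r"^name=.*\.(" + EXTS + r"|mp3)", FLAGS)        # typo corrected
line8 = re.compile(r"^.*name=.*\.(" + EXTS + r"|com|mp3)", FLAGS)  # "com" added

samples = [
    "name=report.exe",
    'name="song.mp3"',
    "name=tool.com",
    "filename=setup.bat",
    "name=notes.txt",
]


def covered_by_line8(text):
    """True unless line 6 or 7 matches something that line 8 misses."""
    hit_6_or_7 = bool(line6.search(text) or line7.search(text))
    return (not hit_6_or_7) or bool(line8.search(text))


for text in samples:
    print(repr(text), covered_by_line8(text))
```

Every sample caught by line 6 or 7 is also caught by the widened line 8,
which additionally catches the "filename=" form.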
I'm not really sure why you include mp3 in your executable extension
list. I'll assume you have a reason. Similarly, I don't know why you omit
.com and .js from some of the conditions.
Lines 3 and 4 can be combined easily. Line 9 should have that dot escaped.
Lines examining for a quoted filename probably are _really_ expecting zero
or more WHITESPACE characters preceding the opening quote, not zero or
more of *ANYTHING*.
More consolidation could be performed, though not understanding why you
have different extension criteria prevents me from doing that effectively
without brutalizing your logic. Also, quoted strings versus nonquoted
strings _could_ be handled with a simple conditional such as (\"|), but
that doesn't ensure that for any OPENING quote there MUST be a closing
quote, since each quote would be independently optional rather than handled
as a pair. For your purposes, this might not be critical.
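
The difference between independently optional quotes and quotes handled as
a pair can be seen in a small Python sketch. The two patterns are simplified,
hypothetical stand-ins (a single extension, anchored at both ends), not
conditions from the recipe.

```python
import re

FLAGS = re.IGNORECASE | re.MULTILINE

# Each quote independently optional, as with the (\"|) idea.
loose = re.compile(r'^name=("|)[^"]*\.exe("|)$', FLAGS)

# Quotes handled as a pair: either both present or both absent.
paired = re.compile(r'^name=("[^"]*\.exe"|[^"]*\.exe)$', FLAGS)

for text in ['name="a.exe"', "name=a.exe", 'name="a.exe', 'name=a.exe"']:
    print(repr(text), bool(loose.search(text)), bool(paired.search(text)))
```

The loose form accepts all four strings, including the two with an
unbalanced quote; the paired form accepts only the first two.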
After cleanup, what I'm left with is:
EXTS="hta|vbs|exe|scr|pif|wma|chm|bat"
:0B:
* $ ^[ \t]*name.*\.(${EXTS}|com|js)|\
^.*name[ ]*\".*\.(${EXTS}|com|js)\"|\
^(Content-|filename=).*[ ]*\".*\.(${EXTS}|com|js)\"|\
^name=[ ]*\".*\.(${EXTS}|com|mp3)\"|\
^.*name=.*\.(${EXTS}|com|mp3)|\
^filename=[ ]*\"worms\.zip\"
viruscontrol
This probably doesn't do a thing for your mismatch problem, but should make
some of the matches somewhat more "correct", and the expression isn't
nearly as wasteful.
I then ran the above filter against a testbox of spam, and sure enough, it
matched several HTML spams, each of which included embedded forms of the type:
<input type="hidden" name="fromemail" value="me@me.com">
<input type="hidden" name="from" value="me@me.com">
If you missed it, that sort of thing is matched by the second condition of
your original filter:
^.*name.*\".*\.(exe|vbs|hta|scr|pif|js|bat|com|wma|chm)\"|\
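
The false positive can be reproduced with a short Python sketch. It assumes
the address in the form field originally read me@me.com (list archives
commonly obfuscate the @ and the dot), and it uses Python's re module as a
stand-in for procmail's matcher.

```python
import re

# Second condition of the original filter, transcribed for Python's re module.
second_condition = re.compile(
    r'^.*name.*".*\.(exe|vbs|hta|scr|pif|js|bat|com|wma|chm)"',
    re.IGNORECASE | re.MULTILINE,
)

# Assumed de-obfuscated form of the hidden form field found in the spam.
html_line = '<input type="hidden" name="fromemail" value="me@me.com">'

match = second_condition.search(html_line)
print("matched" if match else "no match")
if match:
    print("extension that triggered the match:", match.group(1))
```

The leading ".*" lets "name" appear anywhere on the line, and the ".com" at
the end of the email address then satisfies the extension list.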
I'd seriously rethink preceding any expression like that with '.*', and
evaluate what led you to include that. Perhaps:
^[ ]*(file|)name[ ]*(=|)[ ]*(\"|).*\.($EXTS)\>|\
might do the trick (that rolls SEVERAL of your original conditions into
one, and also requires that the _final_ extension be trailed by a wordbreak
character). The above expression is untested however.
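
For what it's worth, a Python approximation of that suggestion behaves as
hoped on a few made-up samples. This is only a sketch under assumptions:
[ ] is written as [ \t], procmail's \> is approximated by "a non-word
character or end of line", and the sample lines are invented.

```python
import re

EXTS = "hta|vbs|exe|scr|pif|wma|chm|bat"

# Approximation of the suggested condition; (?:\W|$) stands in for \>.
tightened = re.compile(
    r'^[ \t]*(file|)name[ \t]*(=|)[ \t]*("|).*\.(' + EXTS + r')(?:\W|$)',
    re.IGNORECASE | re.MULTILINE,
)

samples = [
    '\tfilename="invoice.exe"',                                  # indented MIME parameter
    "name=run.bat",                                              # unquoted, at line start
    '<input type="hidden" name="attachment" value="setup.exe">', # HTML form field
    'name="holiday.exercise.txt"',                               # ".exe" inside a longer word
    'Content-Type: application/octet-stream; name="x.exe"',      # parameter after other text
]

for text in samples:
    print(repr(text), bool(tightened.search(text)))
```

The first two samples match and the HTML field and "exercise" cases do not.
Because the expression is anchored at the start of the line, the last sample
(a name= parameter that follows other text on the same header line) does not
match either, which may or may not be what you want.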
---
Sean B. Straw / Professional Software Engineering
Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
Please DO NOT carbon me on list replies. I'll get my copy from the list.
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail