Hi Folks,
It has been quite some time since my last question here! I
have been quite content with my anti-pharmaceutical spam filter, which
has been running (and slowly evolving) for a long while, until it
caught a false positive. Here is an excerpt of the script (the
expressions have been simplified):
bs = "\\"
spc = "([-.,@#$%&*()+=|:;!^_/1$bs])?"
a = "([a(_at_)]|/$bs)"
i = "([íi|1le]|ii)"
l = "[l1]"
o = "[o0]"
v = "(v|$bs/)"
valium = "${v}${spc}${a}${spc}${l}"
pharm = "${valium}"
:0:spool/junk-filt.lock
*$ 9876543210^0 ^From:.*\/(${pharm})
| formail -i "X-junk: $MATCH" >> spool/junk-filt
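To double-check how the variables expand, the substitution can be reproduced outside procmail. The following is a minimal Python sketch, assuming that the placeholder syntax of `string.Template` is close enough to procmail's `$name` / `${name}` expansion for these particular strings. The `expand` helper is hypothetical and not part of procmail.

```python
from string import Template


def expand(text, env):
    """Substitute $name and ${name}; a lone "$" (as in "#$%") is left untouched."""
    return Template(text).safe_substitute(env)


# Same assignments as in the recipe excerpt, in the same order.
env = {"bs": "\\"}
env["spc"] = expand("([-.,@#$%&*()+=|:;!^_/1$bs])?", env)
env["a"] = expand("([a(_at_)]|/$bs)", env)
env["l"] = "[l1]"
env["v"] = expand("(v|$bs/)", env)
env["valium"] = expand("${v}${spc}${a}${spc}${l}", env)
env["pharm"] = expand("${valium}", env)

# The condition line after expansion (without the score prefix).
condition = expand("^From:.*\\/(${pharm})", env)
print(condition)
```

The printed string can be compared character by character with the regexp that procmail reports in its log.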
The data that causes the false positive is:
From: "Wilfred van den Waldenburg" <320048028299-0001(_at_)t-online(_dot_)de>
And the X-junk header gets set to:
X-junk: va
I am at a total loss as to how the above can match on only the first
two characters, and not on the entire spam-word "valium"!
Here is the log:
procmail: Assigning "bs=\"
procmail: Assigning "spc=([-.,@#$%&*()+=|:;!^_/1\])?"
procmail: Assigning "a=([a(_at_)]|/\)"
procmail: Assigning "i=([íi|1le]|ii)"
procmail: Assigning "l=[l1]"
procmail: Assigning "o=[o0]"
procmail: Assigning "v=(v|\/)"
procmail: Assigning "valium=(v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/\)([-.,@#$%&*()+=|:;!^_/1\])?[l1]"
procmail: Assigning "pharm=(v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/\)([-.,@#$%&*()+=|:;!^_/1\])?[l1]"
procmail: Invalid regexp "^From:.*\/((v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/\)([-.,@#$%&*()+=|:;!^_/1\])?[l1])"
procmail: Assigning "MATCH="
procmail: Matched "va"
procmail: Score: 2147483647 2147483647 "^From:.*\/((v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/\)([-.,@#$%&*()+=|:;!^_/1\])?[l1])"
Surprisingly, removing $bs from $a:
a = "([a(_at_)]|/)"
clears up the "Invalid regexp" message, and also clears up the false
positive:
procmail: Assigning "bs=\"
procmail: Assigning "spc=([-.,@#$%&*()+=|:;!^_/1\])?"
procmail: Assigning "a=([a(_at_)]|/)"
procmail: Assigning "i=([íi|1le]|ii)"
procmail: Assigning "l=[l1]"
procmail: Assigning "o=[o0]"
procmail: Assigning "v=(v|\/)"
procmail: Assigning "valium=(v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/)([-.,@#$%&*()+=|:;!^_/1\])?[l1]"
procmail: Assigning "pharm=(v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/)([-.,@#$%&*()+=|:;!^_/1\])?[l1]"
procmail: Score: 0 0 "^From:.*\/((v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/)([-.,@#$%&*()+=|:;!^_/1\])?[l1])"
I cannot see why the first one should fail! Is this a procmail bug?
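One way to narrow this down is to count parentheses in the two expanded regexps from the logs. The sketch below is a rough Python check, not procmail's own parser. It rests on two assumptions: outside a bracket expression a backslash escapes the next character, and inside `[...]` a backslash is an ordinary character (which the accepted `1\]` in `spc` seems to suggest). The function name `paren_depth` is made up for this example.

```python
def paren_depth(regex):
    """Return the number of groups left open at the end of the regexp.

    Assumptions (not taken from the procmail source): a backslash outside
    brackets escapes the following character; inside [...] every character
    up to the closing "]" is literal.
    """
    depth = 0
    in_class = False
    i = 0
    while i < len(regex):
        c = regex[i]
        if in_class:
            if c == "]":
                in_class = False
        elif c == "\\":
            i += 1  # skip the escaped character as well
        elif c == "[":
            in_class = True
        elif c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
        i += 1
    return depth


# Expanded regexps copied from the two logs.
with_bs = r"^From:.*\/((v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/\)([-.,@#$%&*()+=|:;!^_/1\])?[l1])"
without_bs = r"^From:.*\/((v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/)([-.,@#$%&*()+=|:;!^_/1\])?[l1])"

print(paren_depth(with_bs), paren_depth(without_bs))
```

Under these assumptions the first regexp ends with one group still open, because the `\)` at the end of `$a` reads as an escaped parenthesis rather than as the end of the group, while the second one balances. That would be consistent with the "Invalid regexp" message appearing only in the first log, though it does not by itself explain the "va" match.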
Without $bs in $a, however, the script will miss "v/\lium"!
Testing \/alium produces:
X-junk: /al
with the leading backslash missing. Since it does match "\/", there seems to
be an anchoring problem in determining where the match starts.
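A possibly related detail: the procmailrc man page describes `\/` in a condition as the token that splits the expression, with everything matched to its right assigned to MATCH. The expanded condition contains that two-character sequence twice, once written explicitly in the recipe and once coming from `$v`. The Python sketch below only locates the occurrences; the helper name `match_tokens` is hypothetical, and the same escaping assumptions apply (backslash escapes outside brackets, literal inside them).

```python
def match_tokens(regex):
    """Return the offsets of every backslash-slash pair outside bracket expressions."""
    found = []
    in_class = False
    i = 0
    while i < len(regex):
        c = regex[i]
        if in_class:
            if c == "]":
                in_class = False
        elif c == "\\":
            if regex[i + 1:i + 2] == "/":
                found.append(i)
            i += 1  # skip the character after the backslash
        elif c == "[":
            in_class = True
        i += 1
    return found


# Expanded condition from the second log (the one without "Invalid regexp").
condition = r"^From:.*\/((v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/)([-.,@#$%&*()+=|:;!^_/1\])?[l1])"

print(match_tokens(condition))
```

If procmail treats the second occurrence as another MATCH delimiter rather than as a literal backslash followed by a slash, that might account for MATCH starting at "/" in the \/alium test. This is a guess that has not been checked against the procmail source.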
This is procmail v3.22 2001/09/10.
Thanks in advance.
Best regards,
--Ralph
____________________________________________________________
procmail mailing list Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail