match problem or procmail bug: anti pharmaceutical SPAM filter

2005-05-12 05:49:53
Hi Folks,

        It has been a while since my last question here!  I have been
quite content with my anti-pharmaceutical spam filter, which has been
running (and slowly evolving) for a long time, until it caught a false
positive.  Here is an excerpt of the script (the expressions have been
simplified):

bs      = "\\"
spc = "([-.,@#$%&*()+=|:;!^_/1$bs])?"
a       = "([a(_at_)]|/$bs)"
i       = "([íi|1le]|ii)"
l       = "[l1]"
o       = "[o0]"
v       = "(v|$bs/)"

valium  = "${v}${spc}${a}${spc}${l}"

pharm   = "${valium}"

:0:spool/junk-filt.lock
*$ 9876543210^0 ^From:.*\/(${pharm})    
| formail -i "X-junk: $MATCH" >> spool/junk-filt

The data that causes the false positive is:

From: "Wilfred van den Waldenburg" <320048028299-0001(_at_)t-online(_dot_)de>

And the X-junk header gets set to:

X-junk: va

I am at a total loss as to how the above can match on only the first
two characters, and not on the entire spam-word "valium"!

Here is the log:

procmail: Assigning "bs=\"
procmail: Assigning "spc=([-.,@#$%&*()+=|:;!^_/1\])?"
procmail: Assigning "a=([a(_at_)]|/\)"
procmail: Assigning "i=([íi|1le]|ii)"
procmail: Assigning "l=[l1]"
procmail: Assigning "o=[o0]"
procmail: Assigning "v=(v|\/)"
procmail: Assigning 
"valium=(v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/\)([-.,@#$%&*()+=|:;!^_/1\])?[l1]"
procmail: Assigning 
"pharm=(v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/\)([-.,@#$%&*()+=|:;!^_/1\])?[l1]"
procmail: Invalid regexp 
"^From:.*\/((v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/\)([-.,@#$%&*()+=|:;!^_/1\])?[l1])"
procmail: Assigning "MATCH="
procmail: Matched "va"
procmail: Score: 2147483647 2147483647 
"^From:.*\/((v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/\)([-.,@#$%&*()+=|:;!^_/1\])?[l1])"
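One way to probe the "Invalid regexp" line is to rebuild the expanded condition and count parentheses. The sketch below is Python, not procmail, and the helper name unbalanced_parens is hypothetical. It assumes POSIX-like rules: outside a bracket expression a backslash escapes the next character, and inside one it is an ordinary character. procmail's own parser may differ in detail.

```python
def unbalanced_parens(regex):
    """Return (unclosed_groups, stray_closers).

    Assumption (not taken from procmail's source): a backslash escapes the
    next character outside [...], and is literal inside [...].
    Simplification: a ']' placed first in a bracket expression is not
    treated specially, which does not matter for the strings below.
    """
    depth = 0
    stray = 0
    in_bracket = False
    i = 0
    while i < len(regex):
        c = regex[i]
        if in_bracket:
            if c == "]":
                in_bracket = False
        elif c == "\\":
            i += 1                    # skip the escaped character
        elif c == "[":
            in_bracket = True
        elif c == "(":
            depth += 1
        elif c == ")":
            if depth:
                depth -= 1
            else:
                stray += 1
        i += 1
    return depth, stray


bs = "\\"                             # same role as $bs in the rcfile
spc = "([-.,@#$%&*()+=|:;!^_/1" + bs + "])?"
l = "[l1]"
v = "(v|" + bs + "/)"
a_with_bs = "([a(_at_)]|/" + bs + ")"    # $a as posted
a_without_bs = "([a(_at_)]|/)"           # $a with $bs removed


def condition(a):
    # Mirrors the expanded condition shown in the log.
    return "^From:.*" + bs + "/(" + v + spc + a + spc + l + ")"


print(unbalanced_parens(condition(a_with_bs)))     # (1, 0)
print(unbalanced_parens(condition(a_without_bs)))  # (0, 0)
```

Under those assumptions the "/\)" at the end of $a reads as a slash followed by an escaped literal ")", so the group opened by $a is never closed. That would be consistent with the "Invalid regexp" message; whether writing the backslash twice ("/$bs$bs") is the right cure in procmail 3.22 is untested here.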

Surprisingly, removing $bs from $a:

a       = "([a(_at_)]|/)"

clears up the "Invalid regexp" message and also clears up the false
positive:

procmail: Assigning "bs=\"
procmail: Assigning "spc=([-.,@#$%&*()+=|:;!^_/1\])?"
procmail: Assigning "a=([a(_at_)]|/)"
procmail: Assigning "i=([íi|1le]|ii)"
procmail: Assigning "l=[l1]"
procmail: Assigning "o=[o0]"
procmail: Assigning "v=(v|\/)"
procmail: Assigning 
"valium=(v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/)([-.,@#$%&*()+=|:;!^_/1\])?[l1]"
procmail: Assigning 
"pharm=(v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/)([-.,@#$%&*()+=|:;!^_/1\])?[l1]"
procmail: Score:       0       0 
"^From:.*\/((v|\/)([-.,@#$%&*()+=|:;!^_/1\])?([a(_at_)]|/)([-.,@#$%&*()+=|:;!^_/1\])?[l1])"

I cannot see why the first one should fail!  Is this a procmail bug?
With $bs removed from $a, the script will no longer catch "v/\lium"!

Testing with "\/alium" produces:

X-junk: /al

which is missing the leading backslash.  Since it does match "\/",
there seems to be an anchoring problem in identifying the match.
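For comparison, here is how the same fragment behaves in a different regex engine. This is a sketch using Python's re module as a stand-in, so it says nothing definite about procmail, which has its own engine and also uses "\/" as its MATCH token. The doubled-backslash variant is a hypothetical alternative, not something from the rcfile.

```python
import re

text = r"\/alium"            # backslash, slash, then "alium"

v_single = r"(v|\/)"         # as $v expands in the log: an escaped slash
v_double = r"(v|\\/)"        # hypothetical variant: escaped backslash, then slash

found_single = re.search(v_single + "a[l1]", text).group(0)
found_double = re.search(v_double + "a[l1]", text).group(0)

print(found_single)          # /al
print(found_double)          # \/al
```

In this engine "\/" is simply an escaped "/", so the match starts at the slash and the backslash in the text is left outside it. The "/al" result above has the same shape, which hints at an escaping issue rather than an anchoring one, but that remains an assumption until it is checked against procmail itself.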

This is procmail v3.22 2001/09/10.

Thanks in advance.

Best regards,

        --Ralph

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail

