procmail

Re: A tool for refining regex

2002-02-05 00:34:04
PSE-L(_at_)mail(_dot_)professional(_dot_)org (Professional Software 
Engineering) writes:

Is there some syntactical error or should I be seeing the actual
header line that matched, printed into my log?

        * 1^1 ^\/To:.*@pop\.newsguy\.com
        * 1^1 ^\/Received:.*\/(\.tw |\.kr|[^0-9.]202\.|[^0-9.]211\.|\
         [^0-9.]6[1-6]\.|bogota\.supernet\.com\.co)

WTF with the TWO match expressions on the one line?  Lose the one
before the parenthesis after Received.  See how there's only ONE "\/"
on the To line?  That starts the MATCH construct, which you see below:

I must have overzealously followed what some knothead posted
here in:

     Message-id: 
<5(_dot_)1(_dot_)0(_dot_)14(_dot_)2(_dot_)20020204193229(_dot_)0710e490(_at_)mail(_dot_)professional(_dot_)org>

He he ... I don't get many laughs ... thanks.

a second MATCH construct on the line, which limits the match to the
immediate text instead of the line from the beginning (re-read my post
if you want the full line, though I think just fixing this immediate
error on your part should give you something infinitely more useable).

Now we're talking:

        * 1^1 ^\/To:.*@pop\.newsguy\.com.*
        * 1^1 ^\/Received:.*(\.tw |\.kr|[^0-9.]202\.|[^0-9.]211\.|\
          [^0-9.]6[1-6]\.|bogota\.supernet\.com\.co).*

Gets (wrapped for mail):

procmail: Matched "Received: from embryo.home.madduck.net
(p3E9D112C.dip.t-dialin.net [62.157.17.44]) by diamond.madduck.net
(postfix) with ESMTP id 070571101C for 
<debian-user(_at_)lists(_dot_)debian(_dot_)org>;
Sat, 2 Feb 2002 12:01:00 -0500 (EST)"
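
For anyone who wants the captured text in the log even without verbose
logging, here is a minimal sketch that writes $MATCH out explicitly. The
LOGFILE path, the wording of the log message and the shortened pattern are
placeholders for illustration, not taken from the recipes in this thread:

```procmail
# Sketch only: adjust the path and the pattern to your own setup.
LOGFILE=$HOME/procmail.log

:0
* ^\/Received:.*(\.tw |\.kr)
{
  # MATCH holds whatever matched to the right of the single \/ token,
  # here the header text starting at "Received:".
  LOG="suspect Received line: $MATCH
"
}
```

The closing quote sits on its own line so that the log entry ends with a
newline. The block only logs; delivery is left to whatever recipes follow it.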

========================================

I just didn't know how to use the match operator to do this.  I see
now that my original theory in OP is actually one way to do this:

Harry said in Message-id: 
<m1zo2vtr1q(_dot_)fsf(_at_)reader(_dot_)newsguy(_dot_)com>:
I've used the MATCH operator a little and suspect it could be brought to
bear here to grab the line somehow.  Maybe having the regex twice:
once in a match and a second time as a screen like the one above.
But I probably don't want to grab a line hit by the NOT (!) operator,
since that wouldn't help clear it up much.

Turns out having the regex twice as described does what I wanted after
all:

Forgetting the tricky stuff for a moment:

This:
       [...] string of NOT operators snipped
   
        * ^\/Received:.*(\.tw |\.kr|[^0-9.]202\.|[^0-9.]211\.|\
          [^0-9.]6[1-6]\.|bogota\.supernet\.com\.co).*
        * ^Received:.*(\.tw |\.kr|[^0-9.]202\.|[^0-9.]211\.|\
          [^0-9.]6[1-6]\.|bogota\.supernet\.com\.co).*
       [...]

Gets this (different message - same author) :

   procmail: Match on ! "^To:.*reader@newsguy.com"
   procmail: Match on ! "^Received:.*smtp10"
   procmail: Assigning "MATCH="
   
   procmail: Matched "Received: from diamond.madduck.net (66.92.234.132)
   by murphy.debian.org with SMTP; 2 Feb 2002 17:01:31 -0000"
   
   procmail: Match on "^\/Received:.*(\.tw
   |\.kr|[^0-9.]202\.|[^0-9.]211\.|[^0-9.]6[1-6]\.|bogota\.supernet\.com\.co).*"
   
   procmail: Match on "^Received:.*(\.tw
   |\.kr|[^0-9.]202\.|[^0-9.]211\.|[^0-9.]6[1-6]\.|bogota\.supernet\.com\.co).*"
   
   procmail: Assigning "LASTFOLDER=spam_suspect2.in"


This may be all I was really after...
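
A condition that contains \/ is still tested like any other condition, so
the copy without \/ is probably redundant. The weighted recipe earlier in
this message logged its "Matched" line from a single condition. A sketch of
the collapsed form, assuming the snipped NOT conditions stay above it
unchanged and that delivery goes to the spam_suspect2.in folder seen in
the log:

```procmail
# Sketch: one condition both screens the message and sets MATCH.
:0:
# [...] string of NOT conditions would go here, as in the original recipe
* ^\/Received:.*(\.tw |\.kr|[^0-9.]202\.|[^0-9.]211\.|\
  [^0-9.]6[1-6]\.|bogota\.supernet\.com\.co).*
spam_suspect2.in
```

The trailing colon in ":0:" asks procmail for a local lockfile while it
writes to the folder.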
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
