procmail

Re: matching subjects with words in []

1997-01-25 17:40:45
jeff(_dot_)covey(_at_)pobox(_dot_)com (jeffrey covey) writes:
i hadn't even tried it yet with procmail.  i thought that since the
procmailrc manual says "The regular expressions are *completely*
compatible to the normal egrep extended regular expressions" the
results of my tests with egrep should be identical to what procmail
would do.  is this a bad assumption?  i ran egrep over an existing
mail file with

      egrep ^Subject:.*[[MAC]] foo

and got several matches, but got an error when i tried

      egrep ^Subject:.*([[MAC]]|[[ATARI]]) foo

(it must have thought i was trying to pipe to a program named
"[[ATARI]])".)  wouldn't want this anyway, since [[MAC]] would also
match, for example, [C], right?

Yes.  To quote the egrep manpage under Solaris 2:

     Be careful using the characters $, *, [, ^, |, (, ),  and  \
     in full regular expression, because they are also meaningful
     to the shell.  It is  safest  to  enclose  the  entire  full
     regular expression in single quotes '...'.

Next, the regexp
        [[MAC]]
is parsed as a character group matching any one of "[", "M", "A", or "C",
followed by a literal "]", and will thus match any of
        []
        M]
        A]
        C]
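
A quick way to see this for yourself is to run a handful of candidate strings through the pattern. This is only a sketch: the sample strings and the function name matches_bracket_class are made up for illustration, and it uses "grep -E", the POSIX spelling of egrep.

```shell
# Hypothetical helper: print each sample string that the pattern matches.
# The pattern is single-quoted so the shell passes it to grep untouched.
matches_bracket_class() {
    printf '%s\n' '[]' 'M]' 'A]' 'C]' '[MAC]' 'MAC' '[C]' |
        grep -E '[[MAC]]'
}

matches_bracket_class
```

Every sample except the plain "MAC" should come back, including "[C]", since all the pattern needs is one of [, M, A, C immediately followed by a ].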
Not what you wanted.  To have a literal open bracket you can either enclose
it in brackets by itself or precede it with a backslash, ala:
        [[]MAC]
or      \[MAC]

I personally find the latter clearer than the former, but it's a
personal choice.  Note that you can escape the close bracket for symmetry
if you want to, but there's no need.
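
As a rough check that the two spellings behave the same, something like the following could be used. The sample Subject lines and the function name sample_subjects are invented for the example, and "grep -E" stands in for egrep.

```shell
# Hypothetical sample headers: only the first contains a literal "[MAC]".
sample_subjects() {
    printf '%s\n' \
        'Subject: new drivers [MAC]' \
        'Subject: see appendix C]' \
        'Subject: MAC news'
}

# Open bracket enclosed in a bracket expression by itself.
sample_subjects | grep -E '^Subject:.*[[]MAC]'

# Open bracket preceded by a backslash.
sample_subjects | grep -E '^Subject:.*\[MAC]'
```

Both commands should print only the "[MAC]" line; the "C]" line that the original [[MAC]] pattern would have caught is no longer matched.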


so i thought i was doing the right thing with:

      egrep ^Subject:.*\[MAC\] foo

The backslashes would be 'eaten' by the shell, so egrep would never see
them.  It would get ^Subject:.*[MAC] instead, where [MAC] matches just a
single M, A, or C.
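
To see what the shell actually hands to a command, you can print the argument instead of passing it to egrep. A minimal sketch, assuming a POSIX-style shell (sh, bash, dash); the variable names are made up, and "set -f" is there only so the unquoted * is not expanded against filenames in the current directory.

```shell
set -f   # turn off filename expansion for the duration of the demo

# Unquoted: the shell strips the backslashes before the command runs.
unquoted=$(printf '%s\n' ^Subject:.*\[MAC\])

# Single-quoted: the argument is passed through exactly as typed.
quoted=$(printf '%s\n' '^Subject:.*\[MAC\]')

set +f   # restore filename expansion

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

The first line shows the pattern without its backslashes, the second with them intact, which is the difference between matching one of M, A, C and matching the literal text [MAC].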


and

      egrep ^Subject:.*(\[MAC\]|\[ATARI\]) foo

hmmmm...  these both gave error messages when i tried them on my linux
box earlier, but now they worked ok when i telneted into irix at school
(after i changed the () in the second one to "").

Well yeah, put quotes in and you've taken care of the problem with the
shell.


so i guess i have two questions at this point:

1.  what would be the correct syntax for procmail to match on more
   than one of these?  would (\[foo\]|\[bar\]) be what i'm looking
   for?  will give it a try...

That would work, as would:

        \[(foo|bar)\]
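
If you want to convince yourself the two forms are equivalent before putting either in a recipe, a sketch along these lines should do. The sample Subject lines and the function name subjects are invented, "grep -E" stands in for egrep, and the patterns are single-quoted so the shell leaves the |, ( and ) alone.

```shell
# Hypothetical sample headers: the first two carry a bracketed tag we want.
subjects() {
    printf '%s\n' \
        'Subject: test [foo]' \
        'Subject: test [bar]' \
        'Subject: test foo' \
        'Subject: test [baz]'
}

# Alternation of two complete bracketed words.
subjects | grep -E '^Subject:.*(\[foo\]|\[bar\])'

# Brackets factored out, alternation only on the words inside.
subjects | grep -E '^Subject:.*\[(foo|bar)\]'
```

Each command should print just the [foo] and [bar] lines.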


2.  more generally, how far can one trust the correspondence between
   procmail and egrep?  to what extent can i use egrep as a test for
   what procmail will do before i trust it with my mail?

I'd suggest just using procmail itself for your testing, then it
doesn't matter.  Put a recipe you want to test in a separate file then
use "procmail -m recipe-file" to test it:

lunen% cat test
VERBOSE = on
:0
* ^Subject:.*\[(foo|bar)\]
/dev/null
lunen% procmail -m test
Subject: skjhskjhd [foo]

dkjghg
procmail: [8027] Sat Jan 25 18:28:08 1997
procmail: Match on "^Subject:.*\[(foo|bar)\]"
procmail: Assigning "LASTFOLDER=/dev/null"
procmail: Opening "/dev/null"
  Folder: /dev/null                                                          33
lunen% procmail -m test
Subject: kjhs foo

procmail: [8075] Sat Jan 25 18:28:51 1997
procmail: No match on "^Subject:.*\[(foo|bar)\]"
  Folder: **Bounced**                                                         0
lunen%


If you put
        :0
        /dev/null
at the bottom of the test file, then you can just feed it mail and watch
the verbose log output to see what is matching and what isn't.

Philip Guenther