procmail

Re: \/ is not the same as ()\/

1997-11-17 17:20:34
James Waldby <j-waldby(_at_)uiuc(_dot_)edu> writes:
...
Now, one more question.  With the following in .procmailrc:
:0B
* ^^\/.*$?.*$?.*$?
{ LOG=" a |$MATCH| a " }

:0B
* ()\/.*$?.*$?.*$?
{ LOG=" b |$MATCH| b " }

:0B
* \\/.*$?.*$?.*$?
{ LOG=" c |$MATCH| c " }

after the command "formail < mm  -s procmail" with one message in mm
with a message body that begins:
L1
L2
L3
L4
the log file said
a | L1
L2
L3| a  b | L1
L2| b  c | L1
L2| c 
which means, after \/.*$?.*$?.*$?  $MATCH was first three lines of body,
but after ()\/.*$?.*$?.*$? or \\/.*$?.*$?.*$? patterns, only first two
lines.
Why?

Okay, here goes: procmail starts the body with an implicit newline so
that ^ and ^^ can match there (and a leading $ too, if you wanted to be
confusing).  With the first recipe, that newline is 'eaten' by the ^^
(a single caret would work just as well).  With the others, however,
procmail matches zero characters with the first ".*" so that the first
dollar sign can match that initial newline.  That uses up the first
".*$?" before L1 is even reached, leaving only two more for L1 and L2.
Procmail always takes the match that starts the *earliest*.

To prove that this is the case, try the following recipe:

        :0B
        * ()\/.+$?.*$?.*$?
        { LOG=" b |$MATCH| b " }

With the '+' in there instead of a '*', the first piece must match at
least one character, so the match has to skip past the leading newline
and start at the first real line.  Of course, the above will skip any
blank lines at the top of the body, not just the implicit leading one,
so it might not be exactly what you want to do.

To sum it up: explicit anchoring is a Good Thing.
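
As a sketch of what explicit anchoring looks like here, going only by
the explanation above: the first recipe below is the ^^ form from the
original question, which the log shows picking up three lines.  The
second is the single-caret variant mentioned earlier; it is untested
here, and the "d" log label is made up purely for illustration.

        # Anchored at the very start of the body: ^^ eats the implicit
        # newline, so all three ".*$?" pieces are left for L1 to L3.
        :0B
        * ^^\/.*$?.*$?.*$?
        { LOG=" a |$MATCH| a " }

        # Assumed equivalent: the earliest place ^ can match is that
        # same implicit newline at the start of the body.
        :0B
        * ^\/.*$?.*$?.*$?
        { LOG=" d |$MATCH| d " }

In both, everything left of the \/ is an anchor, so the match cannot
start by spending a ".*$?" on the implicit newline.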


Philip Guenther