procmail

Re: unexpected procmail recipe behavior

2011-09-25 17:38:08
Robert Bonomi wrote,

The procmail manpages say that procmail regexes behave "exactly" as egrep ones.

If you're not extracting substrings with \/ and all you care about is the presence or absence of a match, the two don't work the same way internally, but they give the same results.

But if you are extracting substrings, they don't get the same results, and "exactly" isn't accurate.

I'm baffled as to why this condition:
   * ^References:.*[<]\/[^>][^>]*
sets MATCH to the _first_ Message-ID on the line, and not the _last_ one.

Standard regex definition is that repeat patterns are 'greedy' -- i.e.,
that given multiple matches at the same starting point, the -longest-
match will be chosen.

Procmail's regexp matching is stingy (it stops at the shortest match that satisfies the condition) in the absence of \/; in the presence of \/, it is stingy to the left of \/ and greedy to the right.
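
To make that concrete, here is a rough sketch built around the condition quoted above. The header shown in the comments and the variable name FIRSTREF are made up for illustration; they are not from the original post.

```
# Hypothetical header, for illustration only:
#   References: <a@example.org> <b@example.org> <c@example.org>
#
# The ".*" to the left of \/ is stingy, so it stops at the first "<"
# that still lets the rest of the condition match. The part to the
# right of \/ is greedy, so it then takes every character up to the
# next ">". MATCH should therefore come out as "a@example.org",
# the first Message-ID rather than the last.
:0
* ^References:.*[<]\/[^>][^>]*
{ FIRSTREF=$MATCH }
```

An egrep-style engine, where ".*" is greedy, would run on to the last "<" instead, which is where the two behaviors part ways once \/ is involved.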

I've had to re-write 'modern egrep'-compliant regexes many times, to get
the desired behavior out of procmail. But, how to re-write this one, where
there may be a variable number of '<...>' items on the line, has got me
stumped.

I can't think of a way to write it in a single condition, but I can in two:

 * ^References:.*(<.*>)?<\/[^>]+>$
 * MATCH ?? ^^\/[^>]+
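
Put into a complete recipe, the two conditions might be used as sketched below. The variable name LASTREF is hypothetical, and the comments describe what I would expect to happen rather than tested output.

```
# First condition: the part to the right of \/ has to reach the end
# of the line, so only the last <...> item can satisfy it. At that
# point MATCH should hold the last Message-ID together with the
# closing ">" that the tail of the pattern consumed.
#
# Second condition: re-match against MATCH itself, anchored at its
# start with ^^, and keep only the characters before that ">".
:0
* ^References:.*(<.*>)?<\/[^>]+>$
* MATCH ?? ^^\/[^>]+
{ LASTREF=$MATCH }
```

Saving the result into a variable of its own keeps it safe from later recipes that set MATCH again.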
____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)de
http://mailman.rwth-aachen.de/mailman/listinfo/procmail
