unexpected procmail recipe behavior


The procmail manpages say that procmail regexes behave "exactly" as egrep ones.

I'm baffled as to why this condition:
  * ^References:.*[<]\/[^>][^>]*
sets MATCH to the _first_ Message-ID on the line, and not the _last_ one.

Standard regex definition is that repeat patterns are 'greedy' -- i.e.,
that given multiple matches at the same starting point, the -longest-
match will be chosen.

Thus, to my understanding, the sequence '.*[<]' should match everything up to
the -last- "<" on the line.

And, in fact, using a sed(1) substitution "s/^.*[<]([^>][^>]*)[>].*$/\1/"
I do get the desired last item.
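
To make the expectation concrete, here is a minimal sketch in Python. Python's
re module is not procmail's matcher, and the header line and addresses are
made up for illustration. It only shows what a greedy '.*' should capture,
next to a non-greedy variant that happens to reproduce what I am seeing
in MATCH.

```python
import re

# Hypothetical header with three Message-IDs (invented for this example).
line = "References: <first@example.com> <second@example.com> <last@example.com>"

# Greedy ".*" runs as far as it can, so "[<]" lands on the last "<"
# and the group captures the final Message-ID.
greedy = re.match(r"^References:.*[<]([^>][^>]*)", line)
last_id = greedy.group(1)

# Non-greedy ".*?" stops at the first "<", so the group captures the
# first Message-ID, which is the result procmail is giving me.
lazy = re.match(r"^References:.*?[<]([^>][^>]*)", line)
first_id = lazy.group(1)

print(last_id)
print(first_id)
```

The greedy form is the behavior I expected from the procmail condition; the
non-greedy form is only a guess at a pattern that yields the same observed
result, not a claim about how procmail works internally.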


Can anyone "explain this thing to me" or (with apologies to The Weavers)
should this old fool just put his chamberpot on his head, saddle up his
milk cow, and ride away?  *GRIN*

I've had to re-write 'modern egrep'-compliant regexes many times, to get
the desired behavior out of procmail. But, how to re-write this one, where
there may be a variable number of '<...>' items on the line, has got me
stumped.


Does anybody have a _detailed_ description of exactly how procmail's
'internal pattern-match' works?  Saying it 'works like egrep(1)' is
woefully incomplete/inaccurate, given the changes to egrep's behavior
that have occurred over the years.

You know, 'minor', 'insignificant' things like which 'repeat' characters
it recognizes, whether it uses "("/")" or "\("/"\)" for grouping, whether
it recognizes 'named' character classes -- e.g. "[:upper:]", how it
resolves 'ambiguous' matches, etc.    <wry grin>


____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)de
http://mailman.rwth-aachen.de/mailman/listinfo/procmail