procmail

procmail's regexp quirks

1997-10-25 19:40:35
Paul Dunne asked,

| Procmail's regexps are supposed to be compatible with egrep's, are they not?

They were originally, but they have diverged.  Most of the discrepancies are
covered in the procmailrc(5) man page.

In particular,

1. ^ and $ are not zero-width: each matches a real or putative newline (rather
   than the zero-width start or end of a line);
2. An initial ^^ or a final ^^ anchors to the opening or closing putative
   newline respectively;
3. ^ and $ in the middle of a procmailrc regexp match an embedded newline
   (and must be escaped to match a literal caret or dollar sign);
4. \< and \> are not zero-width: each matches a character that could not be
   part of a word (or a real or putative newline), rather than the zero-width
   transition into or out of a word;
5. *, ?, and + are stingy rather than greedy when \/ is absent, which
   generally won't matter.  When \/ is present, they are stingy to the left
   of \/ and greedy to the right of it.  In most other applications the
   leftmost wildcard on a line is the greediest, and greed decreases from
   left to right.
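
To make points 2, 3, and 5 concrete, here is a minimal procmailrc sketch.
The folder names (priced, pgp) and the SUBJECT variable are hypothetical,
chosen only for illustration, and the recipes are untested examples of the
behavior described above rather than anything taken from the man page.

```
# Illustrative recipes only; folder and variable names are made up.
# Each [ 	] below is meant to hold one space and one tab.

# Point 5: to the left of \/ the * is stingy, so "[ 	]*" on its own could
# consume no blanks and leave them at the front of $MATCH.  Requiring a
# non-blank character right after \/ forces the left side to take the
# leading whitespace; the greedy ".*" on the right takes the rest of the line.
:0
* ^Subject:[ 	]*\/[^ 	].*
{
  SUBJECT=$MATCH
}

# Point 3: an unescaped $ in mid-pattern would match a newline, so a
# literal dollar sign has to be escaped.
:0
* ^Subject:.*\$[0-9]+
priced

# Point 2: a leading ^^ anchors to the very start of the searched text, so
# with the B flag this asks whether the body itself begins with the marker,
# not whether some line in the body does.
:0 B
* ^^-----BEGIN PGP MESSAGE
pgp
```

If the first recipe were written as `^Subject:[ 	]*\/.*` instead, the stingy
left side would be free to match no blanks at all, and SUBJECT would be
expected to start with whatever whitespace followed the colon.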
