process(_at_)qz(_dot_)little-neck(_dot_)ny(_dot_)us (Eli the Bearded) writes:
procmailrc(5) states that procmail has a regexp engine fully
compatible with egrep. What egrep? There are many versions. Here are
some regexp problems I have that I want to work in procmail.
The _original_ egrep: extended, but not bloated, regexps. In newer
procmail versions the procmailrc(5) manpage describes what procmail's
regexps support.
Before I answer the rest of your questions, have you considered just
*TRYING* them? Then you know the answers you get are correct.
Is that first character class valid?
* !^Subject:(.Re:|(.*[[({ -]([wW]as|Re):))
Yes. Since procmail regexps are case-insensitive by default, you don't
need the second class.
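For instance, with the redundant class dropped, the condition might sit
in a recipe like this. It's only a sketch: the mailbox name non-replies
is made up for illustration.

```
# Hypothetical recipe: file anything whose Subject is NOT a reply
# and does not carry a "(was: ...)" or "Re:" tag later in the line.
# [wW]as has become plain "was" because matching ignores case by default.
:0:
* !^Subject:(.Re:|(.*[[({ -](was|Re):))
non-replies
```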
Can I use ^ or $ anchors inside parens?
* ^From:.*@([^> ]*\.)?usa.net([^-a-z0-9.]|$)
Yes.
Can I code this perl regexp in procmail?
/^From:.*[ <]([^\@]+\@(\1|[^.]{13,})\.(com|net)([ >]|$)/i
(I.e., is {} understood and can I do backreferences?)
Procmail doesn't support backreferences. Besides, that's not a valid
perl regexp, as you left out a closing paren somewhere. The check
implied by that regexp _can_ be faked really closely in procmail with
careful use of MATCH.
Oh, and braces are just syntactic sugar anyway.
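An untested sketch of what that could look like. It assumes the missing
paren belongs right after the first [^\@]+, and that your procmail is
recent enough to have the \/ extraction operator and $\VAR quoting. The
variable LOCAL and the mailbox suspect are made-up names.

```
# Step 1: find "local@" on the From: line. Whatever matches to the
# right of \/ lands in MATCH, so MATCH ends up as e.g. "joe@".
# Step 2: re-match MATCH itself to strip the trailing "@".
:0
* ^From:.*[ <]\/[^@ <]+@
* MATCH ?? ^\/[^@]+
{
  LOCAL=$MATCH

  # Stand-in for the \1 backreference: the domain label equals the
  # local part. The leading "$" turns on variable expansion for this
  # condition; $\LOCAL expands LOCAL with regexp metacharacters escaped.
  :0:
  * $ ^From:.*[ <]$\LOCAL@$\LOCAL\.(com|net)([ >]|$)
  suspect

  # Stand-in for [^.]{13,}: thirteen copies of [^.] followed by [^.]*
  :0:
  * ^From:.*[ <][^@]+@[^.][^.][^.][^.][^.][^.][^.][^.][^.][^.][^.][^.][^.][^.]*\.(com|net)([ >]|$)
  suspect
}
```

This isn't an exact equivalent of the perl version: the first condition
picks one address-like token, and the inner recipes only look at that.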
...
Why can't procmail have a real (zero-width) word boundary operator?
You'll have to ask Stephen that question. Before you do so, consider
the return question: what can you do with them that you can't with the
non-zero width boundary tokens that procmail implements? Think _very_
carefully here...
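For reference, the tokens in question are \< and \>, which procmailrc(5)
describes as matching the character before and after a word. A minimal
sketch of their use, with a made-up mailbox name:

```
# Hypothetical recipe: file mail whose Subject contains the word "free".
# \< consumes the character before the word, \> the one after it,
# so "freedom" and "carefree" are not caught.
:0:
* ^Subject:.*\<free\>
maybe-spam
```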
Does procmail know about [:alpha:] and company? How about [=n=]
and company? Does procmail use locale for [a-z] expansions? (Perl
does not.)
No, no, and no. What's the correct locale to use for an email message
anyway? The headers should *never* contain 8bit characters, and the
body can be in any locale. Or do you believe in the One True Locale?
<chuckle>
BTW: if you find a program that uses locales for [a-z], then it's
broken. Locales should only be used for the [:foo:], [.ch.] and [=n=]
blobs.
Philip Guenther