procmail

Re: what regexps work?

1997-04-09 17:50:00
process(_at_)qz(_dot_)little-neck(_dot_)ny(_dot_)us (Eli the Bearded) writes:
procmailrc(5) states that procmail has a fully compatible regexp
engine with egrep. What egrep? There are many versions. Here are
some regexp problems I have that I want to work in procmail.

The _original_ egrep: extended, but not bloated, regexps.  In newer
procmail versions the procmailrc(5) manpage describes what procmail's
regexps support.

Before I answer the rest of your questions, have you considered just
*TRYING* them?  Then you know the answers you get are correct.
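
A minimal sketch of a sandbox for trying them, assuming a scratch
directory that already exists and a saved test message (the names
test.rc, test.msg, /tmp/pmtest and the folder names are made up for
illustration):

      # test.rc -- run as:  procmail ./test.rc < test.msg
      MAILDIR=/tmp/pmtest            # hypothetical scratch directory
      DEFAULT=$MAILDIR/fallthrough   # where unmatched mail ends up
      LOGFILE=$MAILDIR/log
      VERBOSE=on

      :0
      * ^From:.*@([^> ]*\.)?usa.net([^-a-z0-9.]|$)
      matched

With VERBOSE=on the log should record a "Match on" or "No match on"
line for each condition, so you can see exactly which regexp procmail
accepted.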


Is that first character class valid?

      * !^Subject:(.Re:|(.*[[({ -]([wW]as|Re):))

Yes.  Since procmail regexps are case-insensitive by default, you don't
need the second class.
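
With the second class dropped, the condition would look something like
this (a sketch, assuming you haven't turned on case-sensitive matching
with the D flag):

      * !^Subject:(.Re:|(.*[[({ -](was|Re):))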


Can I use ^ or $ anchors inside parens?

      * ^From:.*@([^> ]*\.)?usa.net([^-a-z0-9.]|$)

Yes.


Can I code this perl regexp in procmail?

      /^From:.*[ <]([^\@]+\@(\1|[^.]{13,})\.(com|net)([ >]|$)/i

(I.e., is {} understood, and can I do backreferences?)

Procmail doesn't support backreferences.  Besides, that's not a valid
perl regexp, as you left out a closing paren somewhere.  The check
implied by that regexp _can_ be faked really closely in procmail with
careful use of MATCH.

Oh, and braces are just syntactic sugar anyway.
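
As a sketch of what that means for the {13,} part alone (ignoring the
backreference, and assuming your procmail doesn't take braces
itself): write thirteen copies of the class and follow them with a
starred one.  The folder name is made up.

      # perl:      [^.]{13,}
      # by hand:   [^.] thirteen times, then [^.]*
      :0
      * ^From:.*@[^.][^.][^.][^.][^.][^.][^.][^.][^.][^.][^.][^.][^.][^.]*\.(com|net)([ >]|$)
      long-domain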

...
Why can't procmail have a real (zero-width) word boundary operator?

You'll have to ask Stephen that question.  Before you do so, consider
the return question: what can you do with them that you can't with the
non-zero width boundary tokens that procmail implements?  Think _very_
carefully here...
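
For reference, a sketch of the tokens in question, going by how
later procmailrc(5) manpages describe \< and \> (each one matches an
actual non-word character or a newline, rather than an empty
position); the folder name is made up:

      # \< eats the character before "free", \> the one after it
      :0
      * ^Subject:.*\<free\>
      maybe-spam

One consequence of that description is that \>\< can't share a single
space between two adjacent words, since each token wants a character
of its own.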


Does procmail know about [:alpha:] and company? How about [=n=]
and company? Does procmail use locale for [a-z] expansions? (Perl
does not.)

No, no, and no.  What's the correct locale to use for an email message
anyway?  The headers should *never* contain 8bit characters, and the
body can be in any locale.  Or do you believe in the One True Locale?
<chuckle>
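
If you want the effect of the common classes anyway, a sketch of
spelling them out by hand, assuming plain ASCII and the default
case-insensitive matching:

      # [[:alpha:]]  ->  [a-z]
      # [[:digit:]]  ->  [0-9]
      # [[:alnum:]]  ->  [a-z0-9]
      * ^Subject:.*[a-z0-9]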

BTW: if you find a program that uses locales for [a-z], then it's
broken.  Locales should only be used for the [:foo:], [.ch.] and [=n=]
blobs.


Philip Guenther
