Dan Kanagy's been bitten by the leading backslash annoyance:
| I'm trying to match text in the body of e-mail that begins with "$B"
| and ends in either "(B" or "(J" on a single line. The "$B" may or may
| not start the line and the "(B" or "(J" may or may not end the line.
|
| I believe I need to escape the "$" and the "(", so I've tried
|
| :0 BD
| * \$B.*(\(B|\(J)
|
| but I don't get a match with this. What might I be doing wrong?
The problem is a counterintuitive quirk in how procmail handles an escaped
first character in a regexp. Procmail takes a leading backslash to mean
"end of whitespace" and strips it. Thus it looks for this expression:
$B.*(\(B|\(J)
and takes the opening "$" to mean "newline" rather than "dollar sign".
In fact,
\\$B.*(\(B|\(J)
would work, though to our eyes we'd expect "\\" to match a literal backslash
in the text. Procmail strips the first backslash, and the second one is then
left to escape the "$". As I said, the situation with opening backslashes is
highly counterintuitive.
The general solution is to protect the beginning of the regexp with an empty
group, "()", so that the backslash is no longer the first character. You
might as well also pull the literal left parenthesis out of the alternation,
because it's part of both "(B" and "(J":
* ()\$B.*\((B|J)
which simplifies further to this:
* ()\$B.*\([BJ]
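Put together, the whole recipe might look like the sketch below. The mailbox
name "matched" is only a placeholder, since the question doesn't say what
should happen to the mail once the condition matches.

```procmail
# B = search the body, D = distinguish upper and lower case.
# The leading "()" keeps the backslash from being the first character.
# "matched" is a hypothetical mailbox name; substitute your own action.
:0 BD
* ()\$B.*\([BJ]
matched
```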
Dollar signs are a special problem, because "$" interpretation (the variable
expansion you request by starting a condition with "$") can also affect
them, making them represent newlines when you thought they'd be literal.
Fortunately, there are ways to tame them:
[$] always matches a literal dollar sign in the search area.
($) always matches a newline in the search area.
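As an untested sketch, Dan's condition could also be written with the
bracketed form. Because the condition then no longer starts with a backslash,
the "()" guard shouldn't be needed:

```procmail
# [$] is a literal dollar sign, so no leading backslash is involved.
:0 BD
* [$]B.*\([BJ]
```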