Re: searching for a word in message body

1996-06-11 12:20:05
Dilip Vazirani asked,

| I am trying to look up for a word in message body :
| Example : look for word mac in message body.
|           This should not pick up the words machine or bob-mac.
| I tried :
| :0 B
| *\(mac\)
| xyz
| What am  I missing

You're missing sharp corners on those parentheses.  The word-boundary
expressions [actually, they mean a word boundary in egrep; in procmail they
match a newline or any character that normally wouldn't be in a word, but
there must be a non-word character for them to match, not just a boundary]
are made up of backslashes and angle brackets:

 \< and \>

\( and \) match literal parentheses in the text, so your current recipe won't
save to $MAILDIR/xyz unless the body contains "(mac)", parentheses and all.
By the way, the recipe also needs a local lockfile unless $MAILDIR/xyz is
itself a directory.

Unfortunately, there is yet another complication: "\<" at the start of the
search expression will be taken to mean "this is a literal left-side angle
bracket, not a size comparison operator."  Thus you need *two* backslashes.
[The same problem arises when you want to start a search expression with the
\/ extraction operator.]  So it's like this:

 * \\<mac\>

Here's one way to avoid the extra backslash:

 * .*\<mac\>
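
Assembled into a complete recipe, that might look like the sketch below.  It
assumes the same xyz mailbox as in the question; the trailing colon on the
first line requests the local lockfile mentioned earlier.

```procmail
# Sketch only: save to xyz when the body (B flag) matches \<mac\>.
# The colon after B asks procmail for a local lockfile on xyz.
:0 B:
* .*\<mac\>
xyz
```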

I believe that \< and \> will match a hyphen, though, so while we have
ruled out "machine" we still have "bob-mac" to worry about.  All righty:

 * 1^1 .*\<mac\>
 * -1^1 -mac\>|-mac-|\<mac-

That means to score one point for any appearance of "mac" as a word, but
take a point back for every occurrence of "-mac-" or a word ending in "-mac"
or a word beginning with "mac-".  (Note that, because \< and \> need a
character or a newline to match against and because only non-overlapping
occurrences register, "-mac mac mac-" will net a score of 2-2=0 and leave
procmail feeling that it didn't find "mac" despite there being a qualifying
one in the middle.  That may cause a problem.)
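
For reference, here is how those two scored conditions might sit in a whole
recipe.  It is only a sketch: it again assumes the xyz mailbox from the
question, adds the local lockfile, and inherits the "-mac mac mac-" caveat.
Procmail treats the conditions as matched only when the final score is
positive.

```procmail
# Sketch only: +1 for each "mac" standing as a word in the body,
# -1 for each hyphenated neighbor; deliver to xyz if the total is above zero.
:0 B:
* 1^1 .*\<mac\>
* -1^1 -mac\>|-mac-|\<mac-
xyz
```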
