procmail

Re: whitelist matching

2002-08-12 01:13:26
When Dave Kirkby asked,

| > I'm not able to follow the syntax of this one:
|
| > :0:
| > * ^From:.*\/\<[a-z0-9=+_-]+(_at_)blow(_dot_)com

Lawrence Mitchell responded,

| The \/ is a procmailism which will store everything _after_ it
| matched by the regexp in a variable, $MATCH.  \< matches a word-boundary.

Yes to the first sentence, no to the second.  In egrep \< matches a word
boundary (perl spells that \b); but in procmail it matches a character that
would not be in a word, that is to say, a punctuation mark or a newline.  It
matches an actual character, not a zero-width position between characters as
in perl or egrep.  Likewise, procmail's \> has to match an actual character.
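
To make the difference concrete, here is a rough emulation in Python rather
than procmail itself.  It assumes that "(_at_)" and "(_dot_)" in the archived
text stand for "@" and ".", and it approximates procmail's \< with a
character class that consumes one non-word character (NONWORD is a name made
up for this sketch).  Python's re engine is not procmail's, so treat it as an
illustration only.

```python
import re

header = "From: me <foobar@blow.com>"

# Zero-width word boundary, as with perl's \b or egrep's \<:
# nothing is consumed in front of the address.
zero_width = re.search(r"\b[a-z0-9=+_-]+@blow\.com", header, re.I)

# Emulation of procmail's \<: it must consume one actual character
# that could not be part of a word (newline included).
NONWORD = r"[^a-zA-Z0-9_]"
consuming = re.search(NONWORD + r"[a-z0-9=+_-]+@blow\.com", header, re.I)

print(zero_width.group(0))   # foobar@blow.com
print(consuming.group(0))    # <foobar@blow.com
```

The second result carries the less-than sign because the non-word character
is part of the match, not merely a position next to it.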

| For example, if the From: header consisted of
| "me <foobar(_at_)blow(_dot_)com>", then $MATCH would have the value of
| "foobar(_at_)blow(_dot_)com".

No, it would have the value "<foobar(_at_)blow(_dot_)com" because the \< is to
the right of \/ and would be matched to the less-than sign.  In that
particular example, starting the extraction to the right of the non-word
character, like this,

 * ^From:(.*\<)?\/[a-z0-9=+_-]+(_at_)blow\(_dot_)com

(an unescaped period would still match here, since it matches any character,
but it would also accept other characters in that position) would probably be
a better idea.  In case the address sits flush against the colon without even
a space, anything at all between From: and the address is optional; but if
something is there, it has to end in a non-word character.
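
The two conditions can be compared with a small Python emulation, again
assuming "(_at_)" and "(_dot_)" stand for "@" and ".".  The names NONWORD,
ADDRESS, ORIGINAL, REVISED and emulated_match are invented for this sketch:
a capture group stands in for everything to the right of \/, and a
one-character class stands in for \<.  Where a header offers several
candidate addresses, Python's backtracking may choose differently from
procmail, so this only illustrates the two headers discussed here.

```python
import re

NONWORD = r"[^a-zA-Z0-9_]"            # stand-in for procmail's \<
ADDRESS = r"[a-z0-9=+_-]+@blow\.com"

# Whatever sits to the right of procmail's \/ becomes the capture group.
# ORIGINAL: \< is inside the extracted part.
ORIGINAL = re.compile(r"^From:.*(" + NONWORD + ADDRESS + ")", re.I)
# REVISED: \< is moved into an optional prefix, left of the extraction.
REVISED = re.compile(r"^From:(?:.*" + NONWORD + ")?(" + ADDRESS + ")", re.I)

def emulated_match(pattern, header):
    """Return what $MATCH would roughly hold, or None if nothing matches."""
    m = pattern.search(header)
    return m.group(1) if m else None

for header in ("From: me <foobar@blow.com>", "From:foobar@blow.com"):
    print(repr(emulated_match(ORIGINAL, header)),
          repr(emulated_match(REVISED, header)))
# '<foobar@blow.com' 'foobar@blow.com'
# None 'foobar@blow.com'
```

The second header shows why the prefix is optional: with the address flush
against the colon there is no character left for \< to consume, so the
original condition fails outright while the revised one still extracts the
address.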

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail