procmail

Re: Invalid message-ids

1997-08-11 12:19:00
Sean Straw described Timothy Luoma's regexp, which was this,

L> * ! ^Message-ID:\ \<\>

thus,

S> ! IF NOT MATCH
S> ^ STARTING AT BEGINNING OF LINE

Not exactly, but that description will do.

| "Message-ID:"
| "\ " - one space      (Normal regexp for this is "\s", but egrep doesn't
| recognize it -- "\ " IS valid though, as would be a non-escaped space.
| [:space:] should work too.)
| "\<" - Match the empty string BEFORE a word (see man procmailrc)
| "\>" - Match the empty string AFTER a word (see man procmailrc)
| 
| (that is \< and \> are metas similar to ^ and $)

Except that we're dealing with procmailrc code here, not perl, so:

(1) "[:space:]" would not work.  It would match a single character that
had to be a colon, an s, a p, an a, a c, or an e.
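A small sketch of point (1), using Python's re module as a stand-in because it happens to read a bare "[:space:]" the same way: as an ordinary bracket expression over the characters : s p a c e.  This does not run procmail's own matcher; the variable names are made up for the illustration.

```python
import re

# A bare [:space:] is just a bracket expression listing the characters
# ':', 's', 'p', 'a', 'c', 'e'.  It is not a whitespace class.
bare_class = re.compile(r"[:space:]")

# Which of the listed characters does it accept as a complete match?
accepted = [ch for ch in ":space" if bare_class.fullmatch(ch)]

# Does it accept an actual space?
matches_space = bare_class.fullmatch(" ") is not None

print(accepted)       # [':', 's', 'p', 'a', 'c', 'e']
print(matches_space)  # False
```

So a condition written with "[:space:]" after "Message-ID:" would be looking for something like "Message-ID:s", not for a blank.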

(2) \< and \> do not match empty strings.  They match one character
(possibly a newline) that shouldn't be inside a word.  Nor are they
particularly confined to beginning and end respectively: \>wordhere\<
would be nonsense in egrep or perl, but it is legitimate procmailrc code
equivalent to \<wordhere\>.
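To see why the difference in point (2) matters, here is a rough sketch in Python.  The zero-width reading uses \b, which is how Python and perl spell a word boundary.  The character-consuming reading is only modelled, by assuming each \< or \> stands for exactly one character outside A-Z, a-z, 0-9 and underscore; that character set is an assumption for the illustration, not taken from procmail's source, and newlines are ignored.  Under that model, a pattern such as foo\> \<bar needs three separator characters between the words, not one.

```python
import re

# Zero-width reading: the boundary consumes nothing, so the single literal
# space in the pattern is still available to match the space in the text.
zero_width = re.compile(r"foo\b \bbar")

# Character-consuming reading (an approximation, see the assumptions above):
# each boundary is replaced by one non-word character.
NONWORD = r"[^A-Za-z0-9_]"
consuming = re.compile("foo" + NONWORD + " " + NONWORD + "bar")

one_space = "foo bar"
three_spaces = "foo   bar"

results = {
    "zero_width/one_space": zero_width.search(one_space) is not None,
    "zero_width/three_spaces": zero_width.search(three_spaces) is not None,
    "consuming/one_space": consuming.search(one_space) is not None,
    "consuming/three_spaces": consuming.search(three_spaces) is not None,
}

for name, matched in results.items():
    print(name, matched)
# zero_width/one_space True
# zero_width/three_spaces False
# consuming/one_space False
# consuming/three_spaces True
```

The two readings disagree on both inputs, which is the kind of surprise that comes from treating a one-character match as if it were an empty-string anchor.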

In egrep, \< and \> do anchor to empty strings as Sean described.  (Perl
has no \< or \>; its zero-width word anchor is \b.)

Technically, in procmailrc code ^ and $ do not match empty strings either,
but almost always we can get away with thinking of them that way.  (Again,
in grep, egrep, sed, and perl they *do* match empty strings.)  However,
it can cause you major problems to think of procmail's \< and \> that way.
Eli the Unshaven, for one, has stated that procmail's definitions of \<
and \> have caused trouble for him, trouble that is among his reasons for
working on a way to combine procmailrc code and perl.
