handling spacing in patterns

1997-08-19 10:40:53
You recently posted to procmail-l the following recipe snippet. It has
two problems of which you should be aware.

"reply (with)? (the)? (word)? ['"]?remove['"]? in the (subject|body)"

1) Your optional words do not carry their adjacent spaces with them, so
every space in the pattern is mandatory even when the word beside it
is absent. This is a serious fault which significantly lessens the
effectiveness of the recipe. As an example, your pattern will match
        reply with   'remove' in the subject
but not
        reply with 'remove' in the subject
which is a much more likely occurrence.

2) Your recipe does not detect phrases which are split over more than
one line. The longer the phrase for which you are testing, the more
likely it is that it will be split (except for mail composed with
editors which only place newlines at paragraph breaks).
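
Both problems can be reproduced outside procmail. What follows is only a
minimal sketch in Python, on the assumption that Python's re module treats
this particular pattern (literals, optional groups and a character class)
the same way procmail's matcher does. Case is ignored because procmail
ignores case by default. The name matches_original is made up for the
illustration.

```python
import re

# The pattern as posted: every space between the optional words is a
# literal, mandatory space.
ORIGINAL = re.compile(
    r"""reply (with)? (the)? (word)? ['"]?remove['"]? in the (subject|body)""",
    re.IGNORECASE,
)


def matches_original(text: str) -> bool:
    """Return True if the posted pattern is found anywhere in text."""
    return ORIGINAL.search(text) is not None


if __name__ == "__main__":
    samples = [
        "reply with   'remove' in the subject",          # three spaces after "with"
        "reply with 'remove' in the subject",            # ordinary single spacing
        "reply with the word 'remove' in\nthe subject",  # phrase split over two lines
    ]
    for sample in samples:
        print(repr(sample), matches_original(sample))
```

Only the first sample, with its three consecutive spaces, should be
reported as a match.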

I use \>* instead of a space or space+tab character class whenever
there is free-form text to be examined, including text within subject
headers. While procmail will combine multiple lines into one line in
headers, it does not compress white space. This form accommodates
erratic spacing in addition to line-spanning phrases.

The following version duplicates what I have assumed you are attempting
to do, but overcomes both of the above problems:

  
        reply\>*(with\>*)?(the\>*)?(word\>*)?['"]?remove['"]?\>*in\>*the\>*(subject|body)
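
The rewritten pattern can be tried in a rough way outside procmail. The
sketch below is in Python and rests on a loud assumption: Python has no
equivalent of procmail's \> token, so \W* (any run of non-word characters,
including spaces, tabs and newlines) is substituted for \>* as an
approximation. The two are not identical, and a real test should be run
through procmail itself. The name matches_rewritten is hypothetical.

```python
import re

# ASSUMPTION: \W* stands in for procmail's \>* here. It is an approximation
# chosen for illustration, not procmail's exact semantics.
REWRITTEN = re.compile(
    r"""reply\W*(with\W*)?(the\W*)?(word\W*)?['"]?remove['"]?\W*in\W*the\W*(subject|body)""",
    re.IGNORECASE,
)


def matches_rewritten(text: str) -> bool:
    """Return True if the approximated pattern is found anywhere in text."""
    return REWRITTEN.search(text) is not None


if __name__ == "__main__":
    samples = [
        "reply with 'remove' in the subject",            # ordinary single spacing
        "reply with the word 'remove' in\nthe subject",  # phrase split over two lines
        'Reply  with\t"remove"   in the body',           # erratic spacing
    ]
    for sample in samples:
        print(repr(sample), matches_rewritten(sample))
```

Under that approximation all three samples match, including the
single-spaced and line-split forms that the posted pattern misses.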

My own recipe includes
        remove['">]?\>*(as|in)\>*the\>*(body|subject)
and has been tripped many times, including cases in which the beginning
and end of the phrase were on separate lines, and twice where the spacing
was erratic.

-- 
Rik Kabel          Old enough to be an adult              
rik(_at_)netcom(_dot_)com
