procmail

Re: Et TO, procmail ?

1997-09-16 10:54:45
On Tue, Sep 16, 1997 at 10:38:00AM -0500, Eli the Bearded wrote:
vikas(_at_)insight(_dot_)att(_dot_)com wrote:
      If the regular expression contains `^TO_' it will be substi-
          tuted by `(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope
          |Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.])?)', which
          should catch all destination specifications containing a
          specific address.

Let's rewrite that in perl /x format.
[excellent regexp breakdown snipped]

Is that clearer?

Yes. Some followup though. See below.


          If the regular expression contains `^TO' it will be  substi-
          tuted by `(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope
          |Apparently(-Resent)?)-To):(.*[^a-zA-Z])?)', which should
          catch all destination specifications containing a specific
          word.

This is the same except that block (E) ends with anything other than a
letter.
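
To make the difference concrete, here is a small sketch that pastes the two expansions quoted above into Python's `re` module. This is only an approximation: procmail uses its own egrep-style matcher, and the `matches` helper and the example.com addresses are made up for illustration.

```python
import re

# Expansions copied from the man page text quoted above (line wraps joined).
TO_ = (r"(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope"
       r"|Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.])?)")
TO = (r"(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope"
      r"|Apparently(-Resent)?)-To):(.*[^a-zA-Z])?)")


def matches(macro, tail, header):
    """Hypothetical helper: True if macro followed by tail matches header."""
    # procmail matches case-insensitively by default, hence IGNORECASE.
    flags = re.IGNORECASE | re.MULTILINE
    return re.search(macro + tail, header, flags) is not None


header = "To: mary.joe@example.com"

# "." is not a letter, so ^TO treats "joe" as the start of a word.
print(matches(TO, "joe@", header))    # True

# "." may appear inside an address, so ^TO_ does not start a match there.
print(matches(TO_, "joe@", header))   # False
```

With a plain `To: joe@example.com` both forms match, because the space before `joe` is excluded by neither character class.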

1. How does it handle the real-name part of the TO address? This can
contain pretty much any garbage according to RFC822 as long as the address
is enclosed between <...> ?

Block (E) is there to slurp up that part. The <encapsulation> is not
needed, and a case such as:

      From: "jester(_at_)fun(_dot_)house" <fool(_at_)aol(_dot_)com>

Will confuse a test for "^TO_jester@". Yes, I have seen people do
that stuff, apparently not even maliciously.
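
A rough Python sketch of that confusion, assuming the same sort of text turns up in a To: header (the macro only looks at destination headers). Python's `re` stands in for procmail's matcher here, and the header line is invented for the demonstration.

```python
import re

# ^TO_ expansion as quoted from the man page earlier in this message.
TO_ = (r"(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope"
       r"|Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.])?)")

# Assumed header: an address-like string used as the display name.
header = 'To: "jester@fun.house" <fool@aol.com>'

# The macro does no RFC 822 parsing, so the quoted display name matches too.
in_name = re.search(TO_ + r"jester@", header, re.IGNORECASE) is not None
in_addr = re.search(TO_ + r"fool@", header, re.IGNORECASE) is not None

print(in_name)  # True
print(in_addr)  # True
```

Both tests succeed: the `"` before `jester` and the `<` before `fool` are equally acceptable to block (E) as the character in front of the address.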

What about 
From: someone(_at_)somewhere(_dot_)com <another(_at_)one(_dot_)com>

Per the RFCs, another(_at_)one(_dot_)com is the real email address here, right?
So TO/TO_ should catch it and *not* someone(_at_)somewhere(_dot_)com, right?

Well, shouldn't characters like (, ), < and > be included in the regexp above,
since 99% of the time they indicate the start of an ACTUAL email address?

3. What is the essential difference (again, in plain English, please!)
between TO and TO_ ?

The definition of the word boundary in block (E).

What is the assumption being made here?  Why will TO_address catch
everything to 'address' while TOword catches everything to 'word'?

4. What in the world is (.*[^-a-zA-Z0-9_.])?   ?

There is still something I can't get. Won't the <space> after the To: make
both ^TO and ^TO_ end there without going any further, since the [^.....]
above says to end the match at the first character that is not alphanumeric,
_, . or -? So the .* would match 0 atoms, [^-a-zA-Z0-9_.] would match the
space, and that's it! What am I missing?
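
One way to poke at this question is to try it in Python. This is a sketch, assuming Python's `re` backtracks over `.*` in the same way procmail's matcher does for this pattern. The `skip` group name is added only so the text consumed by block (E) can be printed; it is not part of procmail's expansion, and the header lines are made up.

```python
import re

# ^TO_ expansion with block (E) given a name so it can be inspected.
TO_NAMED = (r"^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope"
            r"|Apparently(-Resent)?)-To):(?P<skip>.*[^-a-zA-Z0-9_.])?")


def skipped(header, tail):
    """Hypothetical helper: text block (E) consumed before tail matched."""
    m = re.search(TO_NAMED + tail, header, re.IGNORECASE)
    return m.group("skip") if m else "no match"


# Only a space before the address: .* matches nothing, the class eats " ".
print(repr(skipped("To: vikas@insight.att.com", r"vikas@insight")))
# ' '

# A real name first: .* stretches as far as needed for the tail to match.
print(repr(skipped("To: Vikas A <vikas@insight.att.com>", r"vikas@insight")))
# ' Vikas A <'

# Nothing at all after the colon: the whole optional block is skipped.
print(repr(skipped("To:vikas@insight.att.com", r"vikas@insight")))
# None
```

In this sketch the space does not end anything. The `.*` is free to take as many characters as it needs, so block (E) grows until the rest of the condition can match right after it.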

Thanks, Eli.

Vikas
