On Tue, Sep 16, 1997 at 10:38:00AM -0500, Eli the Bearded wrote:
vikas(_at_)insight(_dot_)att(_dot_)com wrote:
If the regular expression contains `^TO_' it will be substituted by
`(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.])?)',
which should catch all destination specifications containing a
specific address.
Let's rewrite that in perl /x format.
[excellent regexp breakdown snipped]
Is that clearer?
Yes. Some followup though. See below.
If the regular expression contains `^TO' it will be substituted by
`(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):(.*[^a-zA-Z])?)',
which should catch all destination specifications containing a
specific word.
This is the same except that block (E) ends with anything other than a
letter.
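
As a rough illustration of the difference just described, the two
expansions quoted above can be tried in Python. This is only a sketch:
Python's `re` module stands in for procmail's own matcher,
case-insensitive matching is switched on to mimic procmail's default,
and the helper name `matches` and the sample header are made up for the
example.

```python
import re

# Header-name part shared by both expansions, as quoted from the man page.
HEADER = r"^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):"

# Block (E) is the only part that differs between ^TO_ and ^TO.
TO_UNDERSCORE = "(" + HEADER + r"(.*[^-a-zA-Z0-9_.])?)"
TO_PLAIN = "(" + HEADER + r"(.*[^a-zA-Z])?)"


def matches(expansion, target, header):
    """Return True if the expansion followed by the literal target matches."""
    pattern = re.compile(expansion + re.escape(target),
                         re.IGNORECASE | re.MULTILINE)
    return pattern.search(header) is not None


header = "To: not.vikas@insight.att.com"  # hypothetical header
print(matches(TO_UNDERSCORE, "vikas@", header))  # False
print(matches(TO_PLAIN, "vikas@", header))       # True
```

With this header, ^TO_vikas@ does not match because the `.` in front of
`vikas` still counts as part of an address, while ^TOvikas@ does match
because `.` is not a letter.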
1. How does it handle the real-name part of the TO address? This can
contain pretty much any garbage according to RFC822 as long as the address
is enclosed between <...> ?
Block (E) is there to slurp up that part. The <encapsulation> is not
needed, and a case such as:
From: "jester(_at_)fun(_dot_)house" <fool(_at_)aol(_dot_)com>
Will confuse a test for "^TO_jester@". Yes, I have seen people do
that stuff, apparently not even maliciously.
What about
From: someone(_at_)somewhere(_dot_)com <another(_at_)one(_dot_)com>
Per RFCs, another(_at_)one(_dot_)com is the real email address here, right? So
TO/TO_ should catch it and *not* someone(_at_)somewhere(_dot_)com, right?
Well, shouldn't things like (, ), < and > be included in the regexp above,
since 99% of the time they indicate the start of an ACTUAL email address?
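
One way to probe this is to run the ^TO_ expansion against such a
header. The sketch below assumes Python's `re` module as a stand-in for
procmail's matcher and uses a To: header instead of From:, since the
expansion only looks at destination headers; `to_matches` is a
hypothetical helper. Because block (E) ends in a negated character
class, any character outside that class (<, (, a quote, a plain space)
can sit directly in front of the address.

```python
import re

# The ^TO_ expansion as quoted from the man page, joined onto one line.
TO_ = (r"(^((Original-)?(Resent-)?(To|Cc|Bcc)|"
       r"(X-Envelope|Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.])?)")


def to_matches(target, header):
    """Return True if ^TO_ followed by the literal target matches the header."""
    pattern = re.compile(TO_ + re.escape(target),
                         re.IGNORECASE | re.MULTILINE)
    return pattern.search(header) is not None


header = "To: someone@somewhere.com <another@one.com>"  # hypothetical header
print(to_matches("another@", header))  # True: preceded by '<'
print(to_matches("someone@", header))  # True: preceded by a space
print(to_matches("one@", header))      # False: preceded by the letter 'e'
```

Under these assumptions the macro does not tell the bracketed address
apart from the text in front of it; it only requires that the match
does not start in the middle of a longer address.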
3. What is the essential difference (again, in plain English, please!)
between TO and TO_ ?
The definition of the word boundary in block (E).
What is the assumption being made here? Why will TO_address catch
everything to 'address' while TOword catches everything to 'word'?
4. What in the world is (.*[^-a-zA-Z0-9_.])? ?
I still can't get something. Won't the <space> after the To: make both ^TO
and ^TO_ end there without going any further, since the [^.....] above says
to end the regexp at anything that is not alphanumeric, _, . or -?
So the .* would match 0 atoms and [^-a-zA-Z0-9_.] would match the space, and
that's it! What am I missing?
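
A small experiment makes the question concrete. This is a sketch that
assumes Python's `re` module behaves like procmail's matcher for this
pattern; the named group `slurp` and the helper `slurped` are additions
for the example, not part of the macro. It shows what block (E)
consumes when the text following the macro is taken into account.

```python
import re

# Header-name part of the ^TO_ expansion, as quoted from the man page.
HEADER = r"^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):"
# Block (E), wrapped in a named group so its match can be inspected.
SLURP = r"(?P<slurp>(.*[^-a-zA-Z0-9_.])?)"


def slurped(target, header):
    """Return the text block (E) consumed, or None if there is no match."""
    pattern = re.compile("(" + HEADER + SLURP + ")" + re.escape(target),
                         re.IGNORECASE | re.MULTILINE)
    m = pattern.search(header)
    return m.group("slurp") if m else None


# Hypothetical headers, all tested with the text "vikas@" after the macro.
print(repr(slurped("vikas@", "To: vikas@insight.att.com")))            # ' '
print(repr(slurped("vikas@", 'To: "Vikas" <vikas@insight.att.com>')))  # ' "Vikas" <'
print(repr(slurped("vikas@", "To:vikas@insight.att.com")))             # ''
print(repr(slurped("vikas@", "To: novikas@x.com")))                    # None
```

In the first case the .* does match zero characters and the class
matches the space, but only because the address follows immediately.
In the second case the .* stretches over the real-name part, since the
text after the macro still has to match somewhere on the line.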
Thanks, Eli.
Vikas