Vikas Agnihotri <vikas(_at_)insight(_dot_)att(_dot_)com> replied to my reply to
him:
To: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE
...
Cc: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE
Hmmm. Standard SmartList setup will drop the second copy.
1. How does it handle the real-name part of the TO address? This can
contain pretty much any garbage according to RFC822 as long as the address
is enclosed between <...> ?
From: "jester(_at_)fun(_dot_)house" <fool(_at_)aol(_dot_)com>
Will confuse a test for "^TO_jester@".
What about
From: someone(_at_)somewhere(_dot_)com <another(_at_)one(_dot_)com>
Will confuse the regexp.
Per RFCs, another(_at_)one(_dot_)com is the real email address here, right?
So TO/TO_ should catch it and *not* someone(_at_)somewhere(_dot_)com, right?
If you only test for "another(_at_)one(_dot_)com", it is not a problem. If you
are checking for "someone(_at_)somewhere(_dot_)com" you are out of luck.
Well, shouldn't things like (, ), < and > be included in the regexp above,
since 99% of the time they indicate the start of an ACTUAL email address?
I don't like the ^TO and ^TO_ macros for most things and typically use
stuff like this:
^(Resent-)?(To|CC):.*[< ]{address}([ >]|$)
It still can be confused, but the things that will cause problems
are fairly rare in practice. You might prefer something like this:
^(Resent-)?(To|CC):([^(]+([(].*[)])?)*[, <]{address}([, >]|$)
Which can correctly deal with
To: (hatter(_at_)tea(_dot_)party) {address}
To: (fake {address})
    bill(_dot_)the(_dot_)lizard(_at_)the(_dot_)jury(_dot_)box
To: Alice <alice(_at_)the(_dot_)croquet(_dot_)game>, "W. Rabbit (late)"
    <hare(_at_)small(_dot_)hole>, Gentle Reader <{address}>
To: jabberwocky(_at_)vorpal(_dot_)swords(_dot_)r(_dot_)us,
    duchess(_at_)the(_dot_)croquet(_dot_)game,
    chesire(_at_)no(_dot_)where, {address},
    dinah(_at_)meow(_dot_)org
It will still fail for
To: (fake <{address}>) mockturtle(_at_)tortoise(_dot_)edu
If someone is malicious enough to send you such mail.
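For anyone who wants to poke at the simpler of the two patterns above, here is a rough Python sketch. It is only an approximation: Python's re module is not procmail's matcher, the address me@x.org and the helper name simple_to_match are made up for illustration, and any folded header is assumed to have been joined onto one line already.

```python
import re

def simple_to_match(header, address):
    """Approximate ^(Resent-)?(To|CC):.*[< ]{address}([ >]|$) with Python's re.

    Hypothetical helper; re.IGNORECASE imitates procmail's default
    case-insensitive matching.
    """
    pattern = r'^(Resent-)?(To|CC):.*[< ]' + re.escape(address) + r'([ >]|$)'
    return re.search(pattern, header, re.IGNORECASE | re.MULTILINE) is not None

ME = "me@x.org"  # made-up address standing in for {address}

# Plain and angle-bracketed forms are found.
print(simple_to_match("To: me@x.org", ME))                   # True
print(simple_to_match("Cc: Gentle Reader <me@x.org>", ME))   # True

# A comma directly after the address is not in [ >], so this is missed.
print(simple_to_match("To: chesire@no.where, me@x.org, dinah@meow.org", ME))  # False

# An angle-bracketed address inside a (comment) fools the pattern.
print(simple_to_match("To: (fake <me@x.org>) mockturtle@tortoise.edu", ME))   # True
```

The third case is the reason the longer pattern adds a comma to both bracket expressions around {address}.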
3. What is the essential difference (again, in plain English, please!)
between TO and TO_ ?
The definition of the word boundary[...]
What is the assumption being made here? Why will TO_address catch
everything to 'address' while TOword catch everything to 'word'?
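^TOgryphon will match both of the following, while ^TO_gryphon will only
match the first. For ^TO_ the "." in front of "gryphon" in the second line
counts as part of the address, so "gryphon" does not start there; for ^TO
anything that is not a letter ends the previous word.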
To: gryphon(_at_)lobster(_dot_)quadrille
To: the(_dot_)gryphon(_at_)sleeping(_dot_)by(_dot_)a(_dot_)cliff
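The two headers above can be checked against both macros with a small Python sketch. The macro expansions below are written as recalled from the procmailrc(5) man page, so treat them as an assumption and compare them with your own version. The helper name header_matches is made up, and Python's re only approximates procmail's matcher.

```python
import re

# Macro expansions as recalled from procmailrc(5); an assumption to be
# checked against the man page of your own procmail version.
HEAD = r'^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):'
TO = '(' + HEAD + r'(.*[^a-zA-Z])?)'            # ^TO: any non-letter ends a word
TO_ = '(' + HEAD + r'(.*[^-a-zA-Z0-9_.])?)'     # ^TO_: - 0-9 _ . are address characters

def header_matches(macro, rest, header):
    """Return True if the macro followed by rest matches the header line."""
    return re.search(macro + rest, header, re.IGNORECASE) is not None

first = "To: gryphon@lobster.quadrille"
second = "To: the.gryphon@sleeping.by.a.cliff"

for name, macro in (("^TO", TO), ("^TO_", TO_)):
    for header in (first, second):
        print(name + "gryphon", "|", header, "|",
              header_matches(macro, "gryphon", header))
```

With these expansions, ^TOgryphon matches both headers and ^TO_gryphon matches only the one where "gryphon" is preceded by a space.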
4. What in the world is (.*[^-a-zA-Z0-9_.])? ?
Still can't get something. Won't the <space> after the To: make both ^TO and
^TO_ end there without going any further, since the [^.....] above says end
the regexp at anything that is not alphanumeric, _, . or -?
So the .* would match 0 atoms and [^-a-zA-Z0-9_.] would match the space and
that's it! What am I missing?
If you just have "^TO" or "^TO_", then in fact the space won't even be
matched, because of procmail's non-greediness. It is the *context* of what
you put after the macro that will force the .* to slurp up a bunch of
text.
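A small Python sketch can make the "context" point visible. Two loud assumptions: the pattern is a cut-down stand-in for ^TO_ that only knows "To:", and Python's lazy quantifiers (*? and ??) are used to imitate the non-greediness described above; this is not how procmail is implemented. The helper name matched_text is made up.

```python
import re

# Simplified stand-in for ^TO_: only "To:" is recognised, and the lazy
# quantifiers are an assumption used to imitate minimal matching.
LAZY_TO_ = r'^To:(.*?[^-a-zA-Z0-9_.])??'

header = "To: gryphon@lobster.quadrille"

def matched_text(pattern, text):
    """Return the text the pattern consumed, or None when it does not match."""
    m = re.search(pattern, text, re.IGNORECASE)
    return m.group(0) if m else None

# On its own the optional group matches nothing, not even the space.
print(repr(matched_text(LAZY_TO_, header)))               # 'To:'

# What follows the macro forces the group to take more text.
print(repr(matched_text(LAZY_TO_ + "gryphon", header)))   # 'To: gryphon'
print(repr(matched_text(LAZY_TO_ + "lobster", header)))   # 'To: gryphon@lobster'
```

In the last call the .* has to swallow " gryphon" and the bracket expression takes the "@" before "lobster" can match.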
Elijah
------
I /dev/null dupes, no need to CC list posts. It is not my responsibility to
prove to you my mail is not spam, if mail to you bounces it will not be resent.