procmail

Re: bug in ^TO_ macro: character '+' *is* allowed in emails

2004-11-26 05:35:26
Why don't you show me some analysis

the part:
:(.*[^-a-zA-Z0-9_.])?)

means: a colon, optionally followed by any number of any characters and
then one character not in the set [-a-zA-Z0-9_.]
(the trailing ? applies to that whole group, not just to the last character)

the set is apparently meant to be the legal set of characters that can be
used in the left part of an email address.

the idea is that:

^TO_hugs(_at_)foo\(_dot_)com

would match hugs(_at_)foo(_dot_)com but would not match
bighugs(_at_)foo(_dot_)com or big-hug(_at_)foo(_dot_)com
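
As a rough check of the behaviour described above, the expansion of ^TO_ can be transcribed into Python's re module. This is only a sketch under stated assumptions: Python's engine is not procmail's, the IGNORECASE and MULTILINE flags are used to approximate procmail's defaults, addresses are written with a literal @ and ., and the names TO_, matches, and the big-hugs and big+hugs addresses are made up for illustration.

```python
import re

# Expansion of ^TO_ as quoted in the procmailrc documentation.
TO_ = (r"(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To)"
       r":(.*[^-a-zA-Z0-9_.])?)")

def matches(header, address_regex):
    """Return True if TO_ followed by address_regex matches the header line."""
    pattern = re.compile(TO_ + address_regex, re.IGNORECASE | re.MULTILINE)
    return pattern.search(header) is not None

addr = r"hugs@foo\.com"
print(matches("To: hugs@foo.com", addr))      # True: the space is a separator
print(matches("To: <hugs@foo.com>", addr))    # True: '<' is a separator
print(matches("To: bighugs@foo.com", addr))   # False: 'g' is in the class
print(matches("To: big-hugs@foo.com", addr))  # False: '-' is in the class
print(matches("To: big+hugs@foo.com", addr))  # True: '+' is not in the class
```

The last line is the case the subject of this thread is about: since + is outside the class, it is treated as a separator rather than as part of the address.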

naturally, it would also match hugs(_at_)foo(_dot_)com(_dot_)ca , so for good use
another regexp should be appended after the address, like:

^TO_hugs(_at_)foo\(_dot_)com[^-a-zA-Z0-9(_dot_)]?

(since only letters, digits, hyphens and . are allowed in a domain name)
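
One caveat about this suffix, sketched in Python's re as a stand-in for procmail's engine: because the trailing class is followed by ?, it may match zero characters, so on its own it does not appear to reject the .com.ca case. A possible alternative, offered as an assumption rather than as the poster's intent, is to require either a non-domain character or the end of the line. The names optional_suffix and strict_suffix are made up for the illustration, and the ^TO_ prefix is left out to keep the focus on the suffix.

```python
import re

# Suffix as posted: the class is optional, so it can also match nothing.
optional_suffix = re.compile(r"hugs@foo\.com[^-a-zA-Z0-9.]?")
# Hypothetical stricter variant: a non-domain character or end of line must follow.
strict_suffix = re.compile(r"hugs@foo\.com([^-a-zA-Z0-9.]|$)", re.MULTILINE)

for header in ("To: hugs@foo.com", "To: <hugs@foo.com>", "To: hugs@foo.com.ca"):
    print(header,
          bool(optional_suffix.search(header)),
          bool(strict_suffix.search(header)))
# To: hugs@foo.com True True
# To: <hugs@foo.com> True True
# To: hugs@foo.com.ca True False
```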



-t


----- Original Message ----- 
From: "Dallman Ross" <dman(_at_)nomotek(_dot_)com>
To: <procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE>
Sent: Friday, November 26, 2004 4:12 AM
Subject: Re: bug in ^TO_ macro: character '+' *is* allowed in emails


On Fri, Nov 26, 2004 at 03:08:27AM -0800, Tristan Savatier wrote:

Your supposition above is simply not true.  The macro makes no such
assumption.  Actually, I'm not exactly sure *what* it does do (see
below); but what it *does not* do is limit certain characters in
your match.

the macro ^TO_ assumes that the first part of the email (before the
@) contains only [-a-zA-Z0-9_.], and that any other character is a
separator (right before the address).

it is quite obvious if you look at the reg exp, and you can see
exactly what this macro does.

I don't see how you can say anything of the kind is "quite
obvious if you look."  I did look.  I stared at it for about ten
minutes, in fact.  It's not obvious to me.  I may not be the Regex
King of the Third Millennium, and I may not be correct, but
right or wrong, there's nothing "quite obvious" about it.

In fact, I looked at the macro, decided what I thought it would match
from what my eyeballs told me, tested my hypotheses, and confirmed
them.  So whatever the macro is supposed to do, I didn't feel
surprised by my results on studying the regex.

Why don't you show me some analysis instead of just making
pronouncements about what is so "obvious" to you, please, as I
attempted to do to counter your contention?


Let's deconstruct that a bit.  I believe it says it's looking
for a header line that can start (anchored left) with any of
those header-y words in it; followed by a colon (marking the
end of the header-name field); followed by the grouping of: ".*"
(anything at all!  And here is where your matching addresses
come in!), that being followed by a class with a caret in front of it,
which means "any character NOT in this class."

So the macro attempts to match your idea of an address, and it
accepts *anything at all* there, ending the search area only with
the *next* zeroth or first occurrence of a, shall we say, boundary
character.

What is faulty there?  I'm not meaning to imply I must be right.
I'm asking you to show me where I'm wrong -- with more than
just pronouncements, please.

Look, we can put your plus sign in there to see if it makes any
difference.  My impression is that it will make zero difference.
Let's try it.  Nearly identical test as before, using my funky made-up
header line:

 X-Envelope-To: Pooh Bear ##***you___+++!your!address!com***###(_at_)somelist(_dot_)somedom [XYZ-list]


Here's the recipe:

INCLUDERC = ./MYTO_.rc

 :0
 * $ $MYTO_\/##(_at_)[^o]*
 { FOO = $MATCH }


Here's what's in the INCLUDERC:

----------------
#       If the regular expression contains `^TO_' it will be substituted by
#       `(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope
#       |Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.])?)', which should catch
#       all destination specifications containing a specific address.

# Let's redefine the macro:

    MYTO_ = '(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|\
             Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.+])?)'

   # Just like the original, but with a plus-sign added
----------------


And here's the result (identical to before!):

procmail: Matched "##(_at_)s"
procmail: Match on "(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.+])?)\/##(_at_)[^o]*"
procmail: Assigning "FOO=##(_at_)s"
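
A possible reason the two runs look identical, sketched in Python's re as an approximation of procmail: in the test header the character just before ##@ is a #, which lies outside both character classes, so adding + to the class cannot change the outcome. A difference would only be expected when a + sits immediately before the text being matched. In this sketch the \/ token is emulated with a capture group, the header is written on one line with a literal @, and the names PREFIX, ORIGINAL, WITH_PLUS, extract, and the big+hugs header are hypothetical.

```python
import re

PREFIX = r"(^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):"
ORIGINAL = PREFIX + r"(.*[^-a-zA-Z0-9_.])?)"    # class from the stock macro
WITH_PLUS = PREFIX + r"(.*[^-a-zA-Z0-9_.+])?)"  # class with '+' added

def extract(macro, tail, header):
    """Emulate procmail's match token: return what tail matched, or None."""
    m = re.search(macro + "(?P<match>" + tail + ")", header,
                  re.IGNORECASE | re.MULTILINE)
    return m.group("match") if m else None

test_header = ("X-Envelope-To: Pooh Bear "
               "##***you___+++!your!address!com***###@somelist.somedom [XYZ-list]")

# Same result with either class: the separator before "##@" is '#'.
for macro in (ORIGINAL, WITH_PLUS):
    print(extract(macro, r"##@[^o]*", test_header))   # ##@s both times

# Different result when '+' directly precedes the text being matched.
for macro in (ORIGINAL, WITH_PLUS):
    print(extract(macro, r"hugs@foo\.com", "To: big+hugs@foo.com"))
# hugs@foo.com
# None
```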


-- 
dman

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail


