procmail

Re: bug in ^TO_ macro: character '+' *is* allowed in emails

2004-11-26 06:51:44
On Fri, Nov 26, 2004 at 04:27:12AM -0800, Tristan Savatier wrote:

Why don't you show me some analysis

the part:
:(.*[^-a-zA-Z0-9_.])?)

means: a column, followed by any number of any characters, followed
(possibly) by one character not in the set [-a-zA-Z0-9_.]

(ITYM "colon.")  Well, that's not entirely correct.  (After the colon,
I mean.)  The ".*" followed by the lone character not of the set
stated are taken as one unit, surrounded by parens.  It is that one
unit that shall appear zero times or one time (on account of the
question mark).

So the entire grouping can be either (a) nothing; or, assuming only one
char, then (b) a character not of the class stated in the brackets; or, if
more than one char, then (c) anything at all followed by one char not of the
class stated.

IOW, if there is only one char, it would have to be one from the "not me!"
class in brackets.
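
To make the three cases concrete, here is a rough sketch. It assumes that
grep -E is a fair stand-in for procmail's own matcher on this pattern, and
it cuts the macro down to the To: header alone. The names TO_PREFIX and
to_match are made up for the illustration; they are not part of procmail.

```shell
# Simplified stand-in for the ^TO_ prefix, limited to the To: header.
# Assumption: grep -E (POSIX ERE) agrees with procmail on this pattern.
TO_PREFIX='^To:(.*[^-a-zA-Z0-9_.])?'

# Hypothetical helper: $1 is a header line, $2 is the address regex.
# Prints "match" or "no match".
to_match() {
    if printf '%s\n' "$1" | LC_ALL=C grep -Eiq "${TO_PREFIX}$2"; then
        echo "match"
    else
        echo "no match"
    fi
}

to_match 'To:hugs@foo.com'             'hugs@foo\.com'  # (a) group is empty: match
to_match 'To: hugs@foo.com'            'hugs@foo\.com'  # (b) group is the lone space: match
to_match 'To: Big Hugs <hugs@foo.com>' 'hugs@foo\.com'  # (c) anything, then "<": match
to_match 'To:xhugs@foo.com'            'hugs@foo\.com'  # lone char "x" is in the class: no match
```

The last line is the "only one char" case: a single character between the
colon and the address has to come from outside the bracketed set.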


the set is apparently meant to be the legal set of characters that can be
used in the left part of an email address.

Perhaps, though I don't know that I'd use the words "the legal set."
Rather, I would likely term it as "a convenient, likely-to-be-found
set of chars that would be used in the left part of an email address."

Indeed, in my own private var set that I use to build my .procmailrc,
I find these -- note the comments I've made, and in particular the
second comment:

 2:27pm [~/.procmail/vars] 395[0]> grep -h ALPH * 
 ALPHA         = a-zA-Z0-9          # alphanumeric set
 ADDYCHAR      = $ALPHA.=_+-        # sensible charset for local address part
 HOSTCHAR      = $ALPHA-

So there's your plus sign, and also an equals sign, in my "sensible"
as opposed to "legal" address set.  I think that same word, "sensible,"
is more appropriate to what Stephen probably thought with the ^TO/^TO_
macros.
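
As a rough comparison of the two character sets, the sketch below puts the
class from the macro next to my ADDYCHAR set. It again assumes grep -E
behaves like procmail's matcher here and looks at the To: header only;
STOCKCHAR and class_match are names invented for the example.

```shell
ALPHA='a-zA-Z0-9'
ADDYCHAR="${ALPHA}.=_+-"     # the "sensible" set from my vars
STOCKCHAR='-a-zA-Z0-9_.'     # the set used in the macro's bracket expression

# Hypothetical helper: $1 is the body of the bracket expression,
# $2 is a header line. Prints "match" or "no match".
class_match() {
    if printf '%s\n' "$2" | LC_ALL=C grep -Eiq "^To:(.*[^${1}])?hugs@foo\.com"; then
        echo "match"
    else
        echo "no match"
    fi
}

class_match "$STOCKCHAR" 'To: big+hugs@foo.com'   # match: "+" is outside the stock set
class_match "$ADDYCHAR"  'To: big+hugs@foo.com'   # no match: "+" counts as an address char
class_match "$ADDYCHAR"  'To: hugs@foo.com'       # match: the space still delimits the address
```

With the wider set, a plus sign in front of "hugs" is treated as part of
the local part rather than as a delimiter.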

the idea is that:

^TO_hugs@foo\.com

would match hugs@foo.com but would not match bighugs@foo.com,
big-hug@foo.com or big-hug@foo.com

Okay, but unfortunately "ugs@foo.com", "gs@foo.com", and "s@foo.com"
also match, I believe, even if the defect you cite is fixed.

-- 
dman

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail@lists.RWTH-Aachen.DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail