procmail

Re: RFC-consistent regexp to match name@(subdomain.)*foo.bar

1997-02-19 18:49:45
At 03:50 PM 2/19/97 -0800, Rob Perelman wrote:
On Wed, 19 Feb 1997, Stan Ryckman wrote:

I think you're asking for something *extremely messy* in a regexp.
Consider you'll have to match:
        johndoe(_at_)foo(_dot_)bar
        John Doe<jdoe(_at_)foo(_dot_)bar>
        jdoe(comment)@(anothercomment)  foo (.bar( (fun yet?))) .(hi!) (com)bar
        "John (jdoe @ foo . bar )" <jdoe @ foo . bar > ( jdoe @ foo . bar)
        (a) luser (b) @ (c) subdom (d) . (e) foo (f) . (g) bar (h)
but not:
        "John (jdoe(_at_)foo(_dot_)bar )" <jdoe @ notfoo . bar > ( 
jdoe(_at_)foo(_dot_)bar)

Strange...here are my findings.  This is according to a 4 page regex 
written by Tom Christiansen, co-author of "Programming Perl".

I have great respect for Tom's work.  The fact that it's "a 4 page regex"
points out the difficulty of getting a procmail regex to do this correctly
(which was my actual point).

I'm going to assume from its name that all this regex does is validate
a *legal* address:

powergrid% rfc822.pl

johndoe(_at_)foo(_dot_)bar
** Above matches? Yes

We agree.

John Doe<jdoe(_at_)foo(_dot_)bar>
** Above matches? Yes

We agree again.

jdoe(comment)@(anothercomment)  foo (.bar( (fun yet?))) .(hi!) (com)bar
** Above matches? No

OK, instead of believing a 4-page perl script, tell me what makes this
not an email address equivalent to jdoe(_at_)foo(_dot_)bar ?

It's possible Tom has a bug -- perhaps he doesn't deal with the nested
comments?  Adjacent comments?  "?" or "!" character?  Those are legal.
I'm sure he'll take a bug report.  (I don't have the script, so I don't
want to send him one myself, particularly without narrowing it down.)

It's also possible I'm misreading RFC822 -- if so, tell me where without
referencing a perl script.
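
To make the nested-comment point concrete, here is a minimal sketch in
Python (not procmail, and not Tom's script) of why a plain regex struggles:
comments nest, so something has to count parenthesis depth.  The function
name strip_comments is made up for illustration, and collapsing all the
whitespace afterwards is a simplification that only suits this sample.

```python
def strip_comments(text: str) -> str:
    """Remove RFC 822 style comments, which may nest; keep quoted strings."""
    out = []
    depth = 0          # current comment nesting level
    in_quote = False   # inside a "quoted string" (tracked only outside comments)
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # quoted-pair: the backslash and the next character travel together
            if depth == 0:
                out.append(text[i:i + 2])
            i += 2
            continue
        if in_quote:
            out.append(ch)
            if ch == '"':
                in_quote = False
        elif depth > 0:
            if ch == "(":
                depth += 1      # a comment inside a comment
            elif ch == ")":
                depth -= 1
        elif ch == "(":
            depth = 1           # a top-level comment starts here
        elif ch == '"':
            in_quote = True
            out.append(ch)
        else:
            out.append(ch)
        i += 1
    return "".join(out)


sample = "jdoe(comment)@(anothercomment)  foo (.bar( (fun yet?))) .(hi!) (com)bar"
# Simplification: drop all remaining whitespace between tokens.
print("".join(strip_comments(sample).split()))  # prints jdoe@foo.bar
```

The depth counter is the part a finite regex cannot express in general,
which is why the nested case is a plausible place for a bug to hide.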

"John (jdoe @ foo . bar )" <jdoe @ foo . bar > ( jdoe @ foo . bar)
** Above matches? Yes
(a) luser (b) @ (c) subdom (d) . (e) foo (f) . (g) bar (h)
** Above matches? Yes

Agree.  Agree.

"John (jdoe(_at_)foo(_dot_)bar )" <jdoe @ notfoo . bar > ( 
jdoe(_at_)foo(_dot_)bar)
** Above matches? Yes

Ah, but matches *what*?  The original poster wanted to identify
email addresses at foo.bar and its subdomains, not arbitrary
email addresses.  This, while a legal email address (probably
what your perl script is checking), evaluates to jdoe(_at_)notfoo(_dot_)bar,
which is *NOT* desired as a match here.  However,
jdoe(_at_)not(_dot_)foo(_dot_)bar *WOULD* be desired as a match (though I didn't
include it in an example).
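
For what it's worth, the distinction can be sketched outside procmail.
The Python below is only an illustration under stated assumptions, not a
full RFC 822 parser: the names effective_address and in_domain are
hypothetical, quoted strings are assumed to be display names only (a quoted
local part would be lost), and the test strings assume the original "@" and
"." characters rather than the (_at_) and (_dot_) forms shown in this archive.

```python
def effective_address(header: str) -> str:
    """Drop comments, quoted strings and whitespace; prefer the <...> part."""
    kept = []
    depth = 0          # comment nesting level
    in_quote = False   # inside a quoted string (assumed to be a display name)
    i = 0
    while i < len(header):
        ch = header[i]
        if ch == "\\" and i + 1 < len(header) and (depth > 0 or in_quote):
            i += 2     # quoted-pair inside a comment or quoted string: skip both
            continue
        if in_quote:
            if ch == '"':
                in_quote = False
        elif depth > 0:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        elif ch == "(":
            depth = 1
        elif ch == '"':
            in_quote = True
        elif not ch.isspace():
            kept.append(ch)
        i += 1
    text = "".join(kept)
    # If an address in angle brackets is present, that is the real one.
    start = text.find("<")
    if start != -1:
        end = text.find(">", start + 1)
        if end != -1:
            text = text[start + 1:end]
    return text


def in_domain(header: str, domain: str = "foo.bar") -> bool:
    """True if the address is at `domain` itself or at any subdomain of it."""
    _, sep, host = effective_address(header).rpartition("@")
    if not sep:
        return False
    host, domain = host.lower(), domain.lower()
    # The leading dot is what keeps notfoo.bar from matching foo.bar.
    return host == domain or host.endswith("." + domain)


print(in_domain('"John (jdoe@foo.bar )" <jdoe @ notfoo . bar > ( jdoe@foo.bar)'))
print(in_domain("(a) luser (b) @ (c) subdom (d) . (e) foo (f) . (g) bar (h)"))
```

The first call reports False and the second True, which is the behaviour
the original poster asked for; validating that an address is legal is a
separate question from deciding which domain it belongs to.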

I deduce that even 4 pages isn't enough!  :-)

Cheers,
Stan Ryckman (stanr(_at_)tiac(_dot_)net)