At 03:50 PM 2/19/97 -0800, Rob Perelman wrote:
On Wed, 19 Feb 1997, Stan Ryckman wrote:
I think you're asking for something *extremely messy* in a regexp.
Consider you'll have to match:
johndoe@foo.bar
John Doe<jdoe@foo.bar>
jdoe(comment)@(anothercomment) foo (.bar( (fun yet?))) .(hi!) (com)bar
"John (jdoe @ foo . bar )" <jdoe @ foo . bar > ( jdoe @ foo . bar)
(a) luser (b) @ (c) subdom (d) . (e) foo (f) . (g) bar (h)
but not:
"John (jdoe@foo.bar )" <jdoe @ notfoo . bar > ( jdoe@foo.bar)
Strange... here are my findings. This is according to a 4-page regex
written by Tom Christiansen, author of Perl.
I have great respect for Tom's work. The fact that it's "a 4 page regex"
points out the difficulty of getting a procmail regex to do this correctly
(which was my actual point).
I'm going to assume from its name that all this regex does is validate
a *legal* address:
powergrid% rfc822.pl
johndoe@foo.bar
** Above matches? Yes
We agree.
John Doe<jdoe@foo.bar>
** Above matches? Yes
We agree again.
jdoe(comment)@(anothercomment) foo (.bar( (fun yet?))) .(hi!) (com)bar
** Above matches? No
OK, instead of believing a 4-page Perl script, tell me what makes this
not an email address equivalent to jdoe@foo.bar?
It's possible Tom has a bug -- perhaps he doesn't deal with the nested
comments? Adjacent comments? "?" or "!" character? Those are legal.
I'm sure he'll take a bug report. (I don't have the script, so don't
want to send it to him, particularly without narrowing it down.)
It's also possible I'm misreading RFC822 -- if so, tell me where without
referencing a perl script.
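
For what it's worth, here is a rough sketch of how I read the comment rules: parentheses nest, a backslash quotes the next character, and parentheses inside a quoted string are not comments. It is written in Python rather than as a procmail or Perl regex, the name strip_comments is made up for illustration, and it is only my reading of RFC 822, not a validator.

```python
def strip_comments(text: str) -> str:
    """Remove RFC 822 style (possibly nested) comments.

    Illustrative helper only: it keeps quoted strings intact and
    does not check that the result is a legal address.
    """
    out = []
    depth = 0          # current comment nesting level
    in_quote = False   # True while inside a "quoted string"
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # A backslash quotes the next character, whatever it is.
            if depth == 0:
                out.append(text[i:i + 2])
            i += 2
            continue
        if in_quote:
            out.append(ch)
            if ch == '"':
                in_quote = False
        elif depth > 0:
            # Inside a comment: only track nesting, emit nothing.
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        elif ch == "(":
            depth = 1
        elif ch == '"':
            in_quote = True
            out.append(ch)
        else:
            out.append(ch)
        i += 1
    return "".join(out)


messy = "jdoe(comment)@(anothercomment) foo (.bar( (fun yet?))) .(hi!) (com)bar"
# Drop the comments, then squeeze out the whitespace left between tokens.
bare = "".join(strip_comments(messy).split())
print(bare)
```

Under that reading, the disputed example reduces to jdoe@foo.bar once the comments and the leftover whitespace are gone.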
"John (jdoe @ foo . bar )" <jdoe @ foo . bar > ( jdoe @ foo . bar)
** Above matches? Yes
(a) luser (b) @ (c) subdom (d) . (e) foo (f) . (g) bar (h)
** Above matches? Yes
Agree. Agree.
"John (jdoe@foo.bar )" <jdoe @ notfoo . bar > ( jdoe@foo.bar)
** Above matches? Yes
Ah, but matches *what*? The original poster wanted to identify
email addresses at foo.bar and its subdomains, not arbitrary
email addresses. This, while a legal email address (probably
what your Perl script is checking), evaluates to jdoe@notfoo.bar,
which is *NOT* desired as a match here. However,
jdoe@not.foo.bar *WOULD* be desired as a match (though I didn't
include it in an example).
I deduce that even 4 pages isn't enough! :-)
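
To make the foo.bar versus notfoo.bar distinction concrete, here is a small sketch in Python. It assumes the address has already been reduced to plain local@domain form (comments, display name and angle brackets removed), which is the hard part. The function name in_domain is hypothetical.

```python
def in_domain(address: str, domain: str = "foo.bar") -> bool:
    """Return True if address is at `domain` or one of its subdomains.

    Assumes `address` is already a bare local@domain string.
    """
    local, sep, host = address.rpartition("@")
    if not sep or not local:
        return False                      # no "@" or empty local part
    host = host.lower().rstrip(".")       # domains compare case-insensitively
    domain = domain.lower()
    # Either an exact match, or a match on a dot boundary for subdomains.
    return host == domain or host.endswith("." + domain)


for addr in ("jdoe@foo.bar", "jdoe@not.foo.bar", "jdoe@notfoo.bar"):
    print(addr, in_domain(addr))
```

The dot-boundary test is what a plain search for "foo.bar" misses: it would wrongly accept notfoo.bar.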
Cheers,
Stan Ryckman (stanr(_at_)tiac(_dot_)net)