procmail

Re: RFC-consistent regexp to match name@(subdomain.)*foo.bar

1997-02-19 15:52:54
At 10:08 AM 2/19/97 PST, John R. Ruckstuhl wrote:
> What is the regexp to match a set of email addresses within foo.bar like
>    name@(subdomain.)*foo.bar
>
> I'm using
>    [a-z][^     ]*@([a-z][a-z0-9_-]*\.)*foo\.bar
>
> i.e., name starts with an alpha and contains anything but white-space,
> and subdomain starts with an alpha and contains only alphas, numerals,
> and underscores & dashes.
>
> But I suspect my assumptions are inaccurate regarding what is an
> RFC-compliant, legitimate, email address.
> Can someone supply a more appropriate regexp for matching email
> addresses within foo.bar and her subdomains?

I think you're asking for something *extremely messy* in a regexp.
Consider that you'll have to match:
        johndoe@foo.bar
        John Doe<jdoe@foo.bar>
        jdoe(comment)@(anothercomment)  foo (.bar( (fun yet?))) .(hi!) (com)bar
        "John (jdoe @ foo . bar )" <jdoe @ foo . bar > ( jdoe @ foo . bar)
        (a) luser (b) @ (c) subdom (d) . (e) foo (f) . (g) bar (h)
but not:
        "John (jdoe@foo.bar )" <jdoe @ notfoo . bar > ( jdoe@foo.bar)
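
A quick way to see the problem is to run the pattern from the question over
these examples. The sketch below uses Python's re module rather than procmail,
so treat it as an approximation: it assumes the bracket in the original
pattern held a space and a tab (written here as \s, which is slightly
broader), and it matches case-insensitively, as procmail does by default. The
names PATTERN, SHOULD_MATCH, SHOULD_NOT_MATCH and hits are made up for
illustration.

```python
import re

# Pattern from the question. Assumptions: the bracket held a space and a tab
# (approximated by \s), and matching ignores case as procmail does by default.
PATTERN = re.compile(r"[a-z][^\s]*@([a-z][a-z0-9_-]*\.)*foo\.bar", re.IGNORECASE)

# Addresses from the message, typed with literal '@' and '.'.
SHOULD_MATCH = [
    'johndoe@foo.bar',
    'John Doe<jdoe@foo.bar>',
    'jdoe(comment)@(anothercomment)  foo (.bar( (fun yet?))) .(hi!) (com)bar',
    '"John (jdoe @ foo . bar )" <jdoe @ foo . bar > ( jdoe @ foo . bar)',
    '(a) luser (b) @ (c) subdom (d) . (e) foo (f) . (g) bar (h)',
]
SHOULD_NOT_MATCH = [
    '"John (jdoe@foo.bar )" <jdoe @ notfoo . bar > ( jdoe@foo.bar)',
]


def hits(samples):
    """Return, for each sample, whether the pattern matches anywhere in it."""
    return [PATTERN.search(s) is not None for s in samples]


print(hits(SHOULD_MATCH))      # [True, True, False, False, False]
print(hits(SHOULD_NOT_MATCH))  # [True]
```

Only the two plain forms are caught. The three with comments or spaces are
missed, and the last one is a false positive: the pattern latches onto the
jdoe@foo.bar inside the quoted string, although the real address is at
notfoo.bar.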

The mind boggles...

I'm sure it can even get worse.  :-)

I don't have the RFC 822 parsing rules memorized, but that's where to look.
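
For what it's worth, the comment and quoted-string handling can be
approximated outside the regexp. What follows is a rough sketch in Python,
not a full RFC 822 parser. The helper names (strip_comments_and_quotes,
addr_spec, in_foo_bar) are hypothetical. It discards quoted strings outright,
so a quoted local part would be lost, and it ignores groups, routes, domain
literals and multiple addresses in one header.

```python
import re


def strip_comments_and_quotes(text):
    """Drop (nested) parenthesized comments and "quoted strings" from text."""
    out = []
    depth = 0          # current comment nesting level
    in_quote = False   # inside a quoted string that is not inside a comment
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and (depth or in_quote):
            i += 2     # skip a backslash-escaped character
            continue
        if in_quote:
            if ch == '"':
                in_quote = False
        elif depth:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        elif ch == '"':
            in_quote = True
        elif ch == "(":
            depth = 1
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def addr_spec(header_value):
    """Reduce one address to local@domain (simplified, see the caveats above)."""
    bare = strip_comments_and_quotes(header_value)
    angle = re.search(r"<([^<>]*)>", bare)
    if angle:
        bare = angle.group(1)       # prefer the part inside <...>
    return re.sub(r"\s+", "", bare)  # whitespace around '@' and '.' is dropped


def in_foo_bar(header_value):
    """True if the address is in foo.bar or one of its subdomains."""
    local, at, domain = addr_spec(header_value).rpartition("@")
    domain = domain.lower()
    return bool(local and at) and (domain == "foo.bar" or domain.endswith(".foo.bar"))


for sample in [
    'jdoe(comment)@(anothercomment)  foo (.bar( (fun yet?))) .(hi!) (com)bar',
    '(a) luser (b) @ (c) subdom (d) . (e) foo (f) . (g) bar (h)',
    '"John (jdoe@foo.bar )" <jdoe @ notfoo . bar > ( jdoe@foo.bar)',
]:
    print(addr_spec(sample), in_foo_bar(sample))
```

The point of the design is to normalize first and test the domain afterwards
with plain string comparison, instead of asking one regexp to cope with
comments, quoting and whitespace all at once.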

Cheers,
Stan