Walter Dnes <waltdnes(_at_)interlog(_dot_)com> writes:
ACK!!! Make that pattern...
* ^Message-Id:.*[<]..*@+\/.*\.
That's equivalent to:
* ^Message-Id:.*<.+@\/.*\.
If there really were more than one '@', the match would still include
the second and later ones, because the '+' to the left of the \/ token
is not greedy. Since you can't have two '@'s in a Message-Id: unless one
or both are quoted, I wouldn't worry about it.
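For context, here is a minimal sketch of how the short condition might sit
in a complete recipe, with the archive's (_at_) and (_dot_) placeholders read
as '@' and '.'. The variable name MSGID_HOST is purely illustrative and not
part of the original suggestion.

```
# Sketch only: MSGID_HOST is a hypothetical variable name.
# Whatever the part of the condition to the right of \/ matched
# is placed in MATCH by procmail.
:0
* ^Message-Id:.*<.+@\/.*\.
{ MSGID_HOST=$MATCH }
```

Because the part to the right of \/ is matched greedily, MATCH here should
run through the last '.' on the header line, trailing dot included. A later
condition that uses \/ overwrites MATCH, which is why the sketch copies it
to its own variable.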
Or if you really do feel like worrying, use a _real_ message-id regexp:
* ^Message-Id:[ ]*<[ ]*("([^"\]|\\.)*"|[-!#-'*+/-9=?A-Z^-~]+)\
([ ]*\.[ ]*("([^"\]|\\.)*"|[-!#-'*+/-9=?A-Z^-~]+))*\
[ ]*@[ ]*\
(\[[ ]*([^][\]|\\.)*[ ]*\]|\
[-!#-'*+/-9=?A-Z^-~]+([ ]*\.[ ]*[-!#-'*+/-9=?A-Z^-~]+)*)\
[ ]*>
If you just want to extract the host, strip the trailing whitespace and '>',
and insert the \/ token:
* ^Message-Id:[ ]*<[ ]*("([^"\]|\\.)*"|[-!#-'*+/-9=?A-Z^-~]+)\
([ ]*\.[ ]*("([^"\]|\\.)*"|[-!#-'*+/-9=?A-Z^-~]+))*\
[ ]*@[ ]*\
\/(\[[ ]*([^][\]|\\.)*[ ]*\]|\
[-!#-'*+/-9=?A-Z^-~]+([ ]*\.[ ]*[-!#-'*+/-9=?A-Z^-~]+)*)
Philip Guenther