At 00:41 2003-02-07 +0100, Ruud H.G. van Tol wrote:
> Try to detect that the address is not yet completed (@hotmail.com.tw,
> @hotmail.com_is_not.com). That is certainly the case when a character
> from [-+a-z0-9.$_~] follows. So assume it's ended when
> a-character-not-from-that-set (or a newline) follows. A regexp for that
> is ([^-+a-z0-9.$_~]|$), which should be appended to the condition:
Uhm, when did $, ~, and + make their way into RFC1035?
Ask yourself, when was the last time you saw one of these characters in a
(valid, non-spam) domain name?
Dot is a valid separator between host tokens, and the aforementioned DNS
host naming specification allows for A-Z (upper and lower, with no
difference attributed to case), 0-9, and a hyphen. That's it.
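
As a rough illustration of that rule, here is a minimal Python sketch that checks a hostname against the letters-digits-hyphen set. The function name is_ldh_hostname is made up for this example. The 63-character label limit and the ban on a leading or trailing hyphen are assumptions taken from the usual reading of the DNS host naming rules, not from anything stated in this thread.

```python
import re

# One label: letters, digits, hyphen; 1-63 characters; no hyphen at either end.
# (Length limit and hyphen placement are assumptions, see the note above.)
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def is_ldh_hostname(name: str) -> bool:
    """Return True if every dot-separated label uses only A-Z, a-z, 0-9 and '-'."""
    return all(_LABEL.fullmatch(label) is not None for label in name.split("."))

if __name__ == "__main__":
    for host in ("hotmail.com", "hotmail.com.tw", "hotmail.com_is_not.com", "ho$t.com"):
        print(host, is_ldh_hostname(host))
```

Under these assumptions the first two names pass, while the names containing an underscore or a dollar sign are rejected.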
I included the underscore only because it is allowed in some types of DNS
records - but, actually, underscores are _ILLEGAL_ in DNS hostnames, and
therefore should never appear in a valid RHS of an email address. The
underscore should therefore be excluded from the RHS regexp, which becomes:
([^-.a-z0-9]|$)
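
To see how the two end-of-address expressions behave, here is a small Python sketch. It is not procmail: it uses Python's re module with case-insensitive matching to approximate procmail's default behaviour, and the names HOST, LOOSE_END, STRICT_END and address_ends_here are invented for the illustration.

```python
import re

HOST = r"@hotmail\.com"
LOOSE_END = r"([^-+a-z0-9.$_~]|$)"   # character set from the quoted suggestion
STRICT_END = r"([^-.a-z0-9]|$)"      # hostname characters only

def address_ends_here(end_pattern: str, text: str) -> bool:
    """True if HOST is followed by a character outside the set, or by end of line."""
    return re.search(HOST + end_pattern, text, re.IGNORECASE) is not None

if __name__ == "__main__":
    samples = (
        "friend@hotmail.com",
        "<friend@hotmail.com>",
        "friend@hotmail.com.tw",
        "friend@hotmail.com_is_not.com",
    )
    for s in samples:
        print(s, address_ends_here(LOOSE_END, s), address_ends_here(STRICT_END, s))
```

Both expressions reject the .com.tw case. With the narrower set, an underscore counts as the end of the address, so the com_is_not.com string is treated as ending at .com; whether that matters depends on whether such an illegal hostname can ever reach the recipe.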
Arguably, if you have anything other than ([>") ]|$) following an
address, it's probably a bogus string you're matching against (some
high-bit text, perhaps?).
> But then 'From:email2@hotmail.com' would lead to Spam, so you need
> (.*[^-+a-z0-9.$_~]|), which makes the condition:
If I received any mail that did not have whitespace between the header
name and the data in that header, I'd be happy to "misclassify" it as junk
just the same.
> Maybe you will often get away with:
> * ! ^From:(.*\<|)(email1|email2|friend)@hotmail\.com(\>)
\< is incorrect in this context anyway. "firstname.lastname@domain" could
be matched by your expression if the email address you were looking for
was "lastname@domain", because ".*\<" would match everything up to and
including the dot.
Are you _likely_ to run into someone mailing you junk from an address that
is similar to that of someone you know? No, but that'll be little
consolation when it does happen and things break.
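
The \< problem described above can be reproduced with a short Python sketch. Python has no \< operator, so procmail's \< is emulated here as "one non-word character", which is an approximation; the address lastname@example.com and the names WORD_START, ADDR_START and from_matches are hypothetical.

```python
import re

# Approximation of procmail's \< : consumes one non-word character.
WORD_START = r"[^a-z0-9_]"
# Alternative from the quoted text: one character that cannot be part of an address.
ADDR_START = r"[^-+a-z0-9.$_~]"
TARGET = r"lastname@example\.com"   # hypothetical address being whitelisted

def from_matches(boundary: str, header: str) -> bool:
    """Apply ^From:(.*<boundary>|)TARGET to a single header line."""
    pattern = r"^From:(.*" + boundary + r"|)" + TARGET
    return re.search(pattern, header, re.IGNORECASE) is not None

if __name__ == "__main__":
    lookalike = "From: firstname.lastname@example.com"
    genuine = "From: lastname@example.com"
    print(from_matches(WORD_START, lookalike))  # the dot satisfies the boundary
    print(from_matches(ADDR_START, lookalike))
    print(from_matches(ADDR_START, genuine))
```

With the word-boundary emulation the lookalike address matches, because the dot is consumed as the boundary character. With the address-character set it does not, while the genuine address still matches.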
---
Sean B. Straw / Professional Software Engineering
Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
Please DO NOT carbon me on list replies. I'll get my copy from the list.
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail