Re: RegEx in Procmail

2003-03-20 15:35:41
On 20 Mar, Tom Most wrote:
| I am trying to filter email addresses which are sent to a domain in the 
| format of 1-20 lower case or numeric characters, a dot, one or two 
| numeric characters, then ending with @domain.tld.  The regular 
| expressions with which I am familiar would dictate that the filter would 
| look like:
| 
| * ^TO_[a-z0-9-]{1,20}+(\.[0-9-]{1,2}+)@domain\.tld
| 
| However, this does not seem to work.  Any ideas of the source of the 
| problem?

1. procmail doesn't support the {n,m} syntax.
2. Even if it did, you have a superfluous trailing + after each.
3. Your character classes include "-" in addition to alphanumerics.
4. procmail is case-insensitive by default; the D flag makes a recipe
   distinguish case.

I'm assuming your description is correct and the "-" in each character
class was inadvertent.  If not, add them back in as needed.

:0 D
* ^TO_[a-z0-9]+\.[0-9][0-9]?@domain\.tld

This matches one or more characters from the first class (i.e. the
equivalent of [a-z0-9]{1,}).  If you really want to limit it to no more
than 20 characters, each optional repetition has to be written out:

:0 D
* ^TO_[a-z0-9][a-z0-9]?[a-z0-9]?[a-z0-9]?[a-z0-9]?  etc.

If that's the case, it might be easier for line-wrapping, maintenance
etc. to use variables.

c='[a-z0-9]'
cX9="$c?$c?$c?$c?$c?$c?$c?$c?$c?"

:0 D
* $ ^TO_$c$c?$cX9$cX9\.[0-9][0-9]?@domain\.tld

As Sean pointed out, the D flag will also force domain.tld to match in
lower case only.
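
For anyone who wants to sanity-check the expansion outside procmail, here
is a rough Python sketch.  It is only an approximation: Python's re module
is not procmail's matcher, the ^ and $ anchors stand in for what ^TO_ does
in a real recipe, and the names local_part, expanded, interval and relaxed
are made up for illustration.  It builds the same "one required plus
nineteen optional" pattern from the two variables and compares it with the
{1,20} form that procmail lacks.

```python
import re

# Stand-ins for the procmail variables c and cX9 used in the recipe.
c = "[a-z0-9]"
cX9 = (c + "?") * 9          # nine optional characters

# One required, one optional, then 2 x 9 optional: 1 to 20 characters.
local_part = c + c + "?" + cX9 + cX9
tail = r"\.[0-9][0-9]?@domain\.tld$"

# Case-sensitive by default in Python, like a procmail recipe with D.
expanded = re.compile("^" + local_part + tail)

# The interval form the original poster expected to work.
interval = re.compile(r"^[a-z0-9]{1,20}" + tail)

# Case-insensitive variant, like a procmail recipe without D.
relaxed = re.compile("^" + local_part + tail, re.IGNORECASE)

samples = [
    "abc.12@domain.tld",           # plain match
    "a" * 20 + ".1@domain.tld",    # longest allowed local part
    "a" * 21 + ".1@domain.tld",    # one character too long
    "abc.123@domain.tld",          # three digits after the dot
    "Abc.12@DOMAIN.TLD",           # upper case
]

for addr in samples:
    # The hand-expanded and interval forms should always agree.
    assert bool(expanded.match(addr)) == bool(interval.match(addr))
    print(addr, bool(expanded.match(addr)), bool(relaxed.match(addr)))
```

The last sample is the case Sean mentioned: the case-sensitive pattern
rejects it, the case-insensitive one accepts it.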

-- 
Email address in From: header is valid  * but only for a couple of days *
This is my reluctant response to spammers' unrelenting address harvesting



_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
