procmail

Re: filter number(_at_)foobar(_dot_)com

1999-09-04 15:47:36
"David W. Tamkin" <dattier(_at_)Mcs(_dot_)Net> writes:
Philip answered the very question a few hours ago:

 * ^TO_[1-9](((([0-9][0-9]?)?[0-9])?[0-9])?[0-9])?@good\.domain

Well, OK, his advice would really turn out like this:

 * ^TO_[1-9]([0-9]([0-9]([0-9]([0-9][0-9]?)?)?)?)?@good\.domain

but the two should be equivalent and I thought the former was easier to type.
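
For anyone who wants to check the equivalence claim rather than take it on faith, here is a rough sketch in Python. It assumes the archive's (_at_) and (_dot_) stand for "@" and "\.", it tests only the address part (the ^TO_ macro is procmail-specific and is left out), and Python's re module is merely a stand-in for procmail's own matcher. The names FORMER, LATTER, accepts and same_verdict are made up for the illustration.

```python
import re

# Address part of the two conditions, with (_at_)/(_dot_) read as "@" and "\.".
FORMER = re.compile(r"[1-9](((([0-9][0-9]?)?[0-9])?[0-9])?[0-9])?@good\.domain")
LATTER = re.compile(r"[1-9]([0-9]([0-9]([0-9]([0-9][0-9]?)?)?)?)?@good\.domain")

def accepts(pattern, local_part):
    """True if local_part@good.domain matches the pattern as a whole."""
    return pattern.fullmatch(local_part + "@good.domain") is not None

def same_verdict(local_part):
    """True if both patterns agree on this local part."""
    return accepts(FORMER, local_part) == accepts(LATTER, local_part)

# One to seven digits, plus a few local parts that should be rejected.
samples = ["1", "12", "123", "1234", "12345", "123456",
           "1234567", "0", "012", "12a"]
disagreements = [s for s in samples if not same_verdict(s)]
```

On this handful of samples the two patterns should agree everywhere, accepting one to six digits with no leading zero. That is a spot check, not a proof.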


Equivalent in effect, though not in speed.  The former requires the regexp
engine to follow all the possibilities at once, while the latter only
requires it to follow at most one.  Consider the following header line:

        To: 12345@good.domain

The regexp engine examines this one character at a time.  After it sees
the "To: 1" it will have matched up to the "^TO_[1-9]" in the regexp.
It now sees the '2'.  With the former regexp, that '2' might end up
matching the 1st, 3rd, 4th, or 5th of the "[0-9]" blocks, depending
on how many more numbers are present, so procmail has to keep track of
all those possibilities until later characters rule them out.  With the
latter regexp, that '2' can only match the first of the "[0-9]" blocks.
The same goes for the '3', '4', and '5'.
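
The bookkeeping described above can be made concrete with a small Python sketch. It does not model procmail's actual engine; it only enumerates, for each regexp, the ways the optional groups can be taken or skipped, and reports which "[0-9]" block could end up consuming a given digit. The tree encoding and the names opt, expansions and candidate_blocks are hypothetical, invented for this illustration; blocks are numbered left to right as in the text.

```python
from itertools import product

# Tiny made-up encoding: an int is one "[0-9]" block (numbered left to right),
# a list is a sequence of parts, and ("opt", x) is x followed by "?".
def opt(node):
    return ("opt", node)

def expansions(node):
    """Return every sequence of block numbers the node can consume."""
    if isinstance(node, int):
        return [[node]]
    if isinstance(node, tuple):      # ("opt", child): skip it or take it
        return [[]] + expansions(node[1])
    # list: concatenate one expansion of each part, in order
    result = []
    for combo in product(*(expansions(part) for part in node)):
        result.append([blk for piece in combo for blk in piece])
    return result

# (((([0-9][0-9]?)?[0-9])?[0-9])?[0-9])?
FORMER = opt([opt([opt([opt([1, opt(2)]), 3]), 4]), 5])
# ([0-9]([0-9]([0-9]([0-9][0-9]?)?)?)?)?
LATTER = opt([1, opt([2, opt([3, opt([4, opt(5)])])])])

def candidate_blocks(tree, digit_index):
    """Blocks that could consume the digit at digit_index
    (0 is the first digit after the leading [1-9])."""
    return sorted({seq[digit_index]
                   for seq in expansions(tree)
                   if len(seq) > digit_index})
```

Here candidate_blocks(FORMER, 0) gives [1, 3, 4, 5], the four possibilities for the '2', while candidate_blocks(LATTER, 0) gives [1]. Raising digit_index shows the same pattern for the '3', '4', and '5'.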


Philip Guenther
