procmail

Re: Efficient use of OR-matches and $MATCH

2000-02-07 12:22:26
* Mon 2000-01-31 Ralph SOBEK <sobek(_at_)irit(_dot_)fr> list.procmail
* Message-Id: <14485(_dot_)38958(_dot_)490565(_dot_)756012(_at_)gargle(_dot_)gargle(_dot_)HOWL>
| First off, Happy New Year 2000, folks!
| 
| I have some questions concerning the regexp matching process, and
| would like to do it efficiently since I have 1000-2000 messages per
| week which get analyzed by the recipe.  I have looked at Jari's
| Procmail resources, but did not find what I wanted.
| 
| I have a recipe that starts like this:
| 
| :0
| * 9876543210^0 $ ^(From|Subject):.*\/\<(${regexps})s?\>.*$
| * 9876543210^0 B ?? $ ()\/\<(${regexps})s?\>.*$
| {
|    EXPR = $MATCH
| 
|    ...
| }
| 
| Does the huge $regexp serve any purpose?  Should I just have `k'
| independent recipes?  Would this work?  I currently actually have:
| 
| :0
| * 9876543210^0 $ ^(From|Subject):.*\/\<(${regexps})s?\>.*$
| * 9876543210^0 B ?? $ ()\/\<(${regexps})s?\>.*$
| {
|    EXPR = $MATCH
| 
|    <122 * ! ... negative conditions>
| }
| 
| This is actually overkill, since some conditions depend upon the value
| in  (or matches) $EXPR.

In my experience with the Perl and Emacs regexp engines, simple regexp
matches are very efficient and fast. Once you add quantified groups
like (xxxx)?, (xxxx)* or (xxxx)+, matching can get a tad slow.

In your case, if I understand correctly, you only have simple OR matches.

This is about as fast as it gets, and the size of the regexp does not
really matter. The alternatives are constant strings, and if procmail
uses a DFA engine, the first one that matches wins and the regexp
scanning stops.

So I would say there is nothing to optimize here, at least not by
splitting into separate recipes. That might gain you a few microseconds
in some cases, but the difference is not noticeable enough to justify
several not-so-maintainable recipes.
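
If you want to check this kind of claim yourself, here is a rough sketch
in Python that compares one combined alternation against `k' separate
patterns. Python's `re` module is not procmail's engine, so the numbers
only illustrate how to measure and say nothing definite about procmail.
The keyword list and the function names are made up for the example;
they stand in for your ${regexps}.

```python
import re
import timeit

# Hypothetical keywords standing in for ${regexps}; not from the original post.
KEYWORDS = ["invoice", "payment", "refund", "order", "receipt"]

# One combined alternation, roughly mirroring \<(${regexps})s?\>
COMBINED = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")s?\b"
)

# k separate patterns, one per keyword
SEPARATE = [re.compile(r"\b" + re.escape(k) + r"s?\b") for k in KEYWORDS]


def match_combined(text):
    """Return the leftmost keyword hit using the single alternation."""
    m = COMBINED.search(text)
    return m.group(0) if m else None


def match_separate(text):
    """Return the leftmost keyword hit by trying every pattern in turn."""
    hits = [m for m in (p.search(text) for p in SEPARATE) if m]
    if not hits:
        return None
    return min(hits, key=lambda m: m.start()).group(0)


def time_both(text, number=1000):
    """Return (combined_seconds, separate_seconds) for `number` runs each."""
    t_combined = timeit.timeit(lambda: match_combined(text), number=number)
    t_separate = timeit.timeit(lambda: match_separate(text), number=number)
    return t_combined, t_separate


if __name__ == "__main__":
    sample = "Subject: your refunds and orders"
    print("combined match:", match_combined(sample))
    print("separate match:", match_separate(sample))
    print("timings (s):", time_both(sample))
```

Both functions should return the same match for the same input, so the
only thing left to compare is the two timings.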

Philip -- Any chance of getting TIMING_ON and TIMING_OFF constants in
procmail, plus a TIMING_VALUE variable containing the wall clock time
(microseconds?) procmail spent in the region under test? Currently there
is no good way to find out how long it takes to process a recipe.



jari

