Re: Long line of OR

At 07:16 2011-10-12, LuKreme wrote:

> * 9876543210^0 ! VAR ??(aa|apsumo|beni|candy|casa|cf|cpr|db|dbl|dhosts|fb|glue|gofobo|hita|ic|llc|logi|micr|netf|ocb|plus|pl|rcc|rk|rocb|scv|snap|spot|su|tor|tripa|twit|v365|vm|root|admin|rsmith|bsmith)
> and it occurred to me there might be a reason not to do that.

Questions generally end with a query mark (?). I take it this was posed as a question?

You're using maximal scoring, so as soon as the regexp evaluates true, it'll evaluate the condition as true and skip all other scorings. However, the entire regexp has to be parsed. You MIGHT get somewhat better performance by splitting it into multiple scoring lines, with more frequently encountered elements in the earlier conditions, but I expect you'd need to run a bunch of benchmarked tests to see a timing difference. The length of VAR will no doubt have an impact.


You may however see more of an improvement by optimizing your regexp:

(a(a|psumo|dmin)|b(eni|smith)|c(andy|asa|f|pr)|d(b|bl|hosts)|fb|g(lue|ofobo)|hita|ic|l(lc|ogi)|micr|netf|ocb|p(lus|l)|r(smith|oot|cc|k|ocb)|s(cv|nap|pot|u)|t(or|ripa|wit)|v(365|m))

Readability takes a big dive though. Yes, there are some other optimizations that could be made, but that's the broad strokes.
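
Since the factored form is hard to eyeball, it may be worth checking mechanically that it accepts exactly the same names as the flat list. The following is only a sketch in Python, not procmail: it assumes Python's re module treats these simple alternations the same way procmail's matcher does, and the helper name mismatches() and the generated near-miss strings are made up for illustration.

```python
import re

# The flat alternation from the original recipe.
ORIGINAL = (
    "aa|apsumo|beni|candy|casa|cf|cpr|db|dbl|dhosts|fb|glue|gofobo|hita|ic|"
    "llc|logi|micr|netf|ocb|plus|pl|rcc|rk|rocb|scv|snap|spot|su|tor|tripa|"
    "twit|v365|vm|root|admin|rsmith|bsmith"
)

# The prefix-factored version suggested above.
FACTORED = (
    "a(a|psumo|dmin)|b(eni|smith)|c(andy|asa|f|pr)|d(b|bl|hosts)|fb|"
    "g(lue|ofobo)|hita|ic|l(lc|ogi)|micr|netf|ocb|p(lus|l)|"
    "r(smith|oot|cc|k|ocb)|s(cv|nap|pot|u)|t(or|ripa|wit)|v(365|m)"
)

# Wrap each alternation in a group so it behaves as a single unit.
ORIGINAL_RE = re.compile("(?:%s)" % ORIGINAL)
FACTORED_RE = re.compile("(?:%s)" % FACTORED)

# Every name in the flat list, plus hypothetical near misses
# (each name with its last character dropped, and with an extra "x").
NAMES = ORIGINAL.split("|")
NEAR_MISSES = [n[:-1] for n in NAMES] + [n + "x" for n in NAMES]


def mismatches(candidates):
    """Return the strings on which the two patterns disagree."""
    return [
        s for s in candidates
        if bool(ORIGINAL_RE.fullmatch(s)) != bool(FACTORED_RE.fullmatch(s))
    ]


if __name__ == "__main__":
    print("disagreements:", mismatches(NAMES + NEAR_MISSES))
```

An empty list only shows that the two patterns agree on the sampled strings. It says nothing about speed, and the real recipe matches unanchored against VAR rather than against a whole string.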

Again, you'd need to run benchmarks - put recipes in a sandbox, run against a large corpus several times, omitting the longest and shortest times (owing to the effects of cache, etc), and would need to run it on a host which didn't have other demands on it.
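
The timing method described above (several runs, longest and shortest discarded) can be sketched roughly as follows. This is Python rather than procmail, so the absolute numbers will not carry over to a real procmail sandbox; the function names, the synthetic corpus, and the shortened patterns in the usage section are all assumptions made for the sake of the example.

```python
import random
import re
import time


def trimmed_mean(samples):
    """Average the samples after dropping the single lowest and highest."""
    if len(samples) < 3:
        raise ValueError("need at least three samples to trim both ends")
    kept = sorted(samples)[1:-1]
    return sum(kept) / len(kept)


def time_pattern(pattern, corpus, runs=7):
    """Time an unanchored search of pattern over the corpus, several times."""
    compiled = re.compile(pattern)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        for value in corpus:
            compiled.search(value)
        timings.append(time.perf_counter() - start)
    return trimmed_mean(timings)


def make_corpus(size=2000, seed=1):
    """Build a hypothetical corpus of random lowercase/digit strings."""
    rng = random.Random(seed)
    alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
    return [
        "".join(rng.choice(alphabet) for _ in range(rng.randint(3, 12)))
        for _ in range(size)
    ]


if __name__ == "__main__":
    corpus = make_corpus()
    # Shortened stand-ins for the flat and factored forms, for illustration.
    for label, pattern in (("flat", "(aa|apsumo|admin)"),
                           ("factored", "a(a|psumo|dmin)")):
        print(label, time_pattern(pattern, corpus))
```

For a meaningful comparison you would substitute real values of VAR from your own mail for the random corpus, and ideally time procmail itself rather than a stand-in regexp engine.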


---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)de
http://mailman.rwth-aachen.de/mailman/listinfo/procmail