procmail

Regexp efficiency (Re: Never send mail to /dev/null?)

1997-04-16 04:39:00
Kirk Job Sluder <csluder(_at_)indiana(_dot_)edu> wrote:
> On Tue, 15 Apr 1997, era eriksson wrote:
> > As an aside, I can't see how it could +hurt+ to change that .* into
> > (.*\<)? -- a better bounded search is also more efficient, no?
>
> Could you explain how this is more efficient?  As I read this (admittedly
> with the eyes of a novice), I think it should try to satisfy the greedy
> .* by going to the end of the line, then work backwards character by
> character until it meets a "<".  (I'm also not sure why the left angle
> bracket is escaped here.)

First: < and \< are different things (man procmailrc): < is a literal left
angle bracket, whereas \< matches the character before a word.  But that's
beside the point.

The procmail regexp engine does not go all the way to the right and then
backward to find the match; it goes forward only (so it never arrives
at the right end unless no match has been found by then).
As for efficiency, on average .* and (.*\<)? will be equally fast (at
least in procmail), unless the regexp following the .* has multiple
branches again.

I.e.    .*(hello|there)    is slower than    (.*\<)?(hello|there)
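
To make the branching argument concrete, here is a rough Python sketch of a
forward-only scan.  It is NOT procmail's actual engine; the function name
scan(), the "attempts started" counter, and the treatment of \< as "the
previous character is a non-word character" are all assumptions made purely
for illustration.  Each character is looked at once, and the only difference
between the two modes is where a new try at (hello|there) is allowed to begin.

```python
def is_word_char(ch):
    # Assumed definition of a "word" character for this sketch.
    return ch.isalnum() or ch == "_"


def scan(line, words, word_start_only):
    # Single left-to-right pass; returns (matched, attempts_started).
    #
    # word_start_only=False models  .*(w1|w2)
    #     a new attempt may begin at every character.
    # word_start_only=True  models  (.*\<)?(w1|w2)
    #     a new attempt may begin only at the start of the text or
    #     right after a non-word character.
    # All words are assumed to be non-empty.
    active = set()      # pairs (word index, characters matched so far)
    attempts = 0        # number of branches started (a proxy for work)
    prev = None         # previous character, None at the start
    for ch in line:
        may_start = (
            not word_start_only or prev is None or not is_word_char(prev)
        )
        candidates = set(active)
        if may_start:
            # One fresh branch per alternative.
            candidates |= {(i, 0) for i in range(len(words))}
            attempts += len(words)
        active = set()
        for w, k in candidates:
            if words[w][k] == ch:
                if k + 1 == len(words[w]):
                    return True, attempts   # matched without backing up
                active.add((w, k + 1))
        prev = ch
    return False, attempts


if __name__ == "__main__":
    words = ["hello", "there"]
    print(scan("say hello", words, word_start_only=False))
    print(scan("say hello", words, word_start_only=True))
```

In this toy model both modes find "hello" in "say hello", but the bounded form
starts far fewer branches, and the more alternatives follow the .*, the more
each extra starting point costs.  Note that the two forms are not equivalent:
the bounded one does not match "hello" inside "othello".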

-- 
Sincerely,                                                          
srb(_at_)cuci(_dot_)nl
           Stephen R. van den Berg (AKA BuGless).

"To err is human, to debug ... divine."
