dattier(_at_)wwa(_dot_)com (David W. Tamkin) writes:
Of late there have been several posts mentioning as gospel that searching
the body is more work for procmail than searching the head is.
Hmm, I must have missed those.
I can see some reasons: usually the body is larger than the head, so there is
more text to look through; most head searches are anchored left, so if the
Okay, I'll buy that.
text at the start of a line doesn't match, procmail can jump ahead to the
next newline instead of plodding through the entire line; and since a filter
Let me repeat: procmail's regexp engine has no special optimization for
anchoring against the beginning of the line. Most programs that have
such an optimization have it because they need the line distinction for
other reasons (for example, grep by default prints the entire line
containing a match). Procmail has no such other reason, so it treats
newline like any other plain character in the regexp. There should be no
speed difference as long as procmail can say: "the first character I see
must be a 'foo'". Note that case insensitivity is handled by making
everything lowercase, so a letter being first doesn't bring in the spectre
of character classes or anything like that.
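To make that concrete, here is a toy sketch in Python. It is not procmail's
code: the name toy_search is made up, it only handles a literal pattern with
an optional leading '^', and prepending a newline to the text is my own
simplification so that the first line can match. The point is only that once
newline is an ordinary character, a left-anchored search is the same kind of
work as any other search, and case insensitivity costs one lowercasing pass.

```python
def toy_search(pattern, text):
    """Toy model only: literal pattern, optional leading '^', case-insensitive."""
    # Case insensitivity: lowercase both sides once, up front.
    pattern = pattern.lower()
    text = text.lower()
    if pattern.startswith("^"):
        # Treat the anchor as a literal newline character.
        # Assumption of this sketch: a newline is prepended to the text
        # so that a match on the very first line is still possible.
        pattern = "\n" + pattern[1:]
        text = "\n" + text
    # From here on it is a plain substring search; newline is not special.
    return pattern in text
```

With this, toy_search("^Subject: foo", msg) and toy_search("foo", msg) go
through exactly the same search; the anchored one just has one more plain
character to match.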
recipe may have just changed the size of the head, procmail cannot keep a
byte-count pointer nor a line-count pointer to where the body begins but must
scan through the head to find the blank line at the neck before it begins a
body search.
Procmail does this when it reads in the head, not when it goes to search
the body, so that cost can't be avoided.
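As a rough illustration (again hypothetical Python with a made-up
split_message helper, not what procmail actually does internally): whatever
reads the head has to find the blank line that ends it, so the start of the
body is known as a side effect of that read.

```python
def split_message(raw):
    """Split a message at the first blank line into (head, body)."""
    head, sep, body = raw.partition("\n\n")
    if not sep:
        # No blank line found: everything is head, the body is empty.
        return raw, ""
    # Keep the newline that terminates the last header line.
    return head + "\n", body
```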
But I got the feeling that there's more to it than that. Are there yet
other factors?
I sure hope the evangelist of "body searches are slow" will come forward
and give other reasons, because the only one I'll take so far is the size
issue, and well, that's just a fact of life.
Philip Guenther