procmail

[era(_at_)iki(_dot_)fi: Re: how to match against first N lines/bytes of body?]

1997-11-14 08:05:46
Youch, missent this to Spam-L by mistake. I guess I'm losing my grip ...
This is a bit old already but the efficiency issues are IMHO relevant.

/* era */

------- Start of forwarded message -------
Date: 14 Nov 1997 16:19:41 +0200
To: spam-l(_at_)peach(_dot_)ease(_dot_)lsoft(_dot_)com
Subject: Re: how to match against first N lines/bytes of body?
From: era eriksson <era(_at_)iki(_dot_)fi>
X-Pgp-Fingerprint: 85 40 A8 84 71 26 2C ED  15 CF 22 82 91 01 D4 42

On Fri, 14 Nov 1997 08:03:19 -0600 (CST), "David W. Tamkin"
<dattier(_at_)miso(_dot_)wwa(_dot_)com> wrote:
> When I suggested this for Adam Grove's question,
> |  >   :0Bi
> |  >   toplines=| head -$N # sed ${N}q if you don't have head
> | Can't you just do a MATCH grab instead of run an external process?
> | The regex to grab fifty lines (or all of them, if there are less than
> | fifty) is not going to be very pretty, but it should be a lot more
> | efficient.
> That was the very reason: the regexp would be too difficult to type in and
> too difficult to edit if it needed changing or correcting, and I felt that
> that outweighed the savings of not forking head.  In my own words that you

But Adam specifically asked for an efficient solution ... I thought
there was something I had overlooked. Oh well, yet another case where 
(.*$)\{0,50\} would be a very welcome form of syntactic sugar. Anyhow,
while we're holding our breath waiting for Stephen to implement that,
it wouldn't seem unreasonable to use, say, a macro processor to
produce the recipe, or just make it regular enough to make editing it
not be a nightmare. 
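
As a rough sketch of the generator idea, a plain shell loop (standing in
here for a macro processor such as m4) could write the recipe out. The
function names gen_pattern and emit_recipe are made up for illustration,
and the variable name toplines is borrowed from the head recipe quoted
above. The (.*$)? unit rests on two assumptions: that $ matches the
newline ending each line, and that everything to the right of \/ is
matched greedily. The generated recipe should be tried on real mail
before relying on it.

```shell
#!/bin/sh
# Sketch only: print a procmail recipe that tries to grab up to N body
# lines into MATCH without forking head.

# Build the pattern by repeating one optional "single line" unit n times.
gen_pattern() {
    n=$1
    unit='(.*$)?'   # assumed: one whole line including its newline, or nothing
    pat=''
    i=0
    while [ "$i" -lt "$n" ]; do
        pat="$pat$unit"
        i=$((i + 1))
    done
    printf '%s' "$pat"
}

# Wrap the pattern in a body-matching recipe that copies MATCH to toplines.
emit_recipe() {
    printf '%s\n' ':0 B'
    printf '* ^^\\/%s\n' "$(gen_pattern "$1")"   # ^^ anchors at start of body
    printf '%s\n' '{ toplines=$MATCH }'
}

# Default to fifty lines, as in the original question.
emit_recipe "${N:-50}"
```

Changing the line count then means rerunning the script with a different
N rather than hand-editing fifty groups.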

I tried a simple timing test but the differences are so small -- a MATCH
grab gets 0.1/0.0/0.0 while a head gets 0.2/0.0/0.0 :-)

> You don't need the first asterisk in ([      ]*|$)*.  Personally, I write
> ( |  |$)* without brackets either to match any run of whitespace.

Is there not an efficiency penalty for invoking the parens again and
again? (Perhaps Procmail's non-greedy behavior will invoke them every
time regardless?)

/* era */

-- 
 Paparazzi of the Net: No matter what you do to protect your privacy,
  they'll hunt you down and spam you. <http://www.iki.fi/~era/spam/>
------- End of forwarded message -------
