procmail

Re: how to match against first N lines/bytes of body?

1997-11-14 07:12:00
When I suggested this for Adam Grove's question,

|  > But if N=50, to heck with it:
|  >   :0Bi
|  >   toplines=| head -$N # sed ${N}q if you don't have head
|  >   :0a
|  >   * toplines ?? pattern
|  >   whatever

Era Eriksson asked,

| Can't you just do a MATCH grab instead of run an external process? 
| The regex to grab fifty lines (or all of them, if there are less than
| fifty) is not going to be very pretty, but it should be a lot more
| efficient. 

That was the very reason: the regexp would be too difficult to type in and
too difficult to edit if it needed changing or correcting, and I felt that
that outweighed the savings of not forking head.  In my own words that you
quoted, Era, a regexp was fine for testing the top six lines, but for fifty,
"to heck with it."  It also saves LINEBUF-related troubles, but that's minor.

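For reference, the external-process version might be written out as
below.  This is a sketch under assumptions, not a tested recipe: N and
"pattern" are placeholders, head -n is the current spelling of head -$N,
and the lowercase b flag is assumed so that only the body reaches the pipe.

```procmail
# Sketch only; N, "pattern" and "whatever" are placeholders.
N=50

# Lowercase b feeds only the body to the pipe; i ignores the write
# error when head exits before it has read the whole body.
# Use "sed ${N}q" in place of "head -n $N" if head is unavailable.
:0 bi
toplines=| head -n $N

# The a flag runs this recipe only if the one above succeeded.
:0 a
* toplines ?? pattern
whatever
```
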
|     :0B
|     * ^^([    ]*|$)*\/[^      ].*$(.*$)?(.*$)? ... etc, another 47 of'em
|     { toplines="$MATCH" }

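You don't need the first asterisk in ([ 	]*|$)*, where the brackets hold
a space and a tab: the outer asterisk already repeats the group.
Personally, I drop the brackets as well and write ( |	|$)*, that is,
space, tab, or newline, to match any run of whitespace.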

| You should probably set LINEBUF
| reasonably high (at the very least 80*50 = 4000 bytes; probably
| setting it to 8192 or 16384 is a good idea while you're at it) in
| order to avoid trouble.

Yes.
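
Put together, the MATCH approach with a raised LINEBUF might look like the
sketch below.  It is cut down to three lines for readability, and the 8192
is just one of the values suggested above; neither is a tested setting.

```procmail
# Sketch only: 8192 leaves headroom over 80*50 = 4000 bytes.
LINEBUF=8192

# Three-line version of the MATCH grab; add one more (.*$)? for each
# further line wanted.  The blanks in the pattern are a literal space
# and a literal tab.
:0 B
* ^^( |	|$)*\/[^ 	].*$(.*$)?(.*$)?
{ toplines="$MATCH" }
```

The assignment has to come before the recipe in the rcfile, since procmail
reads the file from top to bottom.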