procmail

Re: Counting lines

2000-02-25 14:55:22
Dr. Daia asked,

|     I have a recipe that is supposed to count the lines in the incoming
| messages:
| 
|       :0 Bfh
|       * H ?? ! ^Lines:
|       * -1^0
|       *  1^1   ^.*$
|       |formail -A "Lines: $="
| 
| This should work, right?  Well, I've been happy with it for years, until
| I discovered that it counts one too _many_ lines.  Any idea what's going
| on here?

Even after -1^0 it still counts one line too many?  Or are you asking why
it needs the -1^0?  The former baffles me; the latter I can answer.

When procmail counts matches to an expression in a weighted recipe and the
matched area ends in a newline, it backs up one byte before searching for
the next match.  That's designed so that, when the expression is something
like ^whatever$, the newline at the end of one line of text can serve as
both the $ of that line and the ^ of the line after it, because ^ and $ have
to match actual newlines and not the beginnings and ends of lines.

For the same reason, procmail always imputes an extra newline (the putative
newlines, we call them) immediately before and immediately after the search
area.  That's so that ^stuff can match at the start of the area, or stuff$
at the end, even if there is no literal newline there.
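
The effect of the putative newlines can be illustrated with a small Python
sketch.  It is only a model of the description above, not procmail's own
regex engine: it assumes ^stuff can be written as "\nstuff" and stuff$ as
"stuff\n", and the helper name matches_with_putative_newlines is made up
for the illustration.

```python
import re

# Model assumption: ^ and $ must each match a real newline character.
START = r"\nstuff"   # stands in for ^stuff
END = r"stuff\n"     # stands in for stuff$


def matches_with_putative_newlines(pattern: str, area: str) -> bool:
    """Search the area after adding one newline on each side of it."""
    return re.search(pattern, "\n" + area + "\n") is not None


area = "stuff and more stuff"   # no literal newline at either end

print(re.search(START, area) is not None)            # False
print(matches_with_putative_newlines(START, area))   # True
print(matches_with_putative_newlines(END, area))     # True
```

Without the padding, the pattern standing in for ^stuff finds nothing at the
start of the area; with it, both ends match.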

So when you are counting lines with 1^1 ^.*$, procmail keeps backing up one
byte so that it can reuse the newline that was the previous line's $ as the
current line's ^.  Finally, at the last line, it matches the newline at the
end of it to $.  But then it backs up and matches
<closing real newline><nothing><putative newline> to ^.*$ and counts one
line too many.
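
The over-count can be reproduced with a toy model in Python.  This is only a
sketch of the behaviour described above, not procmail's actual matching
code.  It assumes that ^ and $ each consume one real newline, that one
putative newline is added on each side of the search area, and that the scan
resumes one byte before the end of each match.  The name count_line_matches
is made up for the illustration.

```python
import re

# ^.*$ under the model: a newline, any run of non-newlines, a newline.
LINE = re.compile(r"\n[^\n]*\n")


def count_line_matches(body: str) -> int:
    """Count matches of the modelled ^.*$ the way a 1^1 condition would."""
    area = "\n" + body + "\n"      # putative newlines before and after
    count = 0
    pos = 0
    while True:
        m = LINE.search(area, pos)
        if m is None:
            break
        count += 1
        pos = m.end() - 1          # back up one byte to reuse the newline
    return count


body = "one\ntwo\nthree\n"         # three lines, each ending in a newline
print(count_line_matches(body))        # 4
print(count_line_matches(body) - 1)    # 3
```

In this model the fourth match is the closing real newline followed by the
putative one, which is the surplus that -1^0 takes off again.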

Now, Liviu, if you knew all that and your question is why procmail is still
counting one line too many even after -1^0 and you're going to have to change
it to -2^0, then I'm baffled.

As to Bennett's theory that it is counting the blank line between the head
and the body, it never did before; has that changed, Philip?  Or are you
suddenly receiving mail that has two blank lines between the head and the
body, such that the second one is legitimately part of the body but you don't
want to count it?

Hmm.  That last one can be tested for easily.  Try this:

  :0fh
  * ! H ?? ^Lines:
  * B ?? ()\/.+$(.*$)*
  * 1^1 MATCH ?? ^.*$
  * -1^0
  | formail -A "Lines: $="
