procmail

Re: pruning mbox's based on number of messages?

1999-12-11 13:37:39
Gary wrote,

| But this still leaves the question: how many messages are in
| the mailbox?  So far, I've thought of two approaches:
| 
| The first approach is rather slow - it splits all the messages
| and runs them through a shell script whose only purpose is
| to print the value of $FILENO.  The last value is extracted
| with 'tail', and then 'sed' removes the leading zeros:
|    set N = `setenv FILENO 000001; \
|             formail -s csh -c 'echo $FILENO' < mailbox | \
|             tail -1 | sed -e 's/^[0]*//'`
| but this approach is slow, and rather complicated.

You could just leave the zeroes on and skip the sed call, or you could setenv
FILENO 1 before starting (formail will widen the field as needed) and never get
the leading zeroes.  Some other ways to get the count, cheaper than running csh
for every message, are

  set N = `setenv FILENO 1 ; formail -s printenv FILENO < mailbox | tail -1`

or

  set N = `formail -s echo . < mailbox | wc -l` # no need to use FILENO
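
For anyone who wants to try the wc -l variant outside csh, here is a minimal sketch as a standalone Bourne-shell script.  It assumes formail is on the PATH; the sample mailbox, its addresses and the variable names are made up for illustration.

```shell
# Build a throwaway two-message mbox (hypothetical sample data).
mbox=$(mktemp)
cat > "$mbox" <<'EOF'
From gary@example.com Sat Dec 11 13:00:00 1999
Subject: one

first body

From dave@example.com Sat Dec 11 13:05:00 1999
Subject: two

second body

EOF

if command -v formail >/dev/null 2>&1; then
    # formail -s runs "echo ." once per message; wc -l counts the lines.
    # tr strips the padding that some wc implementations add.
    n=$(formail -s echo . < "$mbox" | wc -l | tr -d ' ')
    echo "messages: $n"
else
    n=
    echo "formail not found, nothing counted" >&2
fi
rm -f "$mbox"
```

Nothing here depends on FILENO, so there are no leading zeroes to strip.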

| The second approach is simpler and much faster:
|    set N = `grep -c '^From ' mailbox`
| but may not be completely reliable because it doesn't decode MIME
| attachments and such, which I suppose could contain lines that
| begin with 'From '.

Also, messages whose bodies are protected with Content-Length: headers might
have body lines beginning with "From ", which would fool grep as well.
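
To see the overcount, here is a small sketch with a made-up mailbox in which the second message has an unescaped body line starting with "From ".  The stricter pattern is only a heuristic of my own, assuming the usual "From address Day Mon dd hh:mm:ss yyyy" separator; it is not a real mbox parser and a body line that imitates the whole separator would still fool it.

```shell
# Hypothetical sample mbox: two messages, the second with a body line
# that begins with "From " and is not escaped as ">From ".
mbox=$(mktemp)
cat > "$mbox" <<'EOF'
From gary@example.com Sat Dec 11 13:00:00 1999
Subject: one

first body

From dave@example.com Sat Dec 11 13:05:00 1999
Subject: two
Content-Length: 35

From here on the count goes wrong.

EOF

# Naive count: every line that starts with "From ".
naive=$(grep -c '^From ' "$mbox")

# Stricter count: also require an asctime-style date after the address.
strict=$(grep -c '^From [^ ]* *[A-Z][a-z][a-z] [A-Z][a-z][a-z] [ 0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] [0-9][0-9][0-9][0-9]' "$mbox")

echo "naive=$naive strict=$strict"   # prints: naive=3 strict=2
rm -f "$mbox"
```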

| Once N is known, will the following work?

I don't know beans about csh, but it looked correct to me.