procmail

Re: formail -D generates empty msg. when duplicate found?

1999-12-11 15:11:30
On Dec 11,  1:36pm, Dallman Ross wrote:
Subject: Re: formail -D generates empty msg. when duplicate found?
From: gary(_at_)Intrepid(_dot_)Com (Gary Funck)
I'd read "not output a duplicate message" as meaning "nothing
will be sent to the output".  Also, does a mail message with
only a "From " line qualify as a valid mail message?

The problem with sending something through is that it will
confuse simple scripts like this:

   formail -D 10000 id.cache -s echo . < mbox | wc -l

which attempts to count the number of unique messages, but
will actually count all messages, due to the dummy 'From '
being passed through.

Just to clarify: the point of the e-mail above was to show
the usefulness (albeit limited) of having formail _not_ generate
any sort of output if the -D criterion is met.
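
For comparison, here is a minimal sketch of counting unique messages without depending on what formail emits for a duplicate. It assumes a well-formed mbox (body lines starting with "From " are escaped) and unfolded Message-ID headers; the function name count_unique_ids is made up for this example and is not part of formail or procmail.

```shell
# Hypothetical helper: count distinct Message-ID values in an mbox file.
# Assumes each message starts with a "From " separator line and that
# the Message-ID header is not folded onto a continuation line.
count_unique_ids() {
    awk '
        /^From / { in_header = 1; next }            # start of a new message
        in_header && /^$/ { in_header = 0; next }   # blank line ends the header
        in_header && tolower($0) ~ /^message-id:/ {
            id = $0
            sub(/^[^:]*:[ \t]*/, "", id)            # strip the header name
            if (!(id in seen)) { seen[id] = 1; n++ }
        }
        END { print n + 0 }
    ' "$1"
}
```

Usage would be something like `count_unique_ids mbox`. Messages that lack a Message-ID header are not counted at all by this sketch.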


Are you sure your scheme must rely on counting messages?
Is anyone else doing this?  (I'd like to see the recipes.)
Why not do something more mundane, if more useful, and count
bytes and delete after a certain limit?

Actually, right now, once a month, I append each mailing list's current
mbox to its archive if it exceeds a certain byte limit.
The only problem with that approach is that if I want to look at recent
mails, and they've been moved to the archive, I end up opening a
very big mailbox.  What I'd rather do, is just look at the last
N messages, and have those on hand, all the time.  Also, I have
a tendency of keeping a lot of old messages around in my inbox,
'just in case I might need them later', and would like to have
the older messages automatically pruned.  Same goes for my .sent file.
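
A rough sketch of the "keep only the last N messages" idea, using grep and awk rather than formail. The function name keep_last_n is invented for illustration, and the sketch assumes a well-formed mbox in which every message begins with a "From " line and body lines starting with "From " are escaped.

```shell
# Hypothetical helper: print the last N messages of an mbox to stdout.
#   $1 = mbox file, $2 = number of messages to keep
keep_last_n() {
    # grep -c exits non-zero when nothing matches, hence the "|| true"
    total=$(grep -c '^From ' "$1" || true)
    awk -v skip="$((total - $2))" '
        /^From / { seen++ }     # count message separators as we go
        seen > skip { print }   # print once past the messages to drop
    ' "$1"
}
```

It writes to stdout, so pruning in place would look like `keep_last_n inbox 100 > inbox.new && mv inbox.new inbox`, done while holding a lock on the mailbox so that no delivery happens in between.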

The other difficulty with checking for byte limits is that it is more
difficult to write the script so that it prunes only N bytes off the
front of the mailbox ... the pruning needs to occur at message
boundaries.  Still, it might make sense to prune on the basis of the
number of bytes, as well as the number of messages, i.e.: "prune this
mail box until it contains less than 250,000 bytes and less than 100
messages".  That will make the logic a little more complicated though,
because you'd need to count backwards from the most recent message to
arrive at the number of messages to keep.
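
The counting-backwards logic could be sketched as below, as two passes over the same file in one awk run. The function name prune_by_size_and_count is hypothetical, both limits are treated as inclusive ("at most") for simplicity, and a well-formed mbox is assumed.

```shell
# Hypothetical helper: print the newest messages of an mbox that fit
# within both a message-count limit and a byte limit.
#   $1 = mbox file, $2 = max messages, $3 = max bytes
prune_by_size_and_count() {
    # LC_ALL=C makes length() count bytes rather than characters
    LC_ALL=C awk -v maxmsgs="$2" -v maxbytes="$3" '
        NR == FNR {                       # pass 1: record each message size
            if (/^From /) n++
            size[n] += length($0) + 1     # +1 for the newline
            next
        }
        FNR == 1 {                        # pass 2 begins: walk back from the end
            keep = 0; bytes = 0
            for (i = n; i >= 1; i--) {
                if (keep + 1 > maxmsgs || bytes + size[i] > maxbytes) break
                keep++; bytes += size[i]
            }
            first = n - keep + 1          # index of the oldest message kept
        }
        /^From / { m++ }
        m >= first { print }              # cuts only at message boundaries
    ' "$1" "$1"
}
```

For the limits quoted above this would be called as `prune_by_size_and_count inbox 100 250000 > inbox.new`. If even the newest message alone exceeds the byte limit, nothing is printed, so a real script would probably want to special-case that.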


What I myself am thinking of implementing is a backup
of all non-list mail that would then get cleaned out by a cron job
and find after so many days.  I do this now (have for years) with
my procmail logs and auto-ack respondent database (everybody gets
an auto-ack no more than weekly from each unique sending address,
unless they request verbose acks or no acks).  It would be simple
to add saving mail to files each day and then use find
to delete mail over, say, 7 days old.  I would also gzip the
mail.
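
A sketch of what such a cron-driven cleanup might look like, assuming the backups are per-day files in a single directory. The function names and the directory layout are assumptions made for this example, not a description of the actual setup.

```shell
# Hypothetical helpers for a daily cron job.

# Compress backup files at least a day old that are not yet gzipped.
#   $1 = backup directory
compress_backups() {
    find "$1" -type f ! -name '*.gz' -mtime +0 -exec gzip {} +
}

# Delete backup files older than the given number of days.
#   $1 = backup directory, $2 = age in days
expire_backups() {
    find "$1" -type f -mtime +"$2" -exec rm -f {} +
}
```

Note that find rounds file ages down to whole 24-hour periods, so `-mtime +7` matches files that are at least eight days old.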

If I understand your suggestion above, this would require saving
each e-mail message into a separate file?  I was trying to avoid
that.  I prefer keeping related mail in a single file.