RE: How to use Procmail to remove messages from server after x number of days

2011-03-01 11:28:58
At 08:36 2011-03-01, Komal Tagdiwala (ktagdiwa) wrote:
You can use email clients like Thunderbird which can be configured to
PURGE messages older than X number of days.

I'd presume that the OP is doing IT for an office or a small ISP, and wants to limit the amount of stored mail. It would be difficult to consistently enforce the use of a specific email client, and ensure that they're all configured to purge messages, etc.

Besides, there's a HUGE failing in that approach: if someone doesn't log in for a month, their client software isn't connecting and purging the messages at all. So, the ACTIVE users might purge their messages with a client configured to do so, but the INACTIVE users continue to have email pile up.

An ISP I adminned for didn't much care how much or how long you stored your email. Beyond a certain amount of disk use, it was all billable storage - if someone wants to leave their email on the server indefinitely, then let them - it would reflect in their bill. Obviously, that approach won't work for a corporate environment.

I question the logic of deleting messages > 10 days old though - in a business environment, there have got to be older messages that need to be preserved, and in a customer/user environment, I'd flee the service the instant they purged my email (ok, never mind that I don't entrust my email to other people's servers anyway).


As to actually purging the messages, you need to determine if the messages are folder or mailbox format. The former would be easier to delete individual messages from, but isn't something you'd use procmail for. You also need to know whether the users might have processes manipulating their mailboxes (shell access, their own procmail), or if it's just the LDA + POP.
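
A rough way to tell the two apart from a script, as a minimal sketch: it assumes "folder format" means Maildir (cur/, new/ and tmp/ subdirectories) and "mailbox format" means a single mbox file starting with a From_ line. The function names are made up for illustration, MH-style folders are not detected, and an empty mbox file is reported as unknown.

```shell
#!/bin/sh
# Classify a mail store as "maildir", "mbox", or "unknown".
# Assumption: a Maildir has cur/, new/ and tmp/ subdirectories, and an
# mbox is a regular file whose first line starts with "From ".
mailstore_format() {
    store=$1
    if [ -d "$store/cur" ] && [ -d "$store/new" ] && [ -d "$store/tmp" ]; then
        echo maildir
    elif [ -f "$store" ] && head -n 1 "$store" | grep -q '^From '; then
        echo mbox
    else
        echo unknown
    fi
}

# List Maildir messages by file modification time.  find discards
# fractions of a day, so "+10" matches files at least 11 days old.
# This only prints; deleting is left to the caller once the list looks right.
maildir_older_than() {
    find "$1/cur" "$1/new" -type f -mtime +"$2" -print
}
```

For a Maildir, something like `maildir_older_than /path/to/Maildir 10` gives you the candidates without procmail being involved at all. Note that it goes by file mtime, not by any date inside the message.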

You should be able to lock the mailbox, move it, and unlock it; newly arriving email (which is 0 days old) then shows up in the normal mailbox while you're reprocessing the old mail. You reprocess the moved mailbox into a new file, discarding messages that meet your criteria (I'd use the date in the From_ line, FWIW). When done, lock the main mailbox file, concatenate it (if there's anything in it) onto your reprocessed file (no per-message examination necessary), replace the main mailbox with the reprocessed one, then unlock it. During all of this, arriving mail shouldn't be locked out for more than a few cycles.
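
The cycle described above might be sketched in shell roughly like this. It is untested against a live spool and makes several assumptions: procmail's lockfile(1) and formail(1) are installed, everything else that touches the mailbox honours the same MBOX.lock dotlock, and the rcfile you pass in appends the messages worth keeping to the file named by $KEPT. The function name, the KEPT variable and the work-file names are all made up for illustration.

```shell
#!/bin/sh
# purge_mbox MBOX RCFILE
# Rotate / reprocess / merge cycle for a single mbox file.
# Assumption: RCFILE (hypothetical) stores surviving messages in $KEPT.
purge_mbox() {
    mbox=$1
    rcfile=$2
    old="$mbox.old.$$"      # illustrative work-file names
    kept="$mbox.kept.$$"

    # 1. Lock, move the live mailbox aside, unlock.  New mail now lands
    #    in a fresh $mbox while the old copy is examined at leisure.
    lockfile -r 5 "$mbox.lock" || return 1
    mv "$mbox" "$old" || { rm -f "$mbox.lock"; return 1; }
    rm -f "$mbox.lock"

    # 2. Split the old mailbox and feed each message to procmail in
    #    filter mode; on failure the old copy is left in place.
    : > "$kept"
    formail -s procmail -m KEPT="$kept" "$rcfile" < "$old" || return 1

    # 3. Lock again, append whatever arrived in the meantime (no
    #    per-message examination), and swap the result into place.
    lockfile -r 5 "$mbox.lock" || return 1
    if [ -s "$mbox" ]; then
        cat "$mbox" >> "$kept"
    fi
    mv "$kept" "$mbox"
    rm -f "$mbox.lock" "$old"
}
```

Files created this way belong to whoever runs the script, so the owner and mode of the replaced mailbox would need restoring before the LDA and POP server see it again.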

Most of what you need to accomplish is shell scripting. The recipe for procmail to essentially /dev/null messages over 10 days old is near trivial. This here is untested:

# Extract the date from From_, as well as the current time.
# both represented in seconds from epoch.
# you're on your own if there isn't a valid date in the From_ header
# (but WHY would that happen?)
:0
* ^From[        ]+[^    ]+ +\/[^        ].*
{
        FROM_SECS=`date --date "$MATCH" +%s`
        CURRENT_SECS=`date +%s`

        # establish your threshold, in seconds
        # 10 * 24 * 60 * 60 = 864000
        THRESHOLD_SECS=864000

        # use scoring for primitive math evaluations.
        # The current time, minus the original message time is how old the
        # message is (which should be a positive value).  Subtract our
        # threshold from that, and if the threshold is greater than the
        # message age, then this evaluates with a negative result, and we
        # don't discard.  If the threshold is less than the message age
        # (i.e. > 10d), then we have a positive result and evaluate true
        # -- discarding the message as our action.
        # you could instead store it to a separate mailbox to be handled by
        # the script which invoked procmail.  If you do, be sure to use the
        # locking flag.
        :0
        * $ $CURRENT_SECS^0
        * $ -$FROM_SECS^0
        * $ -$THRESHOLD_SECS^0
        /dev/null
}
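
If you'd rather sanity-check the date arithmetic outside of procmail first, the same comparison can be done in plain shell. This is only a sketch: it assumes GNU date (for --date, as the recipe does), and the function names and the optional "now" argument are made up here so the thing can be exercised with fixed values.

```shell
#!/bin/sh
# Print the age in seconds of a message, given the date portion of its
# From_ line.  Assumes GNU date; an optional second argument overrides
# the current time (seconds since epoch), which is handy for testing.
message_age_secs() {
    msg_secs=$(date --date "$1" +%s) || return 1
    now_secs=${2:-$(date +%s)}
    echo $(( now_secs - msg_secs ))
}

# Succeed when the message is strictly older than the threshold, which
# mirrors the recipe: the score must be positive for the action to run.
# Usage: is_expired DATE THRESHOLD_SECS [NOW_SECS]
is_expired() {
    age=$(message_age_secs "$1" "$3") || return 2
    [ "$age" -gt "$2" ]
}
```

For example, `is_expired "Tue Mar  1 08:36:00 2011" 864000 && echo purge` reports whether a message with that From_ date is past the 10 day threshold right now.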


However, there's the issue of whether various processes properly lock, or what they do to the mailbox while they're handling it - your POP server for instance may make a copy of the mailbox and DELE from their client software may purge from that copy. Same goes for a shell based MUA, and who knows about web-based access. If you process the user mailbox while it's being manipulated by another process (or rather, while most of its content has been copied), you could have very unpredictable results.

A "simple" software upgrade of a server package could complicate things as well.

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)de
http://mailman.rwth-aachen.de/mailman/listinfo/procmail
