
Re: How to use Procmail to remove messages from server after x number of days

2011-03-01 13:16:09
Hi Sean,

Thanks for taking the time to reply in such detail. You are right about our situation: we have a website that is shared with a group of companies all around Asia. We give each member web space and email for conformity, and some users just let the mail pile up until it exceeds our disk space quota.

I need to limit it without mail getting bounced or rejected.

Your points are well taken and some of the uncertainty of what may or may not work as planned makes me want to shy away from the procmail option.

Thanks again, lots of great info here.

On 3/1/2011 11:42 PM, Professional Software Engineering wrote:
At 08:36 2011-03-01, Komal Tagdiwala (ktagdiwa) wrote:
You can use email clients like Thunderbird which can be configured to
PURGE messages older than X number of days.

I'd presume that the OP is doing IT for an office or a small ISP, and
wants to limit the amount of stored mail. It would be difficult to
consistently enforce the use of a specific email client, and ensure that
they're all configured to purge messages, etc.

Besides, there's a HUGE failing in that approach: if someone doesn't log
in for a month, their client software isn't connecting and purging the
messages at all. So, the ACTIVE users might purge their messages with a
client configured to do so, but the INACTIVE users continue to have
email pile up.

An ISP I adminned for didn't much care how much or how long you stored
your email. Beyond a certain amount of disk use, it was all billable
storage - if someone wants to leave their email on the server
indefinitely, then let them - it would reflect in their bill. Obviously,
that approach won't work for a corporate environment.

I question the logic of deleting messages > 10 days old though - in a
business environment, there have got to be older messages that need to
be preserved, and in a customer/user environment, I'd flee the service
the instant they purged my email (ok, nevermind that I don't entrust my
email to other people's servers anyway).


As to actually purging the messages, you need to determine if the
messages are folder or mailbox format. The former would be easier to
delete individual messages from, but isn't something you'd use procmail
for. You also need to know whether the users might have processes
manipulating their mailboxes (shell access, their own procmail), or if
it's just the LDA + POP.
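
If the store turns out to be one file per message, the purge needs no
procmail at all. A minimal sketch, assuming "folder format" here means
something like Maildir (the thread doesn't say), and using file mtime as
a stand-in for message age. The function name purge_maildir and the
10-day default are illustrative only:

```shell
#!/bin/sh
# Sketch: delete messages older than N days from a Maildir-style folder.
# Assumptions: one file per message under cur/ and new/, and file mtime
# approximates delivery time (a restore from backup would break that).
purge_maildir() {
    maildir=$1
    days=${2:-10}
    for sub in cur new; do
        [ -d "$maildir/$sub" ] || continue
        # -mtime +N: last modified more than N whole 24-hour periods ago
        find "$maildir/$sub" -type f -mtime +"$days" -exec rm -f {} +
    done
}
```

Run it as the mailbox owner, e.g. purge_maildir "$HOME/Maildir" 10, and
try it with -print in place of the rm before trusting it.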

You should be able to lock the mailbox and move it, unlock it, then
newly arriving email (which is 0 days old) shows up in the normal
mailbox while you're reprocessing the old mail. You reprocess the moved
mailbox into a new file, discarding messages that meet your criteria
(I'd use the date in the From_ line, FWIW), and then when done, lock the
main mailbox file, concatenate it (if there's anything in it) to your
reprocessed file (no per-message examination necessary), and replace the
main mailbox with the reprocessed one, then unlock it. During this time,
you won't have blocked arriving mail for more than a few cycles.
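
The lock / move / reprocess / merge cycle described above could be
sketched in shell roughly as follows. This is an untested-in-production
outline, not a drop-in tool: the names mbox_lock, mbox_unlock and
expire_mbox are made up for illustration, the filter is whatever command
you supply (it reads an mbox on stdin and writes the messages to keep on
stdout), and ownership and permissions of the replaced mailbox are not
handled. It prefers procmail's lockfile utility and only falls back to a
simplified noclobber dot-lock when lockfile isn't installed.

```shell
#!/bin/sh
# Sketch of the swap / reprocess / merge cycle for a single mbox file.

# Take the dot-lock "$1.lock". Prefers procmail's lockfile(1); the
# noclobber fallback is a simplification and is not NFS-safe.
mbox_lock() {
    if command -v lockfile >/dev/null 2>&1; then
        lockfile -1 -r 30 "$1.lock"
    else
        tries=0
        until ( set -C; : > "$1.lock" ) 2>/dev/null; do
            tries=$((tries + 1))
            [ "$tries" -lt 30 ] || return 1
            sleep 1
        done
    fi
}

mbox_unlock() { rm -f "$1.lock"; }

# Usage: expire_mbox MAILBOX FILTER [FILTER-ARGS...]
expire_mbox() {
    mbox=$1
    shift                       # what remains is the filter command
    old=$mbox.old
    kept=$mbox.kept
    rc=0

    # 1. Lock briefly, move the mailbox aside, unlock.
    mbox_lock "$mbox" || return 1
    if [ ! -s "$mbox" ]; then
        mbox_unlock "$mbox"
        return 0                # nothing to expire
    fi
    mv "$mbox" "$old"
    mbox_unlock "$mbox"

    # 2. Filter the moved copy; arriving mail lands in a fresh $mbox.
    #    If the filter fails, fall back to the untouched copy.
    if "$@" < "$old" > "$kept"; then base=$kept; else base=$old; rc=1; fi

    # 3. Lock again, append mail that arrived meanwhile, swap back in.
    mbox_lock "$mbox" || return 1
    if [ -s "$mbox" ]; then cat "$mbox" >> "$base"; fi
    mv "$base" "$mbox"
    mbox_unlock "$mbox"
    rm -f "$old" "$kept"
    return "$rc"
}
```

If the second lock can't be obtained, the .old and .kept files are left
in place so nothing is lost, but someone has to clean up by hand. None
of this helps if another process (POP server, MUA) is working from its
own copy of the mailbox at the time.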

Most of what you need to accomplish is shell scripting. The recipe for
procmail to essentially /dev/null messages over 10 days old is near
trivial. This here is untested:

# Extract the date from From_, as well as the current time,
# both represented in seconds since the epoch.
# NOTE: --date is GNU date syntax.
# If date can't parse the From_ line, FROM_SECS ends up empty; the
# guard condition in the inner recipe then keeps the message.
# (but WHY would that happen?)
:0
* ^From[ ]+[^ ]+ +\/[^ ].*
{
        FROM_SECS=`date --date "$MATCH" +%s`
        CURRENT_SECS=`date +%s`

        # establish your threshold, in seconds
        # 10 * 24 * 60 * 60 = 864000
        THRESHOLD_SECS=864000

        # Use scoring for primitive math evaluations.
        # The current time minus the original message time is how old
        # the message is (which should be a positive value). Subtract
        # our threshold from that. If the threshold is greater than the
        # message age, the result is negative and we don't discard. If
        # the threshold is less than the message age (i.e. > 10d), the
        # result is positive and the recipe evaluates true, discarding
        # the message as our action.
        # You could instead store it to a separate mailbox to be handled
        # by the script which invoked procmail. If you do, be sure to
        # use the locking flag.
        :0
        * FROM_SECS ?? ^[0-9]+
        * $ $CURRENT_SECS^0
        * $ -$FROM_SECS^0
        * $ -$THRESHOLD_SECS^0
        /dev/null
}
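
Since the recipe is untested, it may help to check the same arithmetic
in plain shell before pointing it at real mail. A small sketch, again
relying on GNU date; the function names from_line_date and is_expired
are invented for this example:

```shell
#!/bin/sh
# Sketch: decide in plain shell whether a From_ line is older than a
# threshold. Relies on GNU date's --date parsing, as the recipe does.

# Print the date portion of a From_ line, e.g.
# "From addr Tue Mar  1 13:16:09 2011" -> "Tue Mar  1 13:16:09 2011".
# Prints nothing if the argument is not a From_ line.
from_line_date() {
    printf '%s\n' "$1" | sed -n 's/^From  *[^ ][^ ]*  *//p'
}

# Exit 0 if the message is older than the threshold, 1 if it is not,
# 2 if no usable date was found (treat that as "keep the message").
# Arguments: From_ line, threshold in seconds, optional "now" (epoch).
is_expired() {
    stamp=$(from_line_date "$1")
    [ -n "$stamp" ] || return 2
    msg_secs=$(date --date "$stamp" +%s) || return 2
    now_secs=${3:-$(date +%s)}
    # age minus threshold: positive means older than the threshold
    [ $((now_secs - msg_secs - $2)) -gt 0 ]
}
```

For example, is_expired "$(head -n 1 some.mbox)" 864000 tells you
whether the first message in some.mbox (a placeholder path) would be
discarded by a 10-day threshold.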


However, there's the issue of whether various processes properly lock,
or what they do to the mailbox while they're handling it - your POP
server for instance may make a copy of the mailbox and DELE from their
client software may purge from that copy. Same goes for a shell based
MUA, and who knows about web-based access. If you process the user
mailbox while it's being manipulated by another process (or rather,
while most of its content has been copied), you could have very
unpredictable results.

A "simple" software upgrade of a server package could complicate things
as well.

---
Sean B. Straw / Professional Software Engineering

Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
Please DO NOT carbon me on list replies. I'll get my copy from the list.

____________________________________________________________
procmail mailing list Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)de
http://mailman.rwth-aachen.de/mailman/listinfo/procmail




--
Best Regards,

Tim Rice
Computer Stuff
Phuket Thailand 83000
Tel: +66 76 376165
Fax: +66 76 376165
www.computerstuff.net
www.phuket-mail.com