Hi all,
I raised a problem on this mailing list a while back and got some
pretty good information in response.
Basically, we are seeing situations where procmail will hang for
a given account. When the first daemon hangs, all subsequent
daemons that get started for that account also hang. Eventually,
our mail server starts to reject mail because the stuck procmail
processes for that account drive up the system load average.
Here's the configuration:
Mail Server:
- Challenge L
- IRIX 6.2
- 512MB memory
- August 1st recommended/required SGI patch set is installed.
- Account HOME directories are NFS mounted to this system. The
  HOME directories mostly come from Sun 2.4/2.5 servers.
  The HOME disks are mounted using NFS2.
# procmail -v
procmail v3.11pre7 1997/04/28 written and created by Stephen R. van den Berg
<srb(_at_)cuci(_dot_)nl>
Submit questions/answers to the procmail-related mailinglist by sending to:
<procmail(_at_)informatik(_dot_)rwth-aachen(_dot_)de>
And of course, subscription and information requests for this list to:
<procmail-request(_at_)informatik(_dot_)rwth-aachen(_dot_)de>
Locking strategies: dotlocking, fcntl(), lockf(), flock()
Default rcfile: $HOME/.procmailrc
Your system mailbox: /var/mail/root
As you can see, all of the various "locking" mechanisms are built into
this procmail.
In this configuration, we are seeing problems with locking of the LOGFILE.
Ex.)
---.procmailrc---
LOGFILE=$HOME/.procmail/procmail.log
From the mail server, a given "procmail.log" file will become
inaccessible.
The procmail daemons running for that account hang. Kill signals have
no effect on them, and the processes can't be killed without a system
reboot.
The other strange thing is that "ls -l $HOME/.procmail/procmail.log"
also hangs in this state, and it can't be killed either. You can run
commands against other files in $HOME/.procmail. You just can't run any
command that touches the procmail.log file.
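
A rough sketch of how one might probe for this state without tying up
a login session. It assumes a POSIX shell; the probe_file name, the
"responsive"/"hung" labels, and the timeout value are made up for
illustration and are not part of procmail or IRIX. Since a hung "ls"
can't be killed, each positive result leaves one more stuck process
behind, so this is for spot checks only.

```shell
# probe_file PATH SECONDS (hypothetical helper)
# Runs "ls -l PATH" in the background and polls it once per second.
# Prints "responsive: PATH" if ls finished (even with an error), or
# "hung: PATH" if it was still running after SECONDS.
probe_file() {
    path=$1
    limit=$2
    ls -l "$path" >/dev/null 2>&1 &
    pid=$!
    waited=0
    # kill -0 only tests whether the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            echo "hung: $path"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "responsive: $path"
    return 0
}
```

For example, "probe_file $HOME/.procmail/procmail.log 5" would be run
from the mail server for the affected account.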
I know that you can use LOCKFILE or local lockfiles (":") for recipes.
What type of locking is done for the LOGFILE?
Does that locking depend on the "Locking strategies" that procmail was
built with?
Sometimes we can correct the situation by going to the NFS server and
moving procmail.log to procmail.log.old. The processes that were
already hung stay hung, but new mail for the account is processed fine.
After the move, the hung procmail processes sometimes free up
eventually and die off, which saves us a system reboot.
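
The manual workaround amounts to a single rename done on the NFS
server. A minimal sketch, assuming it is run on the server itself
against the server-local path of the account's .procmail directory
(rotate_log is a hypothetical helper name, not an existing tool):

```shell
# rotate_log DIR (hypothetical helper)
# Moves DIR/procmail.log aside to DIR/procmail.log.old, replacing any
# earlier .old copy. Does nothing if there is no procmail.log.
rotate_log() {
    dir=$1
    if [ -f "$dir/procmail.log" ]; then
        mv "$dir/procmail.log" "$dir/procmail.log.old"
    fi
}
```

This only automates the move described above. It does nothing for
processes that are already hung.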
This has been a real headache for us. We had this problem shortly after
going to IRIX 6.2 a year ago. The problem subsided after installing
SGI's recommended patch set late last year. We have recently started to
see the problem again. It affects different accounts at different
times.
The problem appears to be related to a full HOME disk. We have noticed
a trend: the procmails may get hung up when a disk becomes full.
What should happen if a disk is full and procmail can't append to
LOGFILE?
What should happen if a disk is full and procmail can't deliver to the
user's mail folder on that disk?
We have gotten around this issue by turning off LOGFILE, which seems to
have helped.
Has anyone else seen this problem before?
We are pursuing it with SGI, but it's hard to get a handle on since we
can't reproduce the problem on demand. The fact that "ls -l
procmail.log" hangs is really disturbing. SGI wants to treat this as a
procmail issue.
Thanks in advance,
Steve Kelley
steve(_dot_)kelley(_at_)sdrc(_dot_)com