Hello Bennett and the others,
"BT" == Bennett Todd <bet(_at_)newritz(_dot_)mordor(_dot_)net> writes:
BT> 1999-05-18-12:37:52 Gjermund Sørseth:
Consider how procmail dot-locks a mailbox - first it creates a file
with a unique name, something like /var/mail/_QVETS. Then it tries to
hardlink /var/mail/user.lock to this file. If the user.lock file already
exists (because the user is already receiving some mail), then procmail
unlinks the temporary file, sleeps for 8 seconds and tries again.
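
The sequence described above can be sketched roughly as follows. This is only an illustration in Python, not procmail's actual source: the names acquire_dotlock and release_dotlock, the retry count, and the "_" scratch-file prefix are all assumptions made for the example. It does show why every failed attempt costs one create, one link and one unlink in the spool directory.

```python
import os
import tempfile
import time


def acquire_dotlock(mailbox, retries=5, delay=8.0):
    """Try to take mailbox + '.lock' by hard-linking a scratch file to it.

    Hypothetical helper for illustration; returns True once the lock is held,
    False if every attempt found the lock file already present.
    """
    lock_path = mailbox + ".lock"
    directory = os.path.dirname(mailbox) or "."
    for _ in range(retries):
        # Create a uniquely named scratch file next to the mailbox.
        fd, tmp_path = tempfile.mkstemp(prefix="_", dir=directory)
        os.close(fd)
        try:
            # link() is atomic: it fails if the lock file already exists.
            os.link(tmp_path, lock_path)
            got_lock = True
        except FileExistsError:
            got_lock = False
        finally:
            # The scratch name is removed after every attempt.
            os.unlink(tmp_path)
        if got_lock:
            return True
        # Somebody else holds the lock: back off, then try again.
        time.sleep(delay)
    return False


def release_dotlock(mailbox):
    """Drop the lock by removing the lock file (hypothetical helper)."""
    os.unlink(mailbox + ".lock")
```

With the default delay of 8 seconds, a few hundred failed attempts per process correspond to a long wait plus the same number of scratch files created and removed.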
After running a number of trusses on both sendmail and procmail to see
what was going on, I now see that your suggestions are all right on
track! I was trying to debug the slowness of our mail
system, and there were 6 procmails awaiting a particular user. I
intercepted 3 of these with truss (which monitors Solaris system
calls), and found 200 to 300 failed attempts to
This implies an equivalent number of opens on /var/mail/_auDQ.irit as
well as unlinks of the same file.
BT> And the remaining problem is, if you have thousands of procmails stacking up
BT> like airplanes outside Newark, the directory will grow large with dotlock
BT> files, and large directories introduce hideous delays under many OSes. Aside
BT> from trying not to create the dotlock file until you've gotten a kernel lock
BT> (if available) the only other fix has to lie outside of procmail, namely fix
BT> the system to use a better-scaling filesystem, one whose performance doesn't
BT> degrade so viciously as the number of directory entries grows large.
BT> Reiserfs claims to be one such, though I haven't tried it out. NetApp's
BT> WAFL is another, and SGI has a third, and that's all the alternatives I've
BT> heard about.
BT> One more thought about leaving that dotlock file around while waiting to
BT> acquire; it'd be a kindness to install a signal catcher for some common
BT> possibilities like INTR, HUP, and TERM, to clean up the scratch file before
I haven't looked at the source code for this, but when I killed a
procmail the dot.lock file did disappear for a split second, before
some other procmail created it again.
There is still the problem of:
fcntl(8, F_SETLKW, 0x00032FB0) (sleeping...)
All my suspended procmails are in this state, if they are not
awaiting the link of the dot.lock file. So there still seems to be
another race condition.
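
To make the truss line above concrete: F_SETLKW is the blocking form of an fcntl record lock request, so the caller sleeps until whoever holds the lock releases it. A minimal sketch of a writer that takes such a lock before appending is shown below. It is an illustration in Python with a hypothetical function name, and it assumes (as is the case in CPython on Unix) that fcntl.lockf without LOCK_NB issues a blocking F_SETLKW request.

```python
import fcntl
import os


def append_with_kernel_lock(path, data):
    """Append bytes to path while holding an exclusive fcntl lock.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        # Blocks here for as long as another process holds the lock.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        try:
            os.write(fd, data)
        finally:
            # Release the lock before closing the descriptor.
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

If several deliveries for the same user queue up behind one slow holder, each of them would show up in truss sleeping in exactly this call.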
Dr. Ralph P. Sobek                    Disclaimer: The above ruminations are my own.
Ralph(_dot_)Sobek(_at_)irit(_dot_)fr  Addresses are ordered by importance.
sobek(_at_)irit(_dot_)fr              If all else fails, try:
Ph:(+33)561558618 FAX:(+33)561556258 http://www.irit.fr/~Ralph.Sobek/
Urgent!! Greenhouse Effect: http://www.irit.fr/~Ralph.Sobek/greenhouse.html