
"Kernel-unlock failed" - possible mail loss issue?

1997-10-23 15:28:09
I believe I've just seen procmail lose my first piece of email in
about three years. It's my own fault, I'm not blaming procmail, but
I'm trying to understand what's going on. Please include me personally
in any reply - I'm not on the procmail mailing list.

This is for procmail v3.11pre7 running on an Alpha running OSF1 v4.0
(Digital Unix, I believe). It's configured with these locking strategies:
  Locking strategies:     dotlocking, fcntl(), lockf(), flock()

The symptom is an entry like this in the procmail log:

procmail: Kernel-unlock failed
From majordom(_at_)santafe(_dot_)edu  Thu Oct  9 17:37:19 1997
 Subject: Re: One chance to allocate XColormap color, why?
  Folder: swarm.mbox                                                       1881

and the folder swarm.mbox is either empty or nonexistent (not sure which).
My question is, what should procmail's behaviour *be* when that
Kernel-unlock error shows up? It looks like it just assumes the mail
was delivered correctly and exits as if all was well. In my
circumstance, that results in the message being lost.


My circumstance is a bit weird. I believe the culprit is that I
frequently run bash code like this:
 [ \! -s swarm.mbox ] && rm swarm.mbox; ls -l *.mbox
(The purpose of this code is to delete empty mailboxes. It's not an
atomic check, which is why I think this is all ultimately my fault.)

The directory with my email is NFS mounted both on the machine that's
running procmail and the machine that's running that empty mailbox
code. I think there's a race condition happening. My guess is that at
the moment I do the test, the file is in fact empty. But procmail has
it open and writes to it while my bash function is executing, and then
bash removes the file before procmail is done. So procmail then goes
to unlock the file, which now no longer exists, and the kernel unlock
fails.
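
The suspected sequence can be mimicked on a single machine. This is only a sketch on a local filesystem, not a reproduction of the NFS case: demo_race is a hypothetical helper, file descriptor 3 stands in for procmail's open mailbox, and the message text is made up. Over NFS with two clients the details may differ (the writer may instead see a stale file handle), but the end result is the same kind of loss.

```shell
# Illustration only: an appending writer races with "delete if empty".
# demo_race and the throwaway directory are assumptions for this sketch.
demo_race() {
    dir=$(mktemp -d) || return 1
    mbox="$dir/swarm.mbox"
    : > "$mbox"                        # the empty mailbox exists
    exec 3>> "$mbox"                   # the "delivery agent" opens it for append
    [ ! -s "$mbox" ] && rm "$mbox"     # cleanup sees size 0 and removes the name
    echo "Subject: test message" >&3   # the write still succeeds on the open fd
    exec 3>&-                          # close reports no error
    if [ -e "$mbox" ]; then echo "kept"; else echo "lost"; fi
    rm -rf "$dir"
}
demo_race
```

On a local filesystem the write goes to a file that no longer has a name, so nothing is left in the directory once the descriptor is closed.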

Again, my real question is: can or should procmail do anything to
handle this case? The code in question is in mailfold.c, in the function
dump():
 
        int serrno=errno;                      /* save any error information*/
        if(tofile&&fdunlock())
           nlog("Kernel-unlock failed\n");
        SETerrno(serrno);

It looks like the assignment to serrno is to deliberately *ignore* the
error from fdunlock(). Is that the right thing to do?

While I'm here, if someone knows of a safer way to delete 0 length
mailbox files out from under procmail, I'd love to know. I hate NFS.
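
One possible approach is to take the mailbox's dotlock before testing and removing it, so the cleanup and the delivery cannot overlap. The sketch below is untested against procmail and rests on assumptions: that the delivering recipe uses a local lockfile (so procmail holds swarm.mbox.lock while it writes), and that creating a file under the shell's noclobber option is atomic on the filesystem in use, which may not hold on older NFS. remove_if_empty is a hypothetical name. The lockfile program that ships with procmail is the more robust way to take such a lock; noclobber is used here only to keep the example self-contained.

```shell
# Sketch: remove a mailbox only if it is empty, while holding its dotlock.
# Assumes the delivering process honours a lock file named "$mbox.lock".
remove_if_empty() {
    mbox=$1
    lock="$mbox.lock"
    # With noclobber set, the redirect fails if the lock file already exists.
    if ! ( set -C; : > "$lock" ) 2>/dev/null; then
        return 1                # lock is held elsewhere; leave the mailbox alone
    fi
    # The size test and the removal now both happen under the lock.
    if [ -f "$mbox" ] && [ ! -s "$mbox" ]; then
        rm -f "$mbox"
    fi
    rm -f "$lock"               # release the lock
    return 0
}
```

It would be used in place of the bare test, for example: remove_if_empty swarm.mbox; ls -l *.mbox. When the lock is already held the function gives up rather than waiting, on the theory that the cleanup can simply be retried later.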

thanks
  Nelson
