Johan Vromans recently wrote this:
From "design-notes.html", section "Multiple concurrent instances of
fetchmail":
Fetchmail locking is on a per-invoking-user basis because finer-grained
locks would be really hard to implement in a portable way. The
problem is that you don't want two fetchmails querying the same
site for the same remote user at the same time.
Locking on a per-invoking-user basis only prevents this for the cases
where the same user would invoke multiple fetchmails by accident. It
is still possible for multiple users to --independently-- start a
fetchmail that queries the same host for the same remote user.
I agree that "the optimal solution" is very hard to implement. But the
current locking scheme has an annoying side-effect. For example, I run
a fetchmail daemon to fetch my email from a remote site. Occasionally,
I need to fetch some email from another POP host. To do this, I first
need to stop the background fetchmail, run the dedicated fetchmail,
and then restart the background fetchmail. This is annoying, since
fetching email from one system does not interfere with a fetch from
another host.
A lockfile per invoking-user/hostname combination may be an
improvement. It doesn't make the situation worse than it is, and it
would solve the above-mentioned problems.
He's got a point. I sat down and thought about this for a while, and
there are a couple of possible solutions to this problem.
SIMPLE WAY:
One is to scrap the client-side locking entirely and rely on the POP and
IMAP servers to properly serialize or lock out concurrent access to
the same mailbox by more than one session.
This would have the advantage of simplifying the code considerably.
All the lockfile stuff would just drop out. The reason I put in
locking in the first place was to keep users from screwing themselves
with certain old, flaky POP3 servers that had no busy locking at all,
like the old UCSD pop3d implementation. But that was five years ago;
it's a different world now, and even the worst POP3 implementations
still in use aren't that bad any more (IMAP servers always handled
this better).
The big disadvantage of this path is that "fetchmail -q" would stop
working, and running "fetchmail" in foreground with another instance
in background would no longer simply wake up the background instance.
On one level this is an implementation problem -- I use the lockfile
as a rendezvous where foreground instances of fetchmail can find the
PID of the unique background instance. On another level it's more
fundamental -- without locking, there is not necessarily a unique
background instance at all. How would we know which one to signal?
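For concreteness, the rendezvous is roughly the sketch below: the
background instance writes its PID into a per-user lockfile, and a
later foreground run reads it back and signals it. The lockfile
containing nothing but the PID and SIGUSR1 as the wakeup signal are
illustrative assumptions here, not a spec of the shipping code; a -q
run would do the same dance with a terminating signal instead of a
wakeup.

#include <sys/types.h>
#include <signal.h>
#include <stdio.h>

static int wakeup_background(const char *lockfile)
{
    FILE *fp = fopen(lockfile, "r");
    long pid;

    if (fp == NULL)
        return -1;                    /* no background instance registered */
    if (fscanf(fp, "%ld", &pid) != 1) {
        fclose(fp);
        return -1;                    /* lockfile exists but is garbled */
    }
    fclose(fp);
    return kill((pid_t)pid, SIGUSR1); /* poke the unique daemon instance */
}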
COMPLEX WAY:
There is a solution to the -q problem, but it's hairy. Each fetchmail
instance would have to post its PID to a common database; we'd
interpret `fetchmail' as "if there are background instances running,
wake them all up; otherwise background yourself", and `fetchmail -q'
as "terminate all other instances". We could play cute games with
allowing a subset of running instances to be selected by the
command-line arguments, in the same way that servers to be polled
are selected now.
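Spelled out, the intended semantics might look something like this;
registry_pids() is a made-up stand-in for whatever query the common
database ends up providing:

#include <stddef.h>
#include <sys/types.h>
#include <signal.h>

/* hypothetical helper: fill buf with the PIDs of all registered
   instances, return how many there are */
extern size_t registry_pids(pid_t *buf, size_t max);

/* plain `fetchmail': wake everybody up, or become the daemon */
static void plain_invocation(void)
{
    pid_t pids[64];
    size_t i, n = registry_pids(pids, 64);

    if (n == 0) {
        /* nobody running: background ourselves and start polling */
        return;
    }
    for (i = 0; i < n; i++)
        kill(pids[i], SIGUSR1);       /* wake each background instance */
}

/* `fetchmail -q': terminate all other instances */
static void quit_invocation(void)
{
    pid_t pids[64];
    size_t i, n = registry_pids(pids, 64);

    for (i = 0; i < n; i++)
        kill(pids[i], SIGTERM);
}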
If we did this, we'd get mailbox locking for free -- fetchmail
instances could register a lock in the database at the start of a
poll, and remove it at the end.
Implementing such a database is complicated by the fact that we can't
assume all relevant instances will be run under the same user ID.
Some might be daemons started by a sysadmin and running in root mode.
This makes using the filesystem for the database problematic -- all
instances would have to run suid to get read-write access to some
shared database in system-land.
There is a way, however -- shared-memory segments (which have the
desirable properties that they automatically vanish on a reboot and
can be made to vanish automatically when the number of processes
referencing one goes to zero). In this design, all fetchmail instances
would post their PIDs, locks, and foreground/background status to a
common shared-memory segment.
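Here's roughly what that might look like with the System V
primitives. The key, the slot layout, and the cleanup strategy are
all guesses for illustration; the world-writable mode is what would
let a root daemon and per-user instances share one registry, and the
per-slot host/user fields are where the "free" mailbox locking
mentioned above would live.

#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

#define FM_SHM_KEY   0x46455443      /* arbitrary key, spells "FETC" */
#define FM_MAX_SLOTS 64

struct fm_slot {
    pid_t pid;                       /* 0 == slot free */
    int   background;                /* daemon or one-shot foreground run? */
    char  pollhost[256];             /* host/user currently being polled -- */
    char  remoteuser[64];            /* this doubles as the mailbox lock */
};

/* create-or-attach the registry; mode 0666 so root daemons and
   per-user instances can all read and write it */
static struct fm_slot *attach_registry(int *idp)
{
    void *base;
    int id = shmget(FM_SHM_KEY, FM_MAX_SLOTS * sizeof(struct fm_slot),
                    IPC_CREAT | 0666);

    if (id == -1)
        return NULL;
    base = shmat(id, NULL, 0);
    if (base == (void *)-1)
        return NULL;
    *idp = id;
    return (struct fm_slot *)base;
}

/* on exit: detach, and if we were the last instance attached, remove
   the segment so it doesn't outlive the processes using it */
static void detach_registry(struct fm_slot *reg, int id)
{
    struct shmid_ds ds;

    shmdt(reg);
    if (shmctl(id, IPC_STAT, &ds) == 0 && ds.shm_nattch == 0)
        shmctl(id, IPC_RMID, NULL);
}

There's an obvious race between two instances grabbing the same free
slot, so a real version would want a SysV semaphore alongside the
segment; the sketch is just to show the shape of the thing.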
The drawbacks of this approach are (a) complexity, and (b) the fact
that not all Unixes support the shared-memory primitives. So -q would
break on some systems.
WHAT TO DO?
I'm tossing this out for discussion. The three basic alternatives are:
(a) leave the present locking alone in order to keep -q working everywhere.
(b) scrap client-side locking entirely -- multiple concurrent
instances would be possible, but -q would stop working.
(c) do the funky chicken with shared memory -- multiple concurrent
instances would be possible, and -q would mostly work but might
break on some platforms.
I don't expect this will go in before 5.8.0, anyway.
--
<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>
A right is not what someone gives you; it's what no one can take from you.
-- Ramsey Clark