fetchmail-friends

Re: [fetchmail]fetchmail fork?

2004-05-25 02:36:30
Brian Candler wrote:

On Mon, May 24, 2004 at 09:25:44PM +0200, Matthias Andree wrote:

This point doesn't seem to have been taken further by anyone; and indeed, if
all we're interested in is keeping our beloved and functional fetchmail
running with minimal work, maybe it's not relevant.

But if we *were* to rewrite fetchmail from scratch, I'd suggest a completely
new architecture.

Background: I recently came across 'unison' and now use it to synchronise
filesystems on my laptop and desktop machines. The way it works is to keep a
state file on each machine, listing the files within the managed directory
tree, their timestamps and checksums. When you run 'unison' it checks which
files have been created/modified/deleted on each machine, and propagates the
changes from one to the other. The only thing that needs manual resolution
is when the same file has been changed on both machines independently. I now
keep all my mail folders in Maildir format rather than mbox, and that
enables it to do two-way propagation of changes.
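
The state-file idea can be sketched roughly as below. This is only an
illustration, not how unison itself is implemented: it assumes a single
shared snapshot (`base`) mapping each path to a checksum, and the names
`changed_paths` and `reconcile` are made up for the example.

```python
def changed_paths(base, now):
    """Paths created, modified, or deleted since the saved snapshot."""
    return {p for p in base.keys() | now.keys() if base.get(p) != now.get(p)}


def reconcile(base, a, b):
    """Split changes into: propagate A->B, propagate B->A, and conflicts.

    base, a, b are dicts of path -> checksum; a missing key means the
    file does not exist on that side.
    """
    changed_a = changed_paths(base, a)
    changed_b = changed_paths(base, b)
    # Changed on both sides with different results: needs manual resolution.
    conflicts = {p for p in changed_a & changed_b if a.get(p) != b.get(p)}
    to_b = sorted(changed_a - changed_b)
    to_a = sorted(changed_b - changed_a)
    return to_b, to_a, sorted(conflicts)
```

With one message per file, as in Maildir, a change to one message never
touches another file, which is presumably why conflicts become rare there.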

What I'd propose is something similar, but for mailboxes. It would consist
of a central core plus a number of plug-in protocol modules (pop3, imap,
smtp, read mailspool directly, exec MTA directly, etc).

Interestingly enough, I'm writing something very similar for Windows machines to replace Microsoft's horrible implementation of Roaming Profiles (which is basically a synchronized offline copy of user data, for the non-Windows admins out there). The only reason I mention it is that synchronization is a problem that interests me, and I just finished coding a synchronization algorithm for Windows file data (comparing metadata, determining which files to patch using librsync, which to copy, which to delete, etc.).

Each module supports a few simple primitives:
- OPEN (connect to server, authenticate if required/requested)
- LIST (list all messages within mailbox, returning UID for each)
- FETCH (fetch a given message, and its envelope if available)
- DELETE
- PUT (store a message)
- CLOSE

The SMTP module would always return null for the LIST operation; it's a
PUT-only protocol. Similarly, POP3 would not support PUT.
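
As a rough sketch of what such a module interface might look like (shown
in Python purely for brevity; every class and method name here is
hypothetical, and the two concrete classes are in-memory stand-ins rather
than real POP3 or SMTP code):

```python
class UnsupportedOperation(Exception):
    """Raised when a module does not implement a primitive."""


class MailModule:
    """Hypothetical plug-in base class; unsupported primitives raise."""

    def open(self):
        """OPEN: connect and authenticate (nothing to do in these stand-ins)."""

    def close(self):
        """CLOSE: release the connection."""

    def list(self):
        """LIST: return the UIDs of all messages; empty if not listable."""
        return []

    def fetch(self, uid):
        """FETCH: return (message, envelope); envelope may be None."""
        raise UnsupportedOperation("FETCH")

    def delete(self, uid):
        raise UnsupportedOperation("DELETE")

    def put(self, message, envelope=None):
        raise UnsupportedOperation("PUT")


class ReadOnlySource(MailModule):
    """Stand-in for a POP3-like module: LIST, FETCH, DELETE, but no PUT."""

    def __init__(self, messages):
        self.messages = dict(messages)  # uid -> raw message bytes

    def list(self):
        return sorted(self.messages)

    def fetch(self, uid):
        return self.messages[uid], None  # no envelope available here

    def delete(self, uid):
        del self.messages[uid]


class PutOnlySink(MailModule):
    """Stand-in for an SMTP-like module: PUT only, LIST is always empty."""

    def __init__(self):
        self.delivered = []

    def put(self, message, envelope=None):
        self.delivered.append((message, envelope))
```

Defaulting every primitive to "unsupported" lets the core ask a module
for anything and handle the refusal in one place.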

Then the core module would implement a couple of basic functions:
- synchronise A and B (bi-directional update)
- copy new mail in A to B, optionally deleting the new mail from A
 (essentially the current fetchmail functionality: any deletions in A are
 not propagated to B, and any deletions/updates in B are not propagated
 back to A)
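
The one-way copy might look something like the following minimal sketch.
It assumes each side offers list/fetch/put/delete calls as described
above and that the core remembers which UIDs it has already seen; the
`DictBox` class and the `copy_new` name are invented for illustration.

```python
class DictBox:
    """Toy in-memory mailbox used only to exercise the sketch."""

    def __init__(self, messages=None):
        self.messages = dict(messages or {})  # uid -> raw message bytes
        self.counter = 0

    def list(self):
        return sorted(self.messages)

    def fetch(self, uid):
        return self.messages[uid], None  # (message, envelope)

    def delete(self, uid):
        del self.messages[uid]

    def put(self, message, envelope=None):
        self.counter += 1
        uid = "new%d" % self.counter
        self.messages[uid] = message
        return uid


def copy_new(source, dest, seen, delete=False):
    """Copy messages whose UID is not in `seen` from source to dest.

    `seen` is a set of source UIDs handled on earlier runs; it is updated
    in place. Deletions on either side are deliberately not propagated.
    """
    copied = []
    for uid in source.list():  # list() returns a fresh list, safe to delete
        if uid in seen:
            continue
        message, envelope = source.fetch(uid)
        dest.put(message, envelope)
        seen.add(uid)
        copied.append(uid)
        if delete:
            source.delete(uid)
    return copied
```

In a real run the `seen` set would have to be saved between invocations,
much like the state file in the unison example.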

This is easily done by checking the list of UIDs on each side, and noting
which messages are new and which have gone, and propagating changes
appropriately. Once we've retrieved a message we can always calculate the
MD5 hash of the headers, so we can have an optional duplicate-removal system
in the core to prevent the same message propagating twice even if its UID
changes. For POP3, we would not support 'LAST' at all; anyone who wants to
leave mail on the server on a box which doesn't support UIDL would have to
rely on this duplicate removal, or else use POP3 in the way it was intended
(i.e. read and delete).
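
The header-hash check could be sketched like this, using Python's
standard hashlib. The function names are made up for the example, and it
assumes the headers end at the first blank line of the raw message.

```python
import hashlib


def header_digest(raw):
    """MD5 hex digest of the header block of a raw message (bytes)."""
    normalised = raw.replace(b"\r\n", b"\n")  # treat CRLF and LF alike
    headers, _, _ = normalised.partition(b"\n\n")
    return hashlib.md5(headers).hexdigest()


def drop_duplicates(messages, seen):
    """Return the messages whose header digest is not yet in `seen`.

    `seen` is a set of digests from earlier runs and is updated in place.
    """
    fresh = []
    for raw in messages:
        digest = header_digest(raw)
        if digest in seen:
            continue
        seen.add(digest)
        fresh.append(raw)
    return fresh
```

One caveat: headers such as Received can differ between two copies of the
same message, so a real implementation might prefer to hash only a chosen
subset of headers.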

You end up with a system which can emulate fetchmail (just tell it to
synchronise 'pop3' to 'smtp' for example), but which also can be used to
propagate new mail from one IMAP mailbox to another, or even perform two-way
replication of IMAP mailboxes or from IMAP to a local Maildir++ spool. That
way, you can keep an entire copy of your mailbox on your laptop, make local
updates, and dump the changes to your IMAP server the next time you connect.
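
For the two-way case, the planning step might be sketched as follows. It
assumes the core saves, after each run, the pairs of UIDs that refer to
the same message on each side (the two stores will generally assign
different UIDs); `plan_sync` and the plan keys are hypothetical names.

```python
def plan_sync(pairs, uids_a, uids_b):
    """Work out what to copy and delete to bring A and B back in step.

    pairs:  set of (uid_a, uid_b) tuples recorded at the previous run
    uids_a: UIDs currently listed by side A
    uids_b: UIDs currently listed by side B
    """
    now_a, now_b = set(uids_a), set(uids_b)
    known_a = {a for a, _ in pairs}
    known_b = {b for _, b in pairs}
    return {
        # Never seen before on one side: new mail to copy across.
        "copy_to_b": sorted(now_a - known_a),
        "copy_to_a": sorted(now_b - known_b),
        # Previously paired, now gone from one side: delete the partner.
        "delete_from_b": sorted(b for a, b in pairs
                                if a not in now_a and b in now_b),
        "delete_from_a": sorted(a for a, b in pairs
                                if b not in now_b and a in now_a),
    }
```

A message deleted on both sides simply drops out of the plan, since
neither half of its pair is listed any more.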

Anyone interested in this as a concept?

Actually, I'm very interested in this concept. I was considering today what it would take to synchronize my inbox on my server to a local copy on my home workstation. Also, seeing as I still run Windows on my laptop, I was considering what would be necessary, if I were to move to Linux, to provide protocol-independent offline operation for my IMAP (Exchange) mailbox.

Although you've described an architecture which would work well for fetchmail emulation, you've also described an architecture which could be generalized to support any sort of synchronization, given unique identifiers to qualify the information and proper metadata to determine changes. Not sure if generalizing it to that level interests you; if it does, maybe we should take it off-list :).

However, I think it would be a very interesting project to emulate fetchmail functionality with a more generalized architecture that could be more easily expanded by adding separate servers which would adhere to a specified protocol.

Personally I'd prefer if it were written in C, simply for portability
reasons. I don't like Python, and I'm loath to be forced to install it just
to get fetchmail functionality. If I were writing it in a scripting language
I'd use Ruby, but then I'm sure other people are as averse to being
forced to install that as I am to Python.

Bits of fetchmail's C code could be recycled; socket.c in particular I find
a very useful reference on how to invoke OpenSSL.

Regards,

Brian.

I have to admit I'd think a scripting language, or at least something that was object oriented, would lend itself to an easier development model for a rewrite. I don't mind coding in C, but I find the additional overhead for memory management etc. a waste of time in a project like this. Perhaps this would be a good candidate for Mono/C#? I've not coded in C# yet, but I've been looking for a project to play with it.

Clint