[Nmh-workers] Thoughts on IMAP

2012-02-07 11:37:05
There seem to be a few misconceptions about how IMAP works.  Let me try to 
explain in a bit more detail about how I think IMAP would work with MH.

First, and perhaps most important, IMAP support does not preclude keeping local 
copies of the message content in the native MH store.  MH would grow support to 
become a "disconnected mode" IMAP client.  In a nutshell, this means MH would 
keep a local cache of the content on the IMAP server.  But it would do so 
intelligently, taking full advantage of IMAP's ability to selectively address 
the individual MIME components that make up a message.  You could choose to 
cache the entire folder contents (offline mode), only certain MIME types (e.g. 
text/plain and application/pdf), or nothing at all (online mode), depending on 
your needs.  The key is to design in sufficient flexibility to allow people to 
craft a configuration that meets their needs.
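To make the caching choices above concrete, here is a minimal sketch in Python. It is an illustration only, not proposed nmh code: the names leaf_parts and parts_to_cache and the policy values are hypothetical. It numbers the leaf MIME parts the way IMAP section specifiers do (the numbers used in a FETCH item such as BODY[1.1]) and picks the ones a given policy would cache. It assumes the message has no message/rfc822 parts, which IMAP numbers differently.

```python
from email import message_from_string

def leaf_parts(msg, prefix=""):
    """Yield (section, content_type) for each leaf part, numbered IMAP-style."""
    if not msg.is_multipart():
        # A non-multipart message has a single body, addressed as part 1.
        yield (prefix or "1", msg.get_content_type())
        return
    for i, child in enumerate(msg.get_payload(), start=1):
        section = f"{prefix}.{i}" if prefix else str(i)
        if child.is_multipart():
            yield from leaf_parts(child, section)
        else:
            yield (section, child.get_content_type())

def parts_to_cache(msg, policy):
    """policy is "offline" (cache everything), "online" (cache nothing),
    or a set of MIME types to cache. These values are hypothetical."""
    if policy == "online":
        return []
    parts = list(leaf_parts(msg))
    if policy == "offline":
        return [section for section, _ in parts]
    return [section for section, ctype in parts if ctype in policy]

RAW = """\
From: a@example.org
Subject: report
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="outer"

--outer
Content-Type: multipart/alternative; boundary="inner"

--inner
Content-Type: text/plain; charset=utf-8

hello
--inner
Content-Type: text/html; charset=utf-8

<p>hello</p>
--inner--

--outer
Content-Type: application/pdf
Content-Transfer-Encoding: base64

JVBERi0xLjQK
--outer--
"""

msg = message_from_string(RAW)
print(parts_to_cache(msg, {"text/plain", "application/pdf"}))
```

With a policy of text/plain plus application/pdf, only sections 1.1 and 2 of the sample message would be fetched and cached; the text/html alternative stays on the server.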

Second, it's important to support IMAP 'online' mode as well as possible, 
with no more than trivial changes to the existing command line interface.  In 
fact, the entire exercise is pointless if this isn't done, as the alternative 
(suck down INBOX and delete from server) already exists in the form of 
fetchmail.  I can't use MH today as my mainline production client because 
I need access to my mail from literally dozens of machines and 
portable devices.  I bet a lot of the nay-sayers will start singing a very 
different tune when they discover they can have their full MH environment on 
more than one system.

But none of this can happen until the VFS abstraction layer is put in place, 
and that's an entirely separate project, with its fingers in more than just 
the filesystem :-(  (Which should be discussed in a different thread, please.)

Now to open the first can of worms ...

I think the most disruptive change to come out of IMAP would be caching partial 
message content, and how that would be presented in the filesystem.  I think 
this is the crux of Robert's recent comment, and it's the biggest unsolved issue 
I've wrestled with since I first thought about doing this many years ago.  It 
did get me thinking about how I access the message files from external 
programs, and why.  In nearly every case it's because I want to parse the 
content of a text/plain section and act on it. E.g.:

* mailing list management: approve/deny a subscription request or submission to 
a moderated list.

* spam filtering: extract header info to update a blacklist (or whitelist).

* process system status reports: this is my biggest use of MH today. I have 
monitoring scripts that flag out-of-bounds conditions and send a report with a 
suggested action.  I have a wrapper that takes the messages I approve of and 
dispatches the appropriate response.

* smart attachment handling. I get periodic messages from services that contain 
attachments that need to be archived, but which contain useless metadata on the 
attachments themselves. The scripts parse the text part to glean out enough 
information to come up with, e.g., a suitable filename for the attachment, and 
then save it using that name.

* PGP decryption and verification.

The common pattern (other than PGP) is that I'm always dealing with a 
text/plain section, and invariably it's the first text/plain in the message 
body.  I can't remember now the last time I ran any commands of consequence 
across the raw message files, at least for anything in the message body. QP 
encoding of non-ASCII text makes grep a hit-and-miss proposition these days.  
And none of the standard UNIX commands are aware of MIME structure.
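The grep problem is easy to reproduce. The following small Python sketch (the sample sentence is made up for illustration) encodes a Latin-1 string as quoted-printable, the way it would sit in a raw message file, and shows that a byte-for-byte search for the original word no longer matches.

```python
import quopri

text = "Le caf\u00e9 est ferm\u00e9"            # "Le café est fermé"
raw = quopri.encodestring(text.encode("iso-8859-1"))

# What a grep over the raw message file would be searching through.
print(raw)

# The word as the user would type it, in the message's own charset.
needle = "caf\u00e9".encode("iso-8859-1")
print(needle in raw)          # the plain search misses
print(b"caf=E9" in raw)       # only the QP-escaped form is present

# Undoing the transfer encoding and the charset recovers the text.
decoded = quopri.decodestring(raw).decode("iso-8859-1")
print(decoded == text)
```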

So in my specific case, my preference would be to configure my MH instance to 
keep a local cache of just the text/plain parts, and store them in a 
rationalized format, with all MIME encodings and charsets undone (i.e. 
converted to UTF-8).
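As a rough sketch of what "rationalized" could mean here, the Python below pulls the first text/plain part out of a raw message, undoes the transfer encoding and charset, and returns UTF-8 bytes ready to be written to a cache file. The function names and the sample message are hypothetical; this only illustrates the idea and says nothing about how nmh would lay the cache out on disk.

```python
from email import message_from_bytes, policy
from typing import Optional

def first_text_plain(raw: bytes) -> Optional[str]:
    """Return the first text/plain part as decoded text, or None."""
    msg = message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            # get_content() undoes base64/QP and decodes the declared charset.
            return part.get_content()
    return None

def rationalized(raw: bytes) -> Optional[bytes]:
    """UTF-8 form of the first text/plain part, for a grep-friendly cache."""
    text = first_text_plain(raw)
    return None if text is None else text.encode("utf-8")

# Made-up status report, Latin-1 text sent as quoted-printable.
RAW = (
    b"From: monitor@example.org\n"
    b"Subject: status\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=iso-8859-1\n"
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"Temp=E9rature: 85=B0C\n"
)

print(first_text_plain(RAW))
```

A file stored this way can be searched with ordinary UNIX tools, at the cost of no longer being a byte-for-byte copy of what the server holds.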

Obviously this fails the current MH filestore model, in many ways.  But there's 
no way (that I can see) of implementing a partial cache that doesn't break that 
model.  The least disruptive scheme I can come up with would be to store the 
message headers and cached parts (undecoded) in a single file. But this fails 
horribly for anything that tries parsing the MIME body when the message 
contains nested body components.  But perhaps that's not a problem in the real 
world?

I'm interested in hearing how people grovel through the raw files in their MH 
stores.  How do you do MIME processing, or do you grok MIME at all?  What MIME 
types are of interest, and what do you do with them?  (This info is useful for 
more than just the IMAP cache discussion.)

And for general interest's sake, I encourage people to take a look at 
http://plan9.bell-labs.com/magic/man2html/4/upasfs. Upas takes the MH directory 
structure and extends it to support MIME in a very useful manner. (Upas talking 
to IMAP has been my other primary MUA for the past several years). The 
directory layout isn't practical on UNIX systems, but the scheme is brilliant 
in its simplicity.

--lyndon

