ietf-822

Re: Archives

2004-11-02 08:52:54

On Sun October 31 2004 18:03, Keith Moore wrote:

I'm thinking that 
archives should be based on maildir at the server end,

It's not clear exactly what you mean by "maildir";

http://cr.yp.to/proto/maildir.html is as close to a specification
as I know of.  

but I'm not really thinking that maildirs should be standardized
as the archive format, so much as I'm thinking that it should
be possible to use maildirs as an archive format.  


the reason I've been thinking of maildirs is that they're simple to use,
they're robust, there are already IMAP and POP protocol servers that
support them, and it's not too difficult to make maildirs accessible by
WebDAV and FTP either. they are a tad awkward for WebDAV and FTP access
(as there are reasons you want to support a hierarchy of folders
and maildirs use an extra layer of directories) but I don't think this
is onerous.

I'm assuming that archives are read-only, so archives stored in 
maildirs shouldn't result in files being renamed.  
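As a rough illustration of the read-only case, here is a minimal sketch using
Python's standard mailbox module to list the messages in an archive maildir.
The function name read_archive and the choice of headers returned are
assumptions made for this example, not part of any proposal in this thread.
It only reads messages and never sets flags, so nothing should be renamed.

```python
import mailbox


def read_archive(maildir_path):
    """Return sorted (Message-ID, Subject) pairs for every message.

    The maildir is opened with create=False and no flags are changed,
    so the files under new/ and cur/ are left exactly where they are.
    """
    box = mailbox.Maildir(maildir_path, factory=None, create=False)
    try:
        pairs = []
        for key in box.iterkeys():
            msg = box.get_message(key)
            pairs.append((msg["Message-ID"] or "", msg["Subject"] or ""))
        return sorted(pairs)
    finally:
        box.close()
```

A client of an IMAP or POP server would not touch the files directly like
this; the sketch only shows that the on-disk format is easy to walk.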

though for the  
sake of efficiency I'm wondering about options to use mbox format 
(yeech) and compression,

It's not clear what efficiency would result; mbox format is a pain
to edit (e.g. to remove spam or other inappropriate messages that
might get into the archive)

there are lots of tools that edit mbox files, e.g. ucbmail.

and has the disadvantage of putting
all of the eggs in one rather vulnerable basket. It would also be
difficult to provide ftp (etc.) access to individual messages from
a flat mbox file.

yup.  the clients would have to download the entire mbox file.

the question is - what's the right balance of client work vs.
archive server work?  do we want to make things easy for archive
providers, easy for client writers, or somewhere in between?
being able to store lots of messages in a single file can save a
lot of disk space, particularly if you compress those mbox files.

there's a separate question of how you reference a particular
message within an mbox file.  maybe something like 
http://host/directory/path/file.mbox#message-id (yeech)
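To make the fragment idea concrete, here is a minimal sketch of what a client
might do with such a reference once it has downloaded the mbox file. The
helper names split_reference and find_in_mbox are hypothetical, and treating
the fragment as a percent-encoded Message-ID is an assumption of this
example, not an established convention.

```python
import mailbox
from urllib.parse import unquote, urlsplit


def split_reference(url):
    """Split 'http://.../file.mbox#<msg-id>' into (mbox URL, Message-ID).

    Assumes the fragment carries a percent-encoded Message-ID.
    """
    parts = urlsplit(url)
    mbox_url = parts._replace(fragment="").geturl()
    return mbox_url, unquote(parts.fragment)


def find_in_mbox(mbox_path, message_id):
    """Linearly scan a local mbox file for a Message-ID; None if absent."""
    wanted = message_id.strip().strip("<>")
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            found = (msg["Message-ID"] or "").strip().strip("<>")
            if found == wanted:
                return msg
        return None
    finally:
        box.close()
```

Note that the lookup is a scan of the whole file, which is the client-side
cost being weighed above against the server-side savings.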

and also how to avoid having several years'  
worth of archives in a single directory or file while still making
the whole archive appear seamless to the client.

Subfolders would be one way to organize messages into groups
(e.g. by year and month), but I'm not convinced that it's
necessary (I've used Cyrus imapd with more than 30,000
messages in a folder with no problem).

in my experience most IMAP clients don't do a good job with folders
that large.  I'm thinking that archives should be able to support
hierarchical directory trees.
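One possible layout, sketched here only as an illustration, files each
message under a year/month folder derived from its Date header. The function
name folder_for and the "archive" root are hypothetical; the year and month
are taken as written in the header, in the sender's own time zone.

```python
from email.utils import parsedate_to_datetime


def folder_for(date_header, root="archive"):
    """Map an RFC 2822 Date header to a 'root/YYYY/MM' folder path.

    parsedate_to_datetime raises an exception on an unparseable date;
    a real archiver would need a fallback (e.g. the delivery time).
    """
    dt = parsedate_to_datetime(date_header)
    return f"{root}/{dt.year:04d}/{dt.month:02d}"
```

For example, folder_for("Tue, 02 Nov 2004 08:52:54 -0500") gives
"archive/2004/11", which keeps any single folder to one month of traffic.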

Using Cyrus imapd would natively provide IMAP, POP, and
NNTP access; since each message is stored as a file, ftp access
would be trivial to provide (simply provide a link between
ftp namespace and the IMAP folder).  Likewise for serving
native message format via http.  Using off-the-shelf products
and minimal effort, that would provide access to messages
via five distinct protocols (six URI schemes (news and nntp
schemes both use NNTP protocol)) from a common message
store.

yup.  and I want people to be able to use tools like that, without
actually being too specific to one particular tool.

Keith

