nmh-workers

Re: [Nmh-workers] Braindump: Extended MH Format

2004-12-10 13:25:13
pmaydell(_at_)chiark(_dot_)greenend(_dot_)org(_dot_)uk wrote:
> While I'm not necessarily a fan of maildir, it does have some nice
> locking semantics (and locking that works over NFS is a Hard Problem).

*nod*

> If you're going to change format, it might be nice to have something
> with a separate overview index.

Perhaps that's a key part of the solution.  This type of meta-data is more
about helping the client be quicker at finding things.  I understand the
benefits of having standards so that all clients are on a level playing
field.  In any case, standardizing a way to create custom indexes that
don't rely upon the file name might be a useful step forward.
Obviously, you don't want to REQUIRE clients to update the file if it
works out better to have their own indexing mechanisms.  Sylpheed
doesn't rely upon .mh_sequences, GNUS' .overview files, or .xmhcache
files.

Note: my bad with regard to the non-existent .mh_context.  I meant to
refer to .mh_sequences. (I fixed this in my .plan file.)

If you re-sort the folder, the .mh_sequences file currently needs to be
updated anyway to reflect the change, right?  Well, rather than using
the index number to indicate members of a sequence, perhaps include the
actual file names and let the command line utilities enumerate the list.
The paradigm shifts from treating the message's file name as the index
itself to referencing its position within a sequence.  We'd have to give
mhpath a switch for specifying which sequence to use:

    $ mhpath -seq foo 1 +inbox
    /home/chewie/Mail/inbox/37062e01-f2ed-44ce-91fa-b2a8a04897b3

~/Mail/inbox/.mh_sequences might look like this:

    all: 37062e01-f2ed-44ce-91fa-b2a8a04897b3 ...
    foo: 37062e01-f2ed-44ce-91fa-b2a8a04897b3 ...
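As a rough sketch of the lookup such a switch would need to do: the helper
below is hypothetical (seq_member is a made-up name, not an nmh command) and
assumes the layout shown above, one "label: id id id ..." line per sequence.
It prints the path of the Nth member of a sequence, which is roughly what
"mhpath -seq foo 1 +inbox" would have to compute.

```shell
# Hypothetical helper, not part of nmh.
# usage: seq_member FOLDER_DIR SEQUENCE N
# Assumes the proposed .mh_sequences layout: "label: id id id ..."
seq_member() {
    folder=$1 seq=$2 n=$3
    awk -v seq="$seq" -v n="$n" -v dir="$folder" '
        $1 == (seq ":") {
            # Field 1 is the label, so member N is field N+1.
            if (n >= 1 && n + 1 <= NF) {
                print dir "/" $(n + 1)
                found = 1
            }
            exit
        }
        # Non-zero exit status when the sequence or the index is missing.
        END { exit (found ? 0 : 1) }
    ' "$folder/.mh_sequences"
}
```

Re-sorting a sequence then only means rewriting the order of the ids on
that one line; the message files themselves never get renamed.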

Sorting itself could be specific to the sequence, which by default would
be "all":

    $ sort -seq foo ...
    
It might make sense at this point to allow separate sequence files by
label:

    $ ls ~/Mail/inbox/sequence.*
    sequence.all    sequence.foo

where the sequence files contain an ordered list of email identifiers
separated by newline, rather than having them all in one line.  Easier
to parse, relatively speaking.
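To illustrate the one-identifier-per-line layout, here is a small sketch
that splits a single-file "label: id id ..." listing into per-label files.
The function name split_sequences is made up for this example, and the
input format is the hypothetical one from earlier in this message, not
what nmh writes today.

```shell
# Hypothetical converter, not part of nmh.
# usage: split_sequences FOLDER_DIR
# Reads FOLDER_DIR/.mh_sequences ("label: id id ...") and writes
# FOLDER_DIR/sequence.LABEL with one identifier per line.
split_sequences() {
    folder=$1
    awk -v dir="$folder" '
        /^[^:]+:/ {
            name = $1
            sub(/:$/, "", name)
            out = dir "/sequence." name
            printf "" > out      # create or truncate, even for an empty sequence
            for (i = 2; i <= NF; i++)
                print $i > out   # same open file, so this appends
            close(out)
        }
    ' "$folder/.mh_sequences"
}
```

With that layout, fetching the Nth member of a sequence is a one-liner
such as "sed -n '1p' sequence.foo".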

You could track the current sequence in the "context" file, removing the
necessity of having to specify "-seq" with each command.  This might be
a nice feature to have in general.
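A minimal sketch of that lookup, assuming the context file keeps its
"Key: value" lines and gains a new entry for this purpose.  The key name
Current-Sequence and the function name are both inventions for this
example; nothing like them exists in nmh today.

```shell
# Hypothetical helper, not part of nmh.
# usage: current_sequence CONTEXT_FILE
# Prints the value of an assumed "Current-Sequence:" entry,
# falling back to "all" when the entry is absent.
current_sequence() {
    awk -F': *' '
        tolower($1) == "current-sequence" { print $2; found = 1; exit }
        END { if (!found) print "all" }
    ' "$1"
}
```

A command would call this only when no "-seq" switch was given on the
command line.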

> (GNUS already has a format like this, which it calls nnml, which is
> nmh with a .overview file containing some headers from each message. I
> haven't actually looked at the implementation, though.)

Might be a good place to start.

> It makes it harder to do things like 'grep foo ~/Mail/inbox/3???', of
> course...

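Building off what I have above, you could use sed to pull a block of 1000
identifiers (entries 3000 through 3999, say) out of
~/Mail/inbox/sequence.all, and run xargs over them.  Since the identifiers
are bare file names, do it from inside the folder:

    $ cd ~/Mail/inbox && sed -ne '3000,3999p' sequence.all | xargs -r grep foo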

> And since you need to lock the .mh_context to change virtually
> anything in the folder you might as well just say "don't change
> anything in the directory unless you have a lock on the .mh_context
> file". Unless you can come up with a scheme like Maildir that lets you
> go the whole hog and do things without holding an explicit lock, I
> don't see the point.

Again, my bad since we're probably referring to .mh_sequences rather than
.mh_context.  Besides, how many IMAP servers do you know of that
currently care about .mh_sequences?  Even procmail doesn't bother,
creating hard links for messages saved to multiple folders at once.

In any case, I don't see a reason why a server or client MUST update
sequence files.  The data is there and parseable as individual email
files.  If we're no longer using the file name for sort order, servers and
clients won't lose track of whatever internal indexes for seen and unseen
messages they might be building (without using .mh_sequences).  If we view the
sequence files as a convenience rather than a required mechanism for
updating state in a directory, the importance of gaining an immediate,
explicit lock on the meta-data files is reduced.  Using a unique
filenaming scheme would alleviate the locking issues for creating new
emails.  Problem reduced. ;-)
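Here is one way lock-free delivery under a unique name could look.  This
is only a sketch: the deliver function and the time.pid.attempt.host
naming scheme are assumptions made for the example (a UUID like the one
shown earlier would do just as well).  It relies on ln refusing to create
a link when the target name already exists, so a name collision fails
instead of clobbering another message.

```shell
# Hypothetical helper, not part of nmh.
# usage: deliver FOLDER_DIR < message
# Stores stdin under a unique name and prints that name.
deliver() {
    folder=$1
    # Write the full message under a temporary dot-name first.
    tmp=$(mktemp "$folder/.tmp.XXXXXX") || return 1
    cat > "$tmp" || { rm -f "$tmp"; return 1; }
    tries=0
    while [ "$tries" -lt 10 ]; do
        # Assumed naming scheme: time.pid.attempt.host
        name="$(date +%s).$$.$tries.$(uname -n)"
        # ln fails if the name is taken, so retry with the next attempt number.
        if ln "$tmp" "$folder/$name" 2>/dev/null; then
            rm -f "$tmp"
            printf '%s\n' "$name"
            return 0
        fi
        tries=$((tries + 1))
    done
    rm -f "$tmp"
    return 1
}
```

No sequence file is touched here; a client that cares can pick up the new
file name and append it to whatever sequence it maintains.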

> As a nitpick:
> > Renaming a file is simply creating a new link and then removing the
> > original link
>
> maildir works precisely because renaming is *not* creating a new link
> and removing the old one -- it has to be an atomic operation...

Right.  My bad.  Noted.
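For the record, the same property is what would let a sequence file be
rewritten safely: write the new contents to a temporary file in the same
directory, then rename it over the old one.  The sketch below assumes the
per-label sequence.NAME files floated earlier in this message, and the
function name is made up.  Within a single filesystem, mv is a rename(),
so readers see either the old file or the new one, never a partial write.

```shell
# Hypothetical helper, not part of nmh.
# usage: replace_sequence_file FOLDER_DIR SEQUENCE < new-contents
replace_sequence_file() {
    folder=$1 seq=$2
    # The temp file must live in the same directory (same filesystem)
    # for the final mv to be a plain rename.
    tmp=$(mktemp "$folder/.sequence.$seq.XXXXXX") || return 1
    if cat > "$tmp"; then
        mv -f "$tmp" "$folder/sequence.$seq"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

This protects readers from half-written files, but two writers can still
overwrite each other's changes, so it reduces the locking problem rather
than removing it.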

An interesting observation...  MUAs are still trying to catch up to
some of the features NMH sports, such as saved queries (a la sequences).
;-)

-- 
Chad Walstrom <chewie(_at_)wookimus(_dot_)net>           
http://www.wookimus.net/
           assert(expired(knowledge)); /* core dump */

_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
http://lists.nongnu.org/mailman/listinfo/nmh-workers