Eric A. Hall wrote:
> Do you have a specific URL to a specific man page that you think would be
> appropriate and authoritative?
> I spent a while looking (see also http://makeashorterlink.com/?X61D16909
> for example), and couldn't find anything that seemed to be authoritative
> enough to reference.
The docs/formats.txt file in the UW IMAP distribution
ftp://ftp.cac.washington.edu/mail/imap.tar.Z
describes several variants, some of which are not addressed
by the Bernstein document. It also discusses a number of
interoperability issues not covered in the captioned draft.
As noted by Mark Crispin in the messages at the URL which you
have provided, different software can have somewhat different
interpretations of separator lines. The only hope of having
any semblance of interoperability in exchanging such mbox
files between systems is to include an adequate specification
of the *specific* separator format used in the *particular*
file being transported (and I have suggested a regular
expression as a possible means of communicating that specification
via a Content-Type field parameter).
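To illustrate why the separator specification matters, here is a minimal
sketch of a receiver splitting the same mbox data under two different
separator expressions. The function name, the two patterns and the sample
data are all hypothetical; no parameter name or regular expression syntax
has been defined for the media type, so this only shows the idea.

```python
import re
from typing import List

def split_mbox(data: bytes, separator_regex: bytes) -> List[bytes]:
    """Split mbox data at every line matching separator_regex."""
    separator = re.compile(separator_regex, re.MULTILINE)
    starts = [m.start() for m in separator.finditer(data)]
    ends = starts[1:] + [len(data)]
    return [data[a:b] for a, b in zip(starts, ends)]

# Hypothetical separator specifications, as a sender might declare them.
STRICT = rb"^From \S+ \w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4}$"
LOOSE = rb"^From "

sample = (
    b"From a@example.org Mon Jan  6 10:00:00 2003\n"
    b"Subject: one\n\n"
    b"From here on, this is body text.\n\n"
    b"From b@example.org Tue Jan  7 11:00:00 2003\n"
    b"Subject: two\n\nbody\n"
)

strict_messages = split_mbox(sample, STRICT)  # splits at separator lines only
loose_messages = split_mbox(sample, LOOSE)    # also splits at the body line
```

With the stricter expression the sample yields two messages; with the
looser one the body line beginning with "From " is taken as a third
separator. A receiver which does not know which expression the sender's
software used cannot tell which result is the intended one.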
And I agree with Kai Henningsen's statement about use of
application/octet-stream; if there is no provision for
sufficient information (e.g. via parameters) to account for
the interoperability issues, then there's no point in having
a separate application subtype for it -- just use octet-stream
and let the communicating parties thrash out the details
out-of-band. Or better yet (as briefly noted in the draft)
just use multipart/digest with local-system translations at
each end.
There are some additional issues that should be considered
(and either resolved or documented as known technical
omissions):
1. Some transport (especially gateway) software may alter
content which is not protected by a content transfer encoding
(and quoted-printable encoding might be insufficient). That
includes adding or removing trailing whitespace, modifying
lines beginning with "From ", etc.
2. It is at best unclear how various software which uses one of
the mbox format variants handles some 8bit and binary content,
and indeed even 7bit content in MIME messages. In particular,
any attempt to convert line endings to/from the on-the-wire
RFC [2]822 CRLF ending is likely to result in problems. For
an example of binary content, see
http://users.erols.com/blilly/mparse/tm35
Note that the issue is not limited to binary content; "line
ending" only has meaning in text media (not application, audio,
image, video, model) and attempts at transforming non-text
content (not merely text line endings, but e.g. content that
happens to have <CR><LF>From<SP> in application, audio, etc.
media) may irreversibly corrupt the content.
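Both issues above can be sketched in a few lines. The gateway below is a
simulation invented for illustration (it strips trailing blanks and
prefixes ">" to lines beginning with "From "); it does not model any
particular product. The line-ending functions are likewise assumptions
about what a naive converter might do. The binary sample is the 8-octet
PNG file signature, which contains both a CRLF and a bare LF.

```python
import base64

def simulated_gateway(data: bytes) -> bytes:
    """Hypothetical gateway: strip trailing blanks, '>'-quote 'From ' lines."""
    out = []
    for line in data.split(b"\n"):
        line = line.rstrip(b" \t")
        if line.startswith(b"From "):
            line = b">" + line
        out.append(line)
    return b"\n".join(out)

def to_local(data: bytes) -> bytes:
    """Naive wire-to-local conversion: CRLF becomes LF."""
    return data.replace(b"\r\n", b"\n")

def to_wire(data: bytes) -> bytes:
    """Naive local-to-wire conversion: every LF becomes CRLF."""
    return data.replace(b"\n", b"\r\n")

# Issue 1: unprotected text is altered; base64-encoded content is not,
# because base64 lines contain no spaces and never begin with "From ".
text = b"From the start, note the trailing blanks.  \nSecond line\n"
unprotected = simulated_gateway(text)
protected = base64.decodebytes(simulated_gateway(base64.encodebytes(text)))

# Issue 2: a line-ending round trip is not an identity on binary content.
png_signature = b"\x89PNG\r\n\x1a\n"
round_trip = to_wire(to_local(png_signature))
```

In this sketch `unprotected` differs from `text` while `protected` is
identical to it, and `round_trip` is one octet longer than the original
signature because the bare LF has been turned into CRLF. Once the content
has been stored in the converted form, nothing records which of the line
endings were original.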
My opinion is that none of the mbox variants can handle the above
issues adequately. Multipart/digest avoids the message separator
problem by using boundary delimiters, which are required not to
appear within the enclosed message content. Rigid formats
such as used by mbox variants are guaranteed to conflict with
some content at some point. Some of the mbox variants appear
to have resulted from attempts to deal with various aspects of
the above and similar issues; the local storage methods that
appear to have the best chance of avoiding all of the problems
are those that use message-per-file storage in RFC [2]822
format (i.e. no attempt to convert line endings or otherwise
fold, spindle or mutilate message content) -- of course those
aren't "mbox" formats.
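
As a rough sketch of the multipart/digest alternative, the following uses
the Python standard library email package to wrap two messages in a
digest and parse them back. The helper name and the sample messages are
invented for illustration; a body line beginning with "From " is included
on purpose.

```python
from email import message_from_string
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart

def to_digest(raw_messages):
    """Wrap each message as a message/rfc822 part of a multipart/digest."""
    digest = MIMEMultipart("digest")
    for raw in raw_messages:
        digest.attach(MIMEMessage(message_from_string(raw)))
    return digest

raw_messages = [
    "From: a@example.org\nSubject: one\n\nFrom here on, plain text.\n",
    "From: b@example.org\nSubject: two\n\nSecond body.\n",
]

# Serializing makes the library choose a boundary string for the digest.
wire_text = to_digest(raw_messages).as_string()

# The receiving end recovers the messages without guessing at separators.
parsed = message_from_string(wire_text)
boundary = parsed.get_boundary()
recovered = [part.get_payload(0) for part in parsed.get_payload()]
```

Here the boundary is chosen when the digest is serialized rather than
fixed by the format, so the "From " body line needs no quoting and
survives the round trip untouched.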