On Mon, 16 Aug 2004 22:47:52 EDT, Tony Hansen said:
The claim in Appendix A is that there were no authoritative sources of
documentation for the mbox formats and otherwise it's "only documented
in anecdotal form". I'm sorry, but the definitions ARE there, and
ARE almost always authoritative for those systems.
Somehow, I can't get thrilled by the concept of saying a format is documented
because we have (for example) 3 systems, and each has an authoritative
definition of the version it uses, and the definitions are incompatible (and
yes, the Solaris 'Content-Length:' scheme and '>From ' escaping are basically
incompatible - there exist messages that can't be converted from one to the
other without information loss).
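
One way to see the information loss, as a minimal sketch: assume the
escaping rule is the simple one where only body lines starting with
'From ' get a '>' prefix (other variants exist, and the function names
here are made up for illustration). Two different bodies then collapse
into the same escaped text, so nothing downstream can tell them apart:

```python
def escape_from(body: str) -> str:
    """Prefix '>' to body lines starting with 'From ' (assumed simple rule)."""
    return "".join(
        ">" + line if line.startswith("From ") else line
        for line in body.splitlines(keepends=True)
    )


def unescape_from(body: str) -> str:
    """Strip one '>' from lines starting with '>From ' (the naive inverse)."""
    return "".join(
        line[1:] if line.startswith(">From ") else line
        for line in body.splitlines(keepends=True)
    )


# Two distinct bodies: one has a bare 'From ' line, the other a line that
# legitimately begins with '>From ' (for example, quoted text).
body_a = "From here on it gets worse.\n"
body_b = ">From here on it gets worse.\n"

# Both collapse to the same escaped form, so the original is unrecoverable.
print(escape_from(body_a) == escape_from(body_b))
print(unescape_from(escape_from(body_b)) == body_b)
```

A store that relies on Content-Length: can hold body_b byte for byte;
after a round trip through this kind of escaping it comes back altered.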
Because Solaris 8 is System Vr4-derived, you should look at 'man mail'
for the definitive definition. You'll find Content-Length: documented there.
A letter is composed of some header lines followed by a
blank line followed by the message content. The header lines
section of the letter consists of one or more UNIX post-
From sender date_and_time [remote from
followed by one or more standardized message header lines of
keyword-name: [printable text]
where keyword-name is comprised of any printable, non-
whitespace characters other than colon (`:'). A Content-
Length: header line, indicating the number of bytes in the
message content will always be present unless the letter
consists of only header lines with no message content.
For bonus points - is the 'crlf-crlf' between the header and the body included
in the Content-Length:? There are other issues as well - what if the
Content-Length: is computed across a non-canonicalized message - how do
you send it across the wire?
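
To make the ambiguity concrete, here is a small sketch. The function and
its parameters are hypothetical, not taken from any man page; it just
counts the same body under the different readings:

```python
def content_length(body: str, newline: str = "\n",
                   count_separator: bool = False) -> int:
    """Byte count of `body` under one hypothetical set of counting rules."""
    # Rewrite line endings to the chosen convention before counting bytes.
    wire = body.replace("\n", newline).encode("utf-8")
    # Optionally count the blank line that separates headers from body.
    return len(wire) + (len(newline) if count_separator else 0)


body = "Hello\nWorld\n"
for newline in ("\n", "\r\n"):
    for count_separator in (False, True):
        print(repr(newline), count_separator,
              content_length(body, newline, count_separator))
```

Four readings, four different values for the same two-line body. A length
computed on the local LF form is wrong the moment the message is
canonicalized to CRLF, unless somebody recomputes it.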
'man mail' doesn't mention escaping a 'From ' inside a message,
except for this:
The default mode for printing messages is to display only
those header lines of immediate interest. These include, but
are not limited to, the UNIX From and >From postmarks,
From:, Date:, Subject:, and Content-Length: header lines,
and any recipient header lines such as To:, Cc:, Bcc:, and
so forth. After the header lines have been displayed, mail
Of course, that's because Solaris doesn't use '>From ' escaping
because it has Content-Length instead.
Should other systems trust the value of a Content-Length:?
Should other systems be required to include a Content-Length?
Should other systems escape a 'From ' iff there's no Content-Length?
What if an mbox file has a Content-Length on some items but not others?
How do you recover from a corrupted Content-Length?
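
For what it's worth, here is one possible policy a reader could adopt,
sketched under loud assumptions: LF line endings, a blank line written
after each message, and a Content-Length: that is trusted only when it
lands exactly on end-of-file or on the next 'From ' postmark. Otherwise
the reader falls back to scanning for 'From ' lines. All names are
hypothetical, and this is a heuristic, not anybody's specification:

```python
import re

CONTENT_LENGTH = re.compile(rb"^Content-Length:[ \t]*(\d+)[ \t]*$", re.I | re.M)


def next_postmark(data: bytes, frm: int) -> int:
    """Offset of the next line starting with 'From ' at or after frm, else EOF."""
    hit = data.find(b"\nFrom ", max(frm - 1, 0))
    return len(data) if hit == -1 else hit + 1


def split_mbox(data: bytes) -> list[bytes]:
    """Split mbox bytes into messages, each without its postmark line."""
    messages = []
    pos = 0
    while pos < len(data):
        if not data.startswith(b"From ", pos):
            raise ValueError(f"no 'From ' postmark at offset {pos}")
        eol = data.find(b"\n", pos)
        start = len(data) if eol == -1 else eol + 1
        end = next_postmark(data, start)  # used as-is for header-only letters
        header_end = data.find(b"\n\n", start)
        msg_end = None
        if header_end != -1 and header_end < end:
            body_start = header_end + 2
            match = CONTENT_LENGTH.search(data[start:header_end])
            if match:
                claimed_end = body_start + int(match.group(1))
                rest = data[claimed_end:].lstrip(b"\n")
                # Trust the length only if it lands on EOF or a postmark.
                if claimed_end <= len(data) and (not rest or rest.startswith(b"From ")):
                    msg_end = claimed_end
                    end = len(data) - len(rest)
            if msg_end is None:
                # Missing or implausible length: scan for the next postmark.
                end = next_postmark(data, body_start)
        if msg_end is None:
            # Drop the assumed trailing blank separator line, if present.
            msg_end = end - 1 if data[start:end].endswith(b"\n\n") else end
        messages.append(data[start:msg_end])
        pos = end
    return messages


SAMPLE = (
    b"From alice Mon Aug 16 22:47:52 2004\n"
    b"Content-Length: 19\n"
    b"\n"
    b"From the top.\n"      # unescaped 'From ' line, covered by the length
    b"Bye.\n"
    b"\n"
    b"From bob Mon Aug 16 22:48:00 2004\n"
    b"Content-Length: 999\n"  # corrupted: points past end of file
    b"\n"
    b"Corrupted length.\n"
    b"\n"
    b"From carol Mon Aug 16 22:49:00 2004\n"
    b"Subject: no length\n"
    b"\n"
    b">From escaping here.\n"
)

for message in split_mbox(SAMPLE):
    print(message)
```

Note what the sketch can't do: if a corrupted length happens to land on a
'From ' line inside a later message, it gets trusted anyway, and if the
length is missing the unescaped 'From ' line in the first message would
split it in two. Every choice in there is a guess at an answer to the
questions above.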
So - where is the *one true canonical* definition of an mbox that actually
answers all these basic questions that an implementer *needs* to know the