Folks,
This note tries to summarize the discussion and reach closure on the
specifics of a compression mechanism for MIME. The first part recaps
the past few years, so if you have kept up with the history, you may
want to skip it. The following design criteria for MIME should be
kept in mind.
o Mail user agent functionality is not negotiated. If mail arrives
with an unknown content type, unknown transfer encoding, unknown
compression, or unknown encryption, the message is lost. There is
no way to know in advance if a set of transformations is supported.
Maximum interoperability is attained when the number of options is
minimized and all agents are capable of interpreting all options.
o MIME was designed to have two levels, one level for the content-type
and one level for the encoding. This was a fundamental decision to
enable MIME to function in multiple transport environments. Given a
goal of facilitating MIME use over 8-bit and possible future binary
transport, compromise was reached on a two-level model with as few
encodings as possible. Note that if all mail transport were 7-bit
ASCII, the content-types could have the transfer encoding built in,
making MIME a simpler one-level model.
A very strong case was made for THREE encodings: one optimized for
arbitrary binary, one for ASCII-like text, and none (7bit, 8bit and
binary are all flavors of "none").
o Compression is another option which, if introduced, must be
supported by all to preserve interoperability. The goal, as before,
should be to have as few options as possible, understanding that every
well supported implementation should support all options so as not to
lose mail.
o MIME has three built-in expansion mechanisms for new core
functionality: new content-types, new transfer encodings, and new
values for previously defined parameters. Software written to RFC
1341 should be prepared to deal in a graceful way with unknown values
for these mechanisms. A fourth option for expansion, new
content- headers, is possible, but only for optional functionality
which does not reduce or change the basic functionality of the content
type (for example Content-Disposition, Content-Description,
Content-MD5). Software is expected to ignore unknown content- headers.
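As a rough illustration of the behavior described in the criteria
above, here is a minimal sketch in Python of how a receiving agent
might treat unknown values. The function name classify and its two
return labels are hypothetical, chosen only for this example; the set
of known encodings is the one defined in RFC 1341.

```python
from email import message_from_string

# Transfer encodings defined by RFC 1341.
KNOWN_ENCODINGS = {"7bit", "8bit", "binary", "quoted-printable", "base64"}


def classify(raw_message: str) -> str:
    """Return 'display' if the body can be decoded, else 'save-as-opaque'.

    Hypothetical helper: the labels are illustrative, not part of MIME.
    """
    msg = message_from_string(raw_message)
    # RFC 1341 treats a missing Content-Transfer-Encoding as 7bit.
    encoding = (msg.get("Content-Transfer-Encoding") or "7bit").strip().lower()
    if encoding not in KNOWN_ENCODINGS:
        # Unknown encoding: the data cannot be interpreted, so the best
        # the agent can do is hand the raw body to the user as opaque data.
        return "save-as-opaque"
    # Unknown content- headers are never consulted, so they are ignored.
    return "display"
```

Under these assumptions, a message labeled with an encoding the agent
has never heard of is set aside rather than misread, while a message
carrying only an unfamiliar content- header is displayed as usual.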
The following points were made in the past few days' discussions.
o There may be more than one compression algorithm available. In
fact, a single compression algorithm is not likely to be efficient on
all content-types. Highly specialized compression algorithms are
tightly bound to the content-type and should be part of the
content-type definition. Note that audio/basic has built-in
compression and that video will have complex built-in compression.
A general purpose compression algorithm is very useful for text and
for arbitrary application/octet-stream data. While the
characteristics of both are highly variable, reasonable compression
can be achieved on the majority of general email traffic with a single
algorithm. The pattern of earlier MIME design discussions and
decisions would lead one to believe that a loss of efficiency from the
choice of one algorithm is well worth the gains in interoperability.
If more than one algorithm is needed, then the number chosen should be
kept small, they should be well specified, and they should be
introduced together at the same time for ease of transition.
o Adding a parameter to transfer encoding or adding a new header (both
options are semantically equivalent) to change the core behavior of
an implementation is not compatible with current MIME software.
The new information will be ignored and compressed data will be
presented to the user as though it were uncompressed. This is an
option which should be used only if no other is available.
o Compression and Transfer Encoding are semantically identical
operations. Both transform the data into a form more suitable for
transport (compatible with budget and underlying transport). Adding
a new compression algorithm produces the same interoperability problem
as a new content-transfer encoding. (It is hard or impossible to know
in advance if the new functionality is supported at the destination.)
In either case, you cannot access the data unless you have upgraded
your implementation.
o New content-transfer encodings are discouraged and require a formal
standards action to create. The same should apply to compression
algorithms. (Adding a compression algorithm to MIME is a good enough
thing to warrant a new content-transfer encoding.)
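To make the point about a new header concrete, here is a small sketch
in Python of what an agent that predates the header would do. The
header name Content-Compression and the function legacy_decode are
hypothetical, and zlib stands in for whatever algorithm might be
chosen; none of these come from RFC 1341 or from this discussion.

```python
import base64
import zlib


def legacy_decode(headers: dict, body: str) -> bytes:
    """Model of an existing agent: it honors Content-Transfer-Encoding
    and silently ignores any header it does not know."""
    if headers.get("Content-Transfer-Encoding", "7bit").lower() == "base64":
        return base64.b64decode(body)
    return body.encode("ascii")


original = b"Hello, MIME world. " * 4

# Sender compresses first, then applies the existing base64 encoding,
# and announces the compression in a new (hypothetical) header.
headers = {
    "Content-Transfer-Encoding": "base64",
    "Content-Compression": "zlib",  # hypothetical, unknown to old agents
}
body = base64.b64encode(zlib.compress(original)).decode("ascii")

# The old agent strips base64 and stops there, so the user is shown
# the still-compressed bytes as if they were the message text.
shown = legacy_decode(headers, body)
```

Here shown holds compressed bytes, not the original text, and the
agent has no indication that anything went wrong. With a new
content-transfer encoding the same agent would at least know it could
not decode the data.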
Summary and Strawman Proposal
Define a single new content-transfer-encoding, compressed-base64.
This will provide compression for existing 7-bit transport. Should
binary transport be widely available in the future, and the need for
compression be demonstrated in this environment, a compressed-binary
encoding may be defined later.
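For discussion purposes, a minimal sketch in Python of what such an
encoding could look like: compress the data, then apply ordinary
base64. This is only an assumption about the strawman; the proposal
does not yet name a compression algorithm, so zlib is used here as a
stand-in, and both function names are hypothetical.

```python
import base64
import zlib

ENCODING_NAME = "compressed-base64"  # value for Content-Transfer-Encoding


def encode_compressed_base64(data: bytes) -> str:
    """Compress, then base64-encode into lines of at most 76 characters."""
    compressed = zlib.compress(data)  # stand-in algorithm (assumption)
    return base64.encodebytes(compressed).decode("ascii")


def decode_compressed_base64(text: str) -> bytes:
    """Reverse the two steps: strip base64, then decompress."""
    # b64decode skips the line breaks inserted by encodebytes.
    return zlib.decompress(base64.b64decode(text))
```

The output is plain 7-bit text, so it passes through existing
transport unchanged, and an agent that does not recognize the encoding
name knows that it cannot interpret the body.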
Greg Vaudreuil