CTE: Compressed-Base64 (Was: Massive Content-Type definition)

1993-06-10 08:41:21


This is a note to try to summarize and reach closure on the specifics
of a compression mechanism for MIME.  This is just a summary of the
past few years, so if you have kept up with the history, you may want
to skip it.  The following design criteria for MIME should be kept in
mind.

 o Mail user agent functionality is not negotiated.  If mail arrives
 with an unknown content type, unknown transfer encoding,  unknown
 compression, or unknown encryption, the message is lost.  There is
 no way to know in advance if a set of transformations is supported.
 Maximum interoperability is attained when the number of options is
 minimized and all agents are capable of interpreting all options.

 o MIME was designed to have two levels, one level for the content-type
 and one level for the encoding.  This was a fundamental decision to
 enable MIME to function in multiple transport environments.  Given a
 goal of facilitating MIME use over 8-bit and possible future binary
 transport, compromise was reached on a two-level model with as few
 encodings as possible.  Note that if all mail transport were 7-bit
 ASCII, the content-types could have the transfer encoding built in,
 making MIME a simpler one-level model.

 A very strong case was made for THREE encodings: one optimized for
 arbitrary binary, one for ASCII-like text, and none (7bit, 8bit and
 binary are all flavors of "none").

 o Compression is another option which, if introduced, must be supported
 by all to preserve interoperability.  The goal as before should be to
 have as few options as possible, understanding that every well
 supported implementation should support all options so that mail is
 not lost.

 o MIME has three built-in expansion mechanisms for new core
 functionality: new content-types, new transfer encodings, and new
 values for previously defined parameters.  Software written to RFC
 1341 should be prepared to deal gracefully with unknown values for
 these mechanisms.  A fourth option for expansion, new
 content- headers, is possible, but only for optional functionality
 which does not reduce or change the basic functionality of the
 content type (Content-disposition, content-description, content-MD5).
 Software is expected to ignore unknown content- headers.
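
The expectation that software deals gracefully with unknown values can
be sketched in Python.  This is only an illustration: the helper name
decode_body and the convention of returning a reason string instead of
raising are hypothetical, and "compressed-base64" stands for any
encoding the agent has not been upgraded to handle.

```python
import base64
import quopri

# Decoders for the transfer encodings defined in RFC 1341.
# 7bit, 8bit and binary are all flavors of "none": the body passes through.
KNOWN_DECODERS = {
    "7bit": lambda body: body,
    "8bit": lambda body: body,
    "binary": lambda body: body,
    "base64": base64.b64decode,
    "quoted-printable": quopri.decodestring,
}


def decode_body(encoding, body):
    """Return (decoded_bytes, None), or (None, reason) for an unknown encoding.

    Hypothetical helper: an unknown encoding is reported rather than raised,
    so the caller can offer to save the raw body instead of displaying it.
    """
    decoder = KNOWN_DECODERS.get(encoding.strip().lower())
    if decoder is None:
        return None, "unsupported transfer encoding: " + encoding
    return decoder(body), None


print(decode_body("base64", b"aGVsbG8="))
print(decode_body("compressed-base64", b"eJzLSM3JyQcABiwCFQ=="))
```

The second call shows the cost described above: the agent can fail
politely, but the content is still unavailable to the recipient.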

The following points were made in the past few days' discussions.

 o There may be more than one compression algorithm available.  In
 fact, a single compression algorithm is not likely to be efficient on
 all content-types.  Highly specialized compression algorithms are
 tightly bound to the content-type and should be part of the
 content-type definition.  Note that audio/basic has built-in
 compression and that video will have complex built-in compression.

 A general purpose compression algorithm is very useful for text and
 for arbitrary application/octet-stream data.  While the
 characteristics of both are highly variable, reasonable compression
 can be achieved on the majority of general email traffic with a single
 algorithm.  The pattern of earlier MIME design discussions and
 decisions would lead one to believe that a loss of efficiency by the
 choice of one algorithm is well worth the gains in interoperability.
 If more than one algorithm is needed, then the number chosen should be
 kept small, they should be well specified, and they should be grouped
 together at the same time for ease of transition.

 o Adding a parameter to transfer encoding or adding a new header (both
 options are semantically equivalent) to change the core behavior of
 an implementation is not compatible with current MIME software.
 The new information will be ignored and compressed data will be
 presented to the user as though it were uncompressed.  This is an
 option which should be used only if no other is available.

 o Compression and Transfer Encoding are semantically identical
 operations.  Both transform the data into a form more suitable for
 transport (compatible with budget and underlying transport).  Adding
 a new compression algorithm produces the same interoperability problem
 as a new content-transfer encoding.  (It is hard or impossible to know
 in advance if the new functionality is supported at the destination.)
 In either case, you cannot access the data unless you have upgraded
 your implementation.

 o New content-transfer encodings are discouraged and require a formal
 standards action to create.  The same should apply to compression
 algorithms.  (Adding a compression algorithm to MIME is a good enough
 thing to warrant a new content-transfer encoding.)
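
The point about a new header being ignored can be seen with a small
Python experiment.  It is a sketch under stated assumptions: the
header name Content-Compression is invented for illustration, zlib is
a stand-in because no algorithm has been chosen, and Python's standard
email parser plays the part of "current MIME software".

```python
import base64
import zlib
from email import message_from_string

original = b"Compression is another option. " * 8

# Sender: compress, then base64-encode, and announce the compression in a
# hypothetical header (Content-Compression) unknown to existing software.
# zlib is an assumed stand-in; no algorithm is named in this note.
encoded = base64.encodebytes(zlib.compress(original)).decode("ascii")
raw_message = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain\n"
    "Content-Transfer-Encoding: base64\n"
    "Content-Compression: zlib\n"
    "\n" + encoded
)

# Receiver: the parser undoes base64 and takes no action on the new header.
message = message_from_string(raw_message)
shown_to_user = message.get_payload(decode=True)

ignored_header = message["Content-Compression"]  # parsed, never acted on
still_compressed = shown_to_user != original
recoverable = zlib.decompress(shown_to_user) == original

print(ignored_header, still_compressed, recoverable)
```

The receiver ends up holding compressed bytes labelled text/plain.
The data is recoverable only by software that already knows about the
extra header, which is the same upgrade requirement a new
content-transfer encoding would impose.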

Summary and Strawman Proposal

 Define a single new content-transfer-encoding, compressed-base64.

 This will provide compression for existing 7-bit transport.  Should
 binary transport be widely available in the future, and the need for
 compression be demonstrated in this environment, a compressed-binary
 encoding may be defined later.
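
As a rough illustration of what a compressed-base64 encoding could
look like, here is a minimal Python sketch.  The strawman does not
name a compression algorithm, so zlib is purely an assumption, and the
function names are hypothetical.

```python
import base64
import zlib


def encode_compressed_base64(data: bytes) -> bytes:
    """Compress, then base64-encode into lines of at most 76 characters."""
    compressed = zlib.compress(data)  # assumed algorithm, not specified here
    return base64.encodebytes(compressed)


def decode_compressed_base64(body: bytes) -> bytes:
    """Reverse the two steps: undo base64 first, then decompress."""
    return zlib.decompress(base64.b64decode(body))


text = b"Maximum interoperability is attained when options are minimized.\n" * 50
body = encode_compressed_base64(text)

print(len(text), len(base64.encodebytes(text)), len(body))
assert decode_compressed_base64(body) == text
```

For repetitive text like this the encoded body is smaller than both
the plain base64 form and the original, while remaining safe for
7-bit transport.  Data that is already compressed would see little or
no gain.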

Greg Vaudreuil