ietf-822

The dual meaning of c-t-e's

1994-10-21 22:11:50
Michael,

I empathize with your comments about the tangled meanings of the
content-transfer-encodings.  When I was diving into the MIME spec, it
took me a couple of tries before I could put the concepts in place.
Here's the explanation I give people who have strong backgrounds in
specification but no prior exposure to MIME.

The Content-Transfer-Encoding header provides two pieces of
information.  It specifies which of three transformations the body
was subjected to, and it specifies what the domain of the result is. 

The three transformations are identity, quoted-printable and base64.
The domains are binary, 8bit and 7bit.  The binary domain is the
set of all octet strings.  The 8bit domain is the set of octet strings
in which the distance between line separators is not more than
1000 bytes.  The 7bit domain is the subset of the 8bit domain in which
no octet has the high order bit turned on.  (This is the explanation
I've used previously.  It inescapably specifies that nulls are in the
7bit and 8bit domains.  I will modify my explanation if nulls are
outlawed.)
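
The three domains can be sketched as a small classifier.  This is only
an illustration of the definitions above, not anything taken from the
MIME spec: the function name classify_domain is made up, the line
separator is assumed to be CRLF, and the 1000-byte limit is simply the
figure used in this message.

```python
def classify_domain(data: bytes, max_line: int = 1000) -> str:
    """Return the narrowest domain ("7bit", "8bit" or "binary") holding data.

    Assumptions (not from the MIME spec): lines are separated by CRLF,
    and max_line is the 1000-byte figure quoted in the message.
    """
    # Each piece is the run of octets between two line separators.
    runs = data.split(b"\r\n")
    if any(len(run) > max_line for run in runs):
        # Separators are too far apart, so only the binary domain fits.
        return "binary"
    if any(octet > 0x7F for octet in data):
        # Lines are short enough, but some octet has its high bit on.
        return "8bit"
    # Short lines and no high bits.  Nulls are allowed by this definition.
    return "7bit"
```

For example, classify_domain(b"caf\xe9\r\n") falls in the 8bit domain,
while a single null octet falls in the 7bit domain, as the explanation
above implies.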

There are logically nine possible combinations of an encoding and resultant
domain, but it turns out that q-p and base64 always map their inputs
into the 7bit domain.  Therefore some of the combinations cannot
occur.  Here's a table that shows the possible combinations and the
value that's specified in the C-T-E header.


                Transformation

Resulting
Domain          Identity        Q-P     Base64

Binary          Binary          XXX     XXX

8bit            8bit            XXX     XXX

7bit            7bit            Q-P     Base64
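
The table can also be written as a lookup, with a quick check of the
claim that q-p and base64 never emit an octet with the high bit on.
This is a sketch only: CTE_VALUE, cte_value and has_high_bit are names
invented for the example, and the encoders are the ones in Python's
standard library rather than anything the MIME spec prescribes.

```python
import base64
import quopri

# (transformation, resulting domain) -> value of the C-T-E header.
# Pairs missing from the dict are the XXX cells of the table.
CTE_VALUE = {
    ("identity", "binary"): "binary",
    ("identity", "8bit"): "8bit",
    ("identity", "7bit"): "7bit",
    ("quoted-printable", "7bit"): "quoted-printable",
    ("base64", "7bit"): "base64",
}


def cte_value(transformation: str, domain: str):
    """Return the C-T-E value for a table cell, or None for an XXX cell."""
    return CTE_VALUE.get((transformation, domain))


def has_high_bit(data: bytes) -> bool:
    """True if any octet has its high-order bit turned on."""
    return any(octet > 0x7F for octet in data)


# Every possible octet value, so the input itself is not 7bit.
sample = bytes(range(256))
encoded_b64 = base64.encodebytes(sample)
encoded_qp = quopri.encodestring(sample)
```

Here has_high_bit(sample) is true, but it is false for both encoded_b64
and encoded_qp, which is why the Q-P and Base64 columns have entries
only in the 7bit row.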


This table can be read backwards to answer the question: what does a
particular C-T-E value tell you about the original body part?

C-T-E: binary means the body was not transformed and the body part is
an octet string.  (It might actually be simple ASCII text with short
lines, in which case a C-T-E value of 7bit would also have been
correct.  Nothing in the MIME spec ensures the most general
description of the data.)

C-T-E: 8bit means the body was not transformed and it happens that
lines are delimited at least as frequently as required.  (I think this
is 1,000 characters, but I don't know for sure.)

C-T-E: 7bit means the body was not transformed and it happens that
all characters have their high order bit off and lines are delimited
at least as frequently as required.  (I think this is 1,000 characters,
but I don't know for sure.)

C-T-E: quoted-printable means the text has been transformed from its
original format.  Nothing is implied about the domain of the original
text.
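
Reading the table backwards amounts to picking the inverse
transformation from the C-T-E value.  A minimal sketch, assuming
Python's standard quopri and base64 modules as the decoders; the
function name undo_transfer_encoding is hypothetical.

```python
import base64
import quopri


def undo_transfer_encoding(cte: str, body: bytes) -> bytes:
    """Recover the original body from the transmitted one.

    For 7bit, 8bit and binary the transformation was the identity, so
    the body comes back unchanged.  The value only describes its domain.
    """
    value = cte.strip().lower()
    if value in ("7bit", "8bit", "binary"):
        return body
    if value == "quoted-printable":
        return quopri.decodestring(body)
    if value == "base64":
        return base64.b64decode(body)
    raise ValueError("unrecognized C-T-E value: " + cte)
```

For instance, undo_transfer_encoding("quoted-printable", b"caf=E9")
yields b"caf\xe9": the transmitted body is 7bit, while the recovered
one is not.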



Steve
